Issue with Flash Attention while running Janus-Pro-1B model locally on Mac (Solved)

#8
by saurabhksa1 - opened

I was facing this issue on my Mac (Metal, M2) while running the sample code given by DeepSeek in generation_inference.py. I was constantly getting the error:

ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package flash_attn seems to be not installed. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2.

I tried many approaches, including passing variables and setting environment variables, but to no avail (flash_attn only ships CUDA kernels, so it cannot be installed or used on Apple Silicon). The issue was finally resolved by removing the following parameter

"_attn_implementation": "flash_attention_2"

from config.json, after which the script ran without errors.
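For anyone who would rather not edit the model files, a load-time override may also work. This is a minimal, untested sketch assuming you load the model the way generation_inference.py does (via AutoModelForCausalLM with trust_remote_code=True); the attn_implementation argument to from_pretrained should take precedence over the "_attn_implementation" value in config.json, and "eager" (or "sdpa") avoids the flash_attn dependency entirely:

```python
# Sketch: override the attention implementation at load time instead of
# editing config.json. flash_attn only provides CUDA kernels, so on
# Apple Silicon we fall back to the built-in "eager" attention.
import torch
from transformers import AutoModelForCausalLM

model_path = "deepseek-ai/Janus-Pro-1B"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,        # Janus ships custom model code
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",   # overrides "_attn_implementation" from config.json
)
model = model.to("mps").eval()     # Metal backend on M-series Macs
```

"sdpa" (PyTorch's scaled_dot_product_attention) may also work here and is usually faster than "eager", but I haven't verified it with this model.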

Just posting here in case someone else faces this issue; it may save you some time troubleshooting.
