Issue with FlashAttention while running the Janus-Pro-1B model locally on Mac (Solved)
I was facing this issue on my Mac (M2, Metal) while running the sample code from DeepSeek's generation_inference.py. I kept getting the error:
ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package flash_attn seems to be not installed. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2.
I tried many things, including passing arguments and setting environment variables, but to no avail. Finally, the issue was resolved by removing the following parameter
"_attn_implementation": "flash_attention_2"
from config.json and it worked.
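If you'd rather not edit config.json by hand, a load-time override may also do the trick. Below is a minimal sketch, not from my original fix: recent transformers versions accept an `attn_implementation` argument in `from_pretrained`, and `"eager"` should bypass the flash_attn import path. The model id is an assumption, and whether the override reaches the custom Janus code depends on how the remote model implementation handles it.

```python
from transformers import AutoModelForCausalLM

# Assumed model id; adjust to your local path if you cloned the repo.
model_path = "deepseek-ai/Janus-Pro-1B"

# "eager" selects the default attention implementation instead of
# flash_attention_2, so flash_attn is never imported.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    attn_implementation="eager",
)
```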
Just posting this here in case someone else faces the same issue; hopefully it saves some troubleshooting time.