Article 4: Is Attention Interpretable in Transformer-Based Large Language Models? Let’s Unpack the Hype