Stas Bekman

stas

AI & ML interests

Toolmaker. Software creator, optimizer and harmonizer. Makes things work and fly at Contextual.AI. Training LLM/RAG/Generative AI/Machine Learning/Scalability.

Organizations

BigScience Workshop, Social Post Explorers

Posts 7

If you remember my work on MAMF, which finds the realistic achievable TFLOPS ceiling, the Intel AI team has shared their measurements, and they scored an incredible 99.4% TFLOPS efficiency for Gaudi 2!

That's quite amazing! Your ROI on these accelerators will be very high.

The full table is here: https://github.com/stas00/ml-engineering/tree/master/compute/accelerator#maximum-achievable-matmul-flops-comparison-table

Since the competitors' achievable efficiency has been getting worse with each new generation, I'm looking forward to seeing whether Gaudi 3 will keep the bar high!

Thanks to Avi Rubin, Lakshman Chari, Imtiaz Sajwani, Ramy J and Zhiqi Tao for helping to get these numbers to the community.

Articles 6

From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate