nvidia/NVLM-D-72B
Image-Text-to-Text • Updated • 47.6k • 765
A family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on both vision-language and text-only tasks.