HuggingFaceTB/SmolVLM-Instruct
Image-Text-to-Text • Updated • 92.8k • 368
Generate speech from text using selected language and speaker
Generate text responses using images and text prompts
A community project to create an image preferences dataset.
Generate clickable coordinates on a screenshot