Hugging Face
shail-2512's Collections
VLMs (updated Dec 2, 2024)
HuggingFaceTB/SmolVLM-Instruct • Image-Text-to-Text • Updated Dec 2, 2024 • 121k • 376
microsoft/OmniParser • Image-Text-to-Text • Updated Dec 2, 2024 • 2.13k • 1.55k
vidore/colsmolvlm-v0.1 • Visual Document Retrieval • Updated 2 days ago • 43.9k • 46
meta-llama/Llama-3.2-11B-Vision-Instruct • Image-Text-to-Text • Updated Dec 4, 2024 • 1.7M • 1.31k
Qwen/Qwen2-VL-7B-Instruct • Image-Text-to-Text • Updated 8 days ago • 1.65M • 1.12k
mistral-community/pixtral-12b • Image-Text-to-Text • Updated 18 days ago • 37.5k • 86
HuggingFaceM4/Idefics3-8B-Llama3 • Image-Text-to-Text • Updated Dec 2, 2024 • 51.5k • 265
allenai/Molmo-7B-O-0924 • Image-Text-to-Text • Updated Nov 15, 2024 • 6.16k • 153