Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain • 8 days ago • 23
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 1 day ago • 149
Qwen2.5 Collection The Qwen 2.5 models are a series of AI models trained on 18 trillion tokens, supporting 29 languages and offering advanced features such as instructio • 33 items • Updated Oct 12, 2024 • 7
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 10 days ago • 322
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published Nov 15, 2024 • 73
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • 14 days ago • 59
InternVL2.5-MPO Collection Enhancing the Reasoning Ability of MLLMs via Mixed Preference Optimization • 16 items • Updated 8 days ago • 26
Article Yay! Organizations can now publish blog Articles By huggingface and 3 others • 17 days ago • 30
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference 22 days ago • 63
Article MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era By MiniMax-AI • 22 days ago • 40
Article Announcing NVIDIA Cosmos World Foundation Models By mingyuliutw and 1 other • about 1 month ago • 23