Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • 14 days ago • 59
Article Yay! Organizations can now publish blog Articles By huggingface and 3 others • 17 days ago • 30
Jan 17 Releases Collection Models and datasets of the second week of Jan 2025. • 23 items • Updated 20 days ago • 10
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference 22 days ago • 63
Article Announcing NVIDIA Cosmos World Foundation Models By mingyuliutw and 1 other • about 1 month ago • 23
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 226
Article Memory-efficient Diffusion Transformers with Quanto and Diffusers Jul 30, 2024 • 63
Writing in the Margins: Better Inference Pattern for Long Context Retrieval Paper • 2408.14906 • Published Aug 27, 2024 • 140
NIM Serverless Inference API Collection Models in this collection are available for inference via a serverless API powered by NVIDIA NIM. • 8 items • Updated 20 days ago • 23
Article 🔥 Argilla 2.0: the data-centric tool for AI makers 🤗 By dvilasuero • Jul 30, 2024 • 37