Submitted by Weiyun1025 · 73 · Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization · 11 authors · 4
Submitted by akhaliq · 58 · Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions · 9 authors · 4
Submitted by pmolchanov · 41 · Hymba: A Hybrid-head Architecture for Small Language Models · 13 authors · 3
Submitted by akariasai · 30 · OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs · 25 authors · 2
Submitted by THUdyh · 23 · Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models · 7 authors · 2
Submitted by EvanTHU · 13 · DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding · 20 authors · 2
Submitted by flymin · 11 · MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control · 6 authors · 2
Submitted by gsarti · 11 · Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models · 4 authors · 3
Submitted by lyxun · 9 · Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation · 13 authors · 2
Submitted by davanstrien · 7 · UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages · 3 authors · 2