Submitted by jph00 · 126 · Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference · 14 authors · 10
Submitted by yuexiang96 · 50 · TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks · 21 authors · 2
Submitted by CodexXiang · 41 · No More Adam: Learning Rate Scaling at Initialization is All You Need · 4 authors · 2
Submitted by ShushengYang · 24 · Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces · 6 authors · 2
Submitted by pengxiang · 19 · Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN · 3 authors · 2
Submitted by guozonghao96 · 18 · LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer · 12 authors · 2
Submitted by hpouransari · 13 · FastVLM: Efficient Vision Encoding for Vision Language Models · 11 authors · 2
Submitted by bykang · 12 · Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation · 10 authors · 4
Submitted by g-astruc · 11 · AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities · 4 authors · 2
Submitted by mbreuss · 11 · Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning · 4 authors · 2
Submitted by OliverZhao · 10 · Learning from Massive Human Videos for Universal Humanoid Pose Control · 10 authors · 2
Submitted by jinzhuoran · 9 · RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment · 7 authors · 2
Submitted by lhhuang · 8 · ChatDiT: A Training-Free Baseline for Task-Agnostic Free-Form Chatting with Diffusion Transformers · 10 authors · 2
Submitted by yeungchenwa0106 · 4 · Predicting the Original Appearance of Damaged Historical Documents · 6 authors · 2
Submitted by bobxwu · 4 · AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge · 10 authors · 2