Submitted by phython96 49 ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting · 7 authors 6
Submitted by akhaliq 23 FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality · 7 authors 2
Submitted by yuexiang96 23 Teach Multimodal LLMs to Comprehend Electrocardiographic Images · 4 authors 2
Submitted by ldwang 19 Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data · 19 authors 2
Submitted by Sreyan88 19 MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark · 9 authors 2
Submitted by CCCCRS 15 Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design · 7 authors 2
Submitted by omer6nahum 15 Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance · 5 authors 2
Submitted by Wyattz23 11 Counting Ability of Large Language Models and Impact of Tokenization · 3 authors 2
Submitted by ljvmiranda921 11 Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback · 9 authors 2
Submitted by yujianll 10 Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning · 4 authors 2
Submitted by yuzhaouoe 7 Analysing the Residual Stream of Language Models Under Knowledge Conflicts · 9 authors 2
Submitted by Mingtongz 6 Dynamic 3D Gaussian Tracking for Graph-Based Neural Dynamics Modeling · 3 authors 2
Submitted by sergioburdisso 5 Mapping the Media Landscape: Predicting Factual Reporting and Political Bias Through Web Interactions · 4 authors 2
Submitted by Ksgk-fy 4 Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration · 4 authors 2