Submitted by MingZhong 54 Law of the Weakest Link: Cross Capabilities of Large Language Models · 17 authors 2
Submitted by akhaliq 30 TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices · 4 authors 6
Submitted by guokan-shang 25 Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect · 12 authors 2
Submitted by ZechenBai 19 One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos · 9 authors 3
Submitted by akhaliq 19 Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation · 5 authors 5
Submitted by lilelife 11 SyntheOcc: Synthesize Geometric-Controlled Street View Images through 3D Semantic MPIs · 7 authors 2
Submitted by akhaliq 11 ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer · 8 authors 2
Submitted by hcwei 10 Visual Context Window Extension: A New Perspective for Long Video Understanding · 2 authors 2
Submitted by ohayonguy 10 Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration · 3 authors 3
Submitted by gengshan-y 8 DressRecon: Freeform 4D Human Reconstruction from Monocular Video · 5 authors 2
Submitted by akhaliq 7 Helpful DoggyBot: Open-World Object Fetching using Legged Robots and Vision-Language Models · 5 authors 2
Submitted by BSavoldi 5 What the Harm? Quantifying the Tangible Impact of Gender Bias in Machine Translation with a Human-centered Study · 5 authors 2
Submitted by Quanting 3 Embodied-RAG: General non-parametric Embodied Memory for Retrieval and Generation · 7 authors 2