Daily Papers

by AK and the research community

Dec 17

Submitted by

artidoro

Byte Latent Transformer: Patches Scale Better Than Tokens

·
14 authors

Submitted by

Ziqi

Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models

·
5 authors

Submitted by

ZyZcuhk

BrushEdit: All-In-One Image Inpainting and Editing

·
6 authors

Submitted by

dongguanting

RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation

·
7 authors

Submitted by

dongguanting

Smaller Language Models Are Better Instruction Evolvers

·
6 authors

Submitted by

ZyZcuhk

ColorFlow: Retrieval-Augmented Image Sequence Colorization

·
7 authors

Submitted by

Andy1621

Causal Diffusion Transformers for Generative Modeling

·
5 authors

Submitted by

CCCCCC

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

·
10 authors

Submitted by

jlcao2

Wonderland: Navigating 3D Scenes from a Single Image

·
9 authors

Submitted by

Xxlbigbrother

GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs

·
11 authors

Submitted by

deepcs233

VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping

·
10 authors

Submitted by

lizb6626

IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations

·
6 authors

Submitted by

XiaokunSun

StrandHead: Text to Strand-Disentangled 3D Head Avatars Using Hair Geometric Priors

·
5 authors

Submitted by

shihan96

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

·
10 authors

Submitted by

emrys-hong

Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning

·
7 authors

Submitted by

osanseviero

The Open Source Advantage in Large Language Models (LLMs)

·
4 authors

Submitted by

ozbro

SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video

·
6 authors

Submitted by

BrandonLiu

DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes

·
4 authors

Submitted by

JingzeShi

Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture

·
2 authors

Submitted by

jimmyyhwu

TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning

·
9 authors

Submitted by

thuhsy

MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes

·
8 authors

Submitted by

BoZhang

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training

·
15 authors

Submitted by

csferrazza

MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

·
5 authors

Submitted by

prateekv

Whisper-GPT: A Hybrid Representation Audio Large Language Model

·
1 author

Submitted by

dustalov

Reliable, Reproducible, and Really Fast Leaderboards with Evalica

·
1 author

Submitted by

Andron00e

Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning

·
4 authors

Submitted by

nmhkahn

Nearly Zero-Cost Protection Against Mimicry by Personalized Diffusion Models

·
5 authors

Submitted by

jianlanluo

RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning

·
4 authors