Daily Papers

by AK and the research community

Dec 5

Submitted by

osanseviero

PaliGemma 2: A Family of Versatile VLMs for Transfer

·
18 authors

Submitted by

viettmab

SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance

·
7 authors

Submitted by

leo1117

TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

·
10 authors

Submitted by

jingtan

Imagine360: Immersive 360 Video Generation from Perspective Anchor

·
7 authors

Submitted by

SYZhang0805

Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion

·
10 authors

Submitted by

KangsanKim71

VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding

·
5 authors

Submitted by

kimyoungjune

VARCO-VISION: Expanding Frontiers in Korean Vision-Language Models

·
4 authors

Submitted by

xiangjun-xj

One Shot, One Talk: Whole-body Talking Avatar from a Single Image

·
6 authors

Submitted by

ChenDY

NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training

·
4 authors

Submitted by

ZyZcuhk

NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images

·
10 authors

Submitted by

zd11024

Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding

·
3 authors

Submitted by

cogwheelhead

U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs

·
7 authors

Submitted by

huanngzh

MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation

·
10 authors

Submitted by

stefan-baumann

CleanDIFT: Diffusion Features without Noise

·
5 authors

Submitted by

Dahoas

Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models

·
20 authors

Submitted by

BiaoGong

Mimir: Improving Video Diffusion Models for Precise Text Understanding

·
9 authors

Submitted by

wjpoom

Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning

·
10 authors

Submitted by

Wanfq

Weighted-Reward Preference Optimization for Implicit Model Fusion

·
5 authors

Submitted by

xyxingx

LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting

·
5 authors