ACECODER: Acing Coder RL via Automated Test-Case Synthesis Paper • 2502.01718 • Published 11 days ago • 23
SliderSpace: Decomposing the Visual Capabilities of Diffusion Models Paper • 2502.01639 • Published 11 days ago • 24
DeepFlow: Serverless Large Language Model Serving at Scale Paper • 2501.14417 • Published 21 days ago • 2
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation Paper • 2501.16764 • Published 17 days ago • 21
DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning Paper • 2411.04983 • Published Nov 7, 2024 • 11
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs Paper • 2501.18585 • Published 15 days ago • 52
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation Paper • 2412.07589 • Published Dec 10, 2024 • 45
VILA-U-7B Collection VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation • 2 items • Updated Jan 13 • 5
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action Paper • 2312.17172 • Published Dec 28, 2023 • 28