LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer Paper • 2502.01105 • Published 4 days ago • 6
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 2 days ago • 80
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published 4 days ago • 152
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Paper • 2501.12895 • Published 15 days ago • 55
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 15 days ago • 301
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 21 days ago • 67
Diffusion Adversarial Post-Training for One-Step Video Generation Paper • 2501.08316 • Published 23 days ago • 32
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 23 days ago • 272
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training Paper • 2412.09619 • Published Dec 12, 2024 • 23
VideoRAG: Retrieval-Augmented Generation over Video Corpus Paper • 2501.05874 • Published 27 days ago • 66
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Paper • 2501.00599 • Published Dec 31, 2024 • 41
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 139
FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait Paper • 2412.01064 • Published Dec 2, 2024 • 26
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints Paper • 2412.07760 • Published Dec 10, 2024 • 50
Open-Sora Plan: Open-Source Large Video Generation Model Paper • 2412.00131 • Published Nov 28, 2024 • 33