VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models Paper • 2502.02492 • Published 2 days ago • 37
Concept Steerers: Leveraging K-Sparse Autoencoders for Controllable Generations Paper • 2501.19066 • Published 6 days ago • 9
MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation Paper • 2502.01572 • Published 3 days ago • 18
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published 4 days ago • 152
MatAnyone: Stable Video Matting with Consistent Memory Propagation Paper • 2501.14677 • Published 13 days ago • 26
TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space Paper • 2501.12224 • Published 16 days ago • 46
Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise Paper • 2501.08331 • Published 23 days ago • 20
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published 15 days ago • 79
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 15 days ago • 301
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos Paper • 2501.09781 • Published 21 days ago • 24
Do generative video models learn physical principles from watching videos? Paper • 2501.09038 • Published 23 days ago • 31
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 21 days ago • 67
SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces Paper • 2501.09756 • Published 21 days ago • 19
CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation Paper • 2501.09433 • Published 21 days ago • 18
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 23 days ago • 53
Trusted Machine Learning Models Unlock Private Inference for Problems Currently Infeasible with Cryptography Paper • 2501.08970 • Published 22 days ago • 6
RepVideo: Rethinking Cross-Layer Representation for Video Generation Paper • 2501.08994 • Published 22 days ago • 15
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities Paper • 2501.08983 • Published 22 days ago • 20