VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models Paper • 2502.02492 • Published Feb 2025 • 49
Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning Paper • 2309.02591 • Published Sep 5, 2023 • 15
Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation Paper • 2305.01569 • Published May 2, 2023 • 2
Make-A-Video: Text-to-Video Generation without Text-Video Data Paper • 2209.14792 • Published Sep 29, 2022