VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models Paper • 2502.02492 • Published 2 days ago • 37
Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers Paper • 2103.15679 • Published Mar 29, 2021
Transformer Interpretability Beyond Attention Visualization Paper • 2012.09838 • Published Dec 17, 2020
Still-Moving: Customized Video Generation without Customized Video Data Paper • 2407.08674 • Published Jul 11, 2024 • 12
Lumiere: A Space-Time Diffusion Model for Video Generation Paper • 2401.12945 • Published Jan 23, 2024 • 85
Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models Paper • 2301.13826 • Published Jan 31, 2023 • 1
Running 543 Talking Face Generation with Multilingual TTS 👄 Generate a talking face video from text