TTS x Hallo Talking Portrait 👋 (running on A10G): Generate talking avatars from text-to-speech
EVLM: An Efficient Vision-Language Model for Visual Understanding. Paper • 2407.14177 • Published Jul 19, 2024