CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers Paper • 2502.06527 • Published 4 days ago • 8
Article The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about... By srinivasbilla • 25 days ago • 60
MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation Paper • 2502.04299 • Published 8 days ago • 14
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 18 days ago • 340
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 205
DextrAH-G: Pixels-to-Action Dexterous Arm-Hand Grasping with Geometric Fabrics Paper • 2407.02274 • Published Jul 2, 2024 • 1
Article FineWeb2-C: Help Build Better Language Models in Your Language By davanstrien and 5 others • Dec 23, 2024 • 18
Eagle 2 Collection Eagle 2 is a family of frontier vision-language models with vision-centric design. The model supports 4K HD input, long-context video, and grounding. • 9 items • Updated 22 days ago • 31
UI-TARS: Pioneering Automated GUI Interaction with Native Agents Paper • 2501.12326 • Published 24 days ago • 50
CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation Paper • 2501.09433 • Published 29 days ago • 18
SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces Paper • 2501.09756 • Published 29 days ago • 19
RepVideo: Rethinking Cross-Layer Representation for Video Generation Paper • 2501.08994 • Published 30 days ago • 15
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling Paper • 2501.00574 • Published Dec 31, 2024 • 6
TACO Models Collection This collection contains the best-performing TACO models based on LLaMA-3/Qwen2 and SigLIP/CLIP. • 3 items • Updated Dec 20, 2024 • 8
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published Jan 8 • 84
TransPixar: Advancing Text-to-Video Generation with Transparency Paper • 2501.03006 • Published Jan 6 • 23