Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation Paper • 2408.15239 • Published Aug 27, 2024 • 29
REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models Paper • 2408.02231 • Published Aug 5, 2024 • 2
Still-Moving: Customized Video Generation without Customized Video Data Paper • 2407.08674 • Published Jul 11, 2024 • 12
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference Paper • 2406.06424 • Published Jun 10, 2024 • 13
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models Paper • 2404.12387 • Published Apr 18, 2024 • 39
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation Paper • 2404.08540 • Published Apr 12, 2024 • 11
LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models Paper • 2404.03118 • Published Apr 3, 2024 • 24
Getting it Right: Improving Spatial Consistency in Text-to-Image Models Paper • 2404.01197 • Published Apr 1, 2024 • 31
The SPRIGHT T2I collection Collection This collection contains the datasets, model, paper, and demo associated with the SPRIGHT (SPatially RIGHT) release. • 5 items • Updated Apr 2, 2024 • 6