-
TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
Paper • 2401.09416 • Published • 11 -
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
Paper • 2401.10171 • Published • 14 -
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
Paper • 2311.09217 • Published • 22 -
GALA: Generating Animatable Layered Assets from a Single Scan
Paper • 2401.12979 • Published • 9
Collections
Discover the best community collections!
Collections including paper arxiv:2411.02394
-
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
Paper • 2410.10306 • Published • 54 -
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
Paper • 2411.05003 • Published • 70 -
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation
Paper • 2411.04709 • Published • 25 -
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Paper • 2410.07171 • Published • 42
-
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models
Paper • 2405.16537 • Published • 16 -
ReVideo: Remake a Video with Motion and Content Control
Paper • 2405.13865 • Published • 23 -
FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models
Paper • 2406.16863 • Published • 11 -
Portrait Video Editing Empowered by Multimodal Generative Priors
Paper • 2409.13591 • Published • 16
-
GFlow: Recovering 4D World from Monocular Video
Paper • 2405.18426 • Published • 15 -
3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
Paper • 2405.18424 • Published • 7 -
HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting
Paper • 2405.15125 • Published • 6 -
PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting
Paper • 2405.19957 • Published • 10
-
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Paper • 2405.08748 • Published • 22 -
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
Paper • 2405.10300 • Published • 28 -
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 130 -
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Paper • 2405.11143 • Published • 36
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 26 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 13 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 41 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 22