Submitted by akhaliq 30 Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory · 4 authors
Submitted by akhaliq 24 Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning · 8 authors
Submitted by akhaliq 22 Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding · 45 authors
Submitted by akhaliq 17 Understanding the performance gap between online and offline alignment algorithms · 11 authors
Submitted by akhaliq 15 Compositional Text-to-Image Generation with Dense Blob Representations · 6 authors
Submitted by akhaliq 14 No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding · 5 authors
Submitted by akhaliq 11 SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models · 14 authors