Submitted by akhaliq · Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning · 27 authors
Submitted by akhaliq · Matcha-TTS: A fast TTS architecture with conditional flow matching · 5 authors
Submitted by akhaliq · Physically Grounded Vision-Language Models for Robotic Manipulation · 8 authors
Submitted by akhaliq · Bayes' Rays: Uncertainty Quantification for Neural Radiance Fields · 5 authors