arxiv:2501.17811

Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling

Published on Jan 29

Abstract

In this work, we introduce Janus-Pro, an advanced version of the previous work Janus. Specifically, Janus-Pro incorporates (1) an optimized training strategy, (2) expanded training data, and (3) scaling to a larger model size. With these improvements, Janus-Pro achieves significant advancements in both multimodal understanding and text-to-image instruction-following capabilities, while also enhancing the stability of text-to-image generation. We hope this work will inspire further exploration in the field. Code and models are publicly available.
