Engage in multimodal conversations that include images and videos
Answer questions about images
Generate images from text