Generate depth maps from images and videos
Generate a sound effect that matches a video shot
Generate detailed prompts for Stable Diffusion