Generate depth maps from images and videos
Generate a sound effect that matches a video shot
Generate detailed prompts for Stable Diffusion
Generate depth maps from images