
Lightweight, economical multimodal video generation from reference images, videos, and audio, with native audio.

Lightweight, economical video generation from a first-frame image (and an optional last-frame image), with native audio.

Lightweight, economical video generation from text prompts with native audio.