Open and Advanced Large-Scale Video Generative Models.
Open and Advanced Large-Scale Video Generative Models.
HunyuanVideo is an advanced image-to-video generation model that can create high-quality videos from text descriptions. It features a comprehensive framework that integrates image-video joint model training and efficient infrastructure for large-scale model training and inference.
This model is trained on a spatial-temporally compressed latent space and uses a large language model for text encoding. According to professional human evaluation results, HunyuanVideo outperforms previous state-of-the-art models in terms of text alignment, motion quality, and visual quality.