---
license: gpl-3.0
datasets:
- nkp37/OpenVid-1M
- TempoFunk/webvid-10M
base_model:
- VideoCrafter/VideoCrafter2
pipeline_tag: text-to-video
---

# Advanced text-to-video Diffusion Models

⚡️ This repository provides training recipes for the AMD efficient text-to-video models, which are designed for high performance and efficiency. The training process includes two key steps:

* Distillation and Pruning: We distill and prune the popular text-to-video model [VideoCrafter2](https://github.com/AILab-CVC/VideoCrafter), reducing the parameters to a compact 945M while maintaining competitive performance.
* Optimization with T2V-Turbo: We apply the [T2V-Turbo](https://github.com/Ji4chenLi/t2v-turbo) method on the distilled model to reduce inference steps and further enhance model quality.

This implementation is released to promote further research and innovation in the field of efficient text-to-video generation, optimized for AMD Instinct accelerators.

**8-Steps Results**

A cute happy Corgi playing in park, sunset, pixel. | A cute happy Corgi playing in park, sunset, animated style. | A cute raccoon playing guitar in the beach. | A cute raccoon playing guitar in the forest. |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
A quiet beach at dawn and the waves gently lapping. | A cute teddy bear, dressed in a red silk outfit, stands in a vibrant street, Chinese New Year. | A sandcastle being eroded by the incoming tide. | An astronaut flying in space, in cyberpunk style. |
![]() | ![]() | ![]() | ![]() |
A cat DJ at a party. | A 3D model of a 1800s victorian house. | A drone flying over a snowy forest. | A ghost ship navigating through a sea under a moon. |
![]() | ![]() | ![]() | ![]() |
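
The results above were sampled in 8 steps. As a rough illustration of what a reduced step count means for a diffusion sampler, the sketch below picks evenly spaced timesteps from a longer training schedule. It is not the T2V-Turbo sampler: the helper name `few_step_timesteps`, the 1000-step training schedule, and the even spacing are all assumptions made for illustration only.

```python
from typing import List


def few_step_timesteps(num_inference_steps: int,
                       num_train_timesteps: int = 1000) -> List[int]:
    """Return evenly spaced timesteps, ordered from most to least noisy.

    Hypothetical helper: the default of 1000 training timesteps is an
    assumption, not a value taken from this repository.
    """
    if not 1 <= num_inference_steps <= num_train_timesteps:
        raise ValueError("num_inference_steps must be in [1, num_train_timesteps]")
    stride = num_train_timesteps // num_inference_steps
    # Start at the largest timestep, since sampling begins from near-pure noise.
    return [num_train_timesteps - 1 - i * stride
            for i in range(num_inference_steps)]


# An 8-step schedule visits only 8 of the 1000 training timesteps.
print(few_step_timesteps(8))
# → [999, 874, 749, 624, 499, 374, 249, 124]
```

Each entry corresponds to one denoising call of the model, so going from a conventional 50-step schedule to 8 steps cuts the number of forward passes per video by roughly a factor of six under these assumptions.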