---
license: gpl-3.0
datasets:
- nkp37/OpenVid-1M
- TempoFunk/webvid-10M
base_model:
- VideoCrafter/VideoCrafter2
pipeline_tag: text-to-video
---
# Advanced Text-to-Video Diffusion Models


⚡️ This repository provides training recipes for AMD's efficient text-to-video models, which are designed for high performance and efficiency. Training consists of two key steps:

* Distillation and Pruning: We distill and prune the popular text-to-video model [VideoCrafter2](https://github.com/AILab-CVC/VideoCrafter), reducing the parameter count to a compact 945M while maintaining competitive performance.

* Optimization with T2V-Turbo: We apply the [T2V-Turbo](https://github.com/Ji4chenLi/t2v-turbo) method to the distilled model to reduce the number of inference steps and further enhance model quality.
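
As a rough illustration of the first step, the sketch below shows a generic output-level distillation objective, in which a smaller student is trained to match the predictions of a frozen teacher. It is not the recipe used in this repository: the function name `distillation_loss`, the mean-squared-error objective, and the toy values are all assumptions made for the example.

```python
from typing import Sequence


def distillation_loss(student_pred: Sequence[float],
                      teacher_pred: Sequence[float]) -> float:
    """Mean squared error between student and teacher predictions.

    Hypothetical helper: a generic distillation objective, not the loss
    used by the authors.
    """
    if len(student_pred) != len(teacher_pred):
        raise ValueError("predictions must have the same length")
    if not student_pred:
        raise ValueError("predictions must be non-empty")
    squared_errors = ((s - t) ** 2 for s, t in zip(student_pred, teacher_pred))
    return sum(squared_errors) / len(student_pred)


# Toy values standing in for flattened noise predictions (assumed data).
teacher = [0.5, -1.0, 2.0, 0.0]
student = [0.4, -0.8, 2.1, 0.3]
loss = distillation_loss(student, teacher)
print(f"distillation loss: {loss:.4f}")
```

In a real training loop the teacher would be kept frozen and only the pruned student would receive gradients from this loss.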

This implementation is released to promote further research and innovation in the field of efficient text-to-video generation, optimized for AMD Instinct accelerators.

You can download the code from our [GitHub Repo](https://github.com/AMD-AIG-AIMA/AMD-Hummingbird-T2V).

<img src="GIFs/vbench.png" alt="VBench performance" title="VBench performance" class="vbench-img">


**8-Step Results**
<style>
  table {
    width: auto;
    border-collapse: collapse;
  }
  th, td {
    border: 1px solid #ddd;
    text-align: center;
    padding: 0px;
    vertical-align: middle;
    width: 256px; /* fixed width for each column */
  }
  tr.text-row {
    height: 30px; /* height of text rows */
  }
  tr.image-row {
    height: 160px; /* height of image rows */
  }
  /* default size for images in the table */
  img {
    width: 256px;
    height: 160px;
    object-fit: cover;
  }
  /* applies only to vbench.png */
  .vbench-img {
    width: 785px !important;
    height: 698px !important;
    object-fit: contain; /* show the whole image without cropping */
  }
</style>


<table>
  <tr class="text-row">
    <th>A cute happy Corgi playing in park, sunset, pixel.</th>
    <th>A cute happy Corgi playing in park, sunset, animated style.</th>
    <th>A cute raccoon playing guitar in the beach.</th>
    <th>A cute raccoon playing guitar in the forest.</th>
  </tr>
  <tr class="image-row">
    <td><img src="GIFs/A_cute_happy_Corgi_playing_in_park,_sunset,_pixel_.gif"></td>
    <td><img src="GIFs/A cute happy Corgi playing in park, sunset, animated style.gif"></td>
    <td><img src="GIFs/A cute raccoon playing guitar in the beach.gif"></td>
    <td><img src="GIFs/A cute raccoon playing guitar in the forest.gif"></td>
  </tr>
  <tr class="text-row">
    <th>A quiet beach at dawn and the waves gently lapping.</th>
    <th>A cute teddy bear, dressed in a red silk outfit, stands in a vibrant street, Chinese New Year.</th>
    <th>A sandcastle being eroded by the incoming tide.</th>
    <th>An astronaut flying in space, in cyberpunk style.</th>
  </tr>
  <tr class="image-row">
    <td><img src="GIFs/A_quiet_beach_at_dawn_and_the_waves_gently_lapping.gif"></td>
    <td><img src="GIFs/A cute teddy bear, dressed in a red silk outfit, stands in a vibrant street, chinese new year..gif"></td>
    <td><img src="GIFs/A sandcastle being eroded by the incoming tide.gif"></td>
    <td><img src="GIFs/An astronaut flying in space, in cyberpunk style.gif"></td>
  </tr>
  <tr class="text-row">
    <th>A cat DJ at a party.</th>
    <th>A 3D model of a 1800s victorian house.</th>
    <th>A drone flying over a snowy forest.</th>
    <th>A ghost ship navigating through a sea under a moon.</th>
  </tr>
  <tr class="image-row">
    <td><img src="GIFs/A_cat_DJ_at_a_party.gif"></td>
    <td><img src="GIFs/A 3D model of a 1800s victorian house..gif"></td>
    <td><img src="GIFs/a_drone_flying_over_a_snowy_forest.gif"></td>
    <td><img src="GIFs/A_ghost_ship_navigating_through_a_sea_under_a_moon.gif"></td>
  </tr>
</table>
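
The samples above were generated with 8 inference steps. As a minimal sketch of what a reduced step count means for the sampler, the snippet below picks evenly spaced timesteps from a longer training schedule. The 1000-step training schedule, the even spacing, and the helper name `few_step_timesteps` are assumptions for illustration; the actual schedule used by T2V-Turbo or this repository may differ.

```python
from typing import List


def few_step_timesteps(num_train_steps: int = 1000,
                       num_inference_steps: int = 8) -> List[int]:
    """Return evenly spaced timesteps in descending order.

    Hypothetical helper: both defaults are assumptions, not values taken
    from the repository.
    """
    if not 0 < num_inference_steps <= num_train_steps:
        raise ValueError("need 0 < num_inference_steps <= num_train_steps")
    stride = num_train_steps // num_inference_steps
    # Start from the noisiest timestep and walk down by one stride per step.
    return [(i + 1) * stride - 1 for i in reversed(range(num_inference_steps))]


timesteps = few_step_timesteps()
print(timesteps)
```

Each entry is one denoising call, so the cost of sampling scales with the length of this list rather than with the full training schedule.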






# License
Copyright (c) 2024 Advanced Micro Devices, Inc. All Rights Reserved.