Beijing Academy of Artificial Intelligence org

Hi, @GabrielSalem

Your proposed change to the number of latents is not correct.
The released model was trained with a maximum of 9 latents, and those latents are decoded into 33 frames.
We provide the decoding details (Code):
Tile#1: latents[0:5] -> 17 frames
Tile#2: latents[4:9] -> 16 frames
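
For readers who want to check the arithmetic, here is a rough sketch of how the two tiles add up to 33 frames. It is not the released decoding code. The function name `tile_plan`, the one-latent overlap rule, and the 4x temporal factor are assumptions inferred only from the numbers above (the first latent of the first tile is assumed to give 1 frame, and every later latent 4 frames).

```python
# Hypothetical helper, not part of the released code.
# Assumption: each latent after the first expands to 4 frames,
# and the very first latent of the video expands to 1 frame.
TEMPORAL_FACTOR = 4
MAX_LATENTS = 9
TILE_SIZE = 5


def tile_plan(num_latents=MAX_LATENTS, tile_size=TILE_SIZE, factor=TEMPORAL_FACTOR):
    """Return a list of (start, stop, frames) for tiles that overlap by one latent."""
    plan = []
    start = 0
    while True:
        stop = min(start + tile_size, num_latents)
        # Latents in this tile that come after its first latent.
        new_latents = stop - start - 1
        # Only the first tile contributes the single leading frame.
        frames = new_latents * factor + (1 if start == 0 else 0)
        plan.append((start, stop, frames))
        if stop >= num_latents:
            break
        # The next tile reuses the last latent of this tile as context.
        start = stop - 1
    return plan


if __name__ == "__main__":
    for i, (start, stop, frames) in enumerate(tile_plan(), 1):
        print(f"Tile#{i}: latents[{start}:{stop}] -> {frames} frames")
    print("total frames:", sum(f for _, _, f in tile_plan()))
```

Under these assumptions, the default call reproduces the two tiles listed above and a total of 33 frames.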

GabrielSalem changed pull request status to closed
