this is awesome
just spent the last 15 mins watching
Thank you! I must admit I had way too much fun working on this
What I like about it is that it combines multiple art forms; it will be a great showcase for future models.
I second the praise, and I'd love to read more about the setup. It seems like a non-trivial task to generate the videos and music in real time (unless it was all pre-computed and is now running on a loop).
Really enjoyable to watch, and with each demo like this I find that Gradio can handle something I thought was not possible - I didn't realise a video stream / full-screen video was doable.
The new interpolation is very nice, much smoother!
Thank you @matthoffner
@anwo Here is an explanation of the architecture of the WebTV.
The main code of the WebTV is located inside the media-server Space.
Manual steps:
- human input: write a short paragraph describing a multi-shot video sequence
- manually submit it to GPT-4 to generate a list of video captions, one for each shot (the system instructions are extracts from a Stable Diffusion guide); see the sketch after this list
- commit the captions to the playlist database
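For illustration, a minimal sketch of that GPT-4 step with the OpenAI Node SDK could look like this; the system prompt, model name and output format are placeholders, not the exact prompt guide extracts used by the WebTV:

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Turn a human-written paragraph into one caption per shot.
// The system instructions below are a stand-in for the real prompt guide extracts.
async function paragraphToCaptions(paragraph: string): Promise<string[]> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content:
          "You write Stable Diffusion style video captions. " +
          "Given a paragraph describing a multi-shot sequence, return one caption per shot, one per line.",
      },
      { role: "user", content: paragraph },
    ],
  });

  const text = completion.choices[0].message.content ?? "";
  return text.split("\n").map((line) => line.trim()).filter(Boolean);
}

// The returned captions are what gets committed to the playlist database.
```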
Inside the media-server Space (generation process running in the background):
- for each prompt in the database
- generate a silent 3-second video clip with Zeroscope V2 576w (hosted on Hugging Face Spaces)
- upscale the clip with Zeroscope V2 XL (also a HF Space)
- perform frame interpolation with FILM (also a HF Space)
- store the result in the Persistent Storage of the media-server Space (see the sketch after this list)
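To make those calls more concrete, here is a rough sketch using the @gradio/client JS package. The Space IDs, the "/predict" endpoint names and the payload shapes are assumptions for illustration, not the actual APIs exposed by the real Spaces:

```typescript
import { client } from "@gradio/client";

// Placeholder Space IDs, for illustration only.
const ZEROSCOPE_576W = "org/zeroscope-v2-576w-space";
const ZEROSCOPE_XL = "org/zeroscope-v2-xl-space";
const FILM = "org/film-frame-interpolation-space";

type GradioResult = { data: any[] };

// One caption in -> one upscaled, frame-interpolated clip out.
async function generateClip(caption: string) {
  // 1. Generate a silent ~3 second clip with Zeroscope V2 576w.
  const base = await client(ZEROSCOPE_576W);
  const raw = (await base.predict("/predict", [caption])) as GradioResult;

  // 2. Upscale the clip with Zeroscope V2 XL.
  const xl = await client(ZEROSCOPE_XL);
  const upscaled = (await xl.predict("/predict", [raw.data[0], caption])) as GradioResult;

  // 3. Smooth the motion with FILM frame interpolation.
  const film = await client(FILM);
  const smooth = (await film.predict("/predict", [upscaled.data[0]])) as GradioResult;

  // 4. The background process then writes the result to the Space's
  //    Persistent Storage (e.g. under /data) for the streaming loop to pick up.
  return smooth.data[0];
}
```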
Inside the media-server Space (streaming process running in the foreground):
- for each video file in the persistent storage folder
- add it to a new FFmpeg playlist (it's just a .txt file)
- broadcast it over the RTMP protocol using FFmpeg (in FLV format)
- serve the stream to viewers using node-media-server (see the sketch after this list)
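A minimal sketch of that streaming loop in Node/TypeScript could look like the following; the ports, paths, stream key and encoding flags are assumptions, not necessarily what the Space uses:

```typescript
import { spawn } from "node:child_process";
import { readdirSync, writeFileSync } from "node:fs";
import NodeMediaServer from "node-media-server";

// Serve the incoming RTMP stream over HTTP-FLV so the front end can play it.
const nms = new NodeMediaServer({
  rtmp: { port: 1935, chunk_size: 60000, gop_cache: true, ping: 30, ping_timeout: 60 },
  http: { port: 8000, allow_origin: "*" },
});
nms.run();

// Build an FFmpeg concat playlist (a plain .txt file) from the stored clips...
const clips = readdirSync("/data/videos").filter((f) => f.endsWith(".mp4"));
const playlist = clips.map((f) => `file '/data/videos/${f}'`).join("\n");
writeFileSync("/data/playlist.txt", playlist);

// ...and broadcast it to the local RTMP endpoint in FLV format.
// "-re" reads the input at its native frame rate, which is what makes this a live stream.
spawn("ffmpeg", [
  "-re",
  "-f", "concat", "-safe", "0",
  "-i", "/data/playlist.txt",
  "-c:v", "libx264", "-c:a", "aac",
  "-f", "flv",
  "rtmp://127.0.0.1:1935/live/webtv",
], { stdio: "inherit" });
```

With a setup like this, node-media-server would re-expose the RTMP stream as HTTP-FLV (e.g. /live/webtv.flv), which is what the front end can consume.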
Inside the AI-WebTV Space:
- display the stream using mpegts.js (sketched below)
- this doesn't work on iPhone, but now there is also a Twitch mirror
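For reference, a minimal mpegts.js player for an HTTP-FLV live stream looks roughly like this; the stream URL and element id are placeholders pointing at whatever endpoint the media-server Space exposes:

```typescript
import mpegts from "mpegts.js";

// mpegts.js relies on Media Source Extensions, which is why it fails on iPhone Safari.
if (mpegts.getFeatureList().mseLivePlayback) {
  const videoElement = document.getElementById("webtv") as HTMLVideoElement;

  // Placeholder URL: the HTTP-FLV endpoint served by node-media-server.
  const player = mpegts.createPlayer({
    type: "flv",
    isLive: true,
    url: "https://example.org/live/webtv.flv",
  });

  player.attachMediaElement(videoElement);
  player.load();
  player.play();
}
```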