⚡ ZeroGPU: New version rolled out! (Sept 2024)
Hello everybody,
We've rolled out a major update to ZeroGPU! All the Spaces are now running on it.
Major improvements:
- GPU cold starts about twice as fast!
- RAM usage reduced by two-thirds, allowing more effective resource usage, meaning more GPUs for the community!
- ZeroGPU initializations (coldstarts) can now be tracked and displayed (use progress=gr.Progress(track_tqdm=True))
- Improved compatibility and PyTorch integration, increasing ZeroGPU compatible spaces without requiring any modifications!
Feel free to answer in this discussion if you have any questions!
🤗 Best regards,
Charles
Hi Charles.
The results from ZeroGPU differ from those on my local machine / Hugging Face's L4 GPU, even with the same code and Python dependencies.
For more information, visit: https://huggingface.co./spaces/zero-gpu-explorers/README/discussions/111
Hi.
I found a very strange behavior. It is hard to find and would never happen locally. Maybe it is related to the bug above.
https://huggingface.co./spaces/zero-gpu-explorers/README/discussions/104#66f66a4b693f423f5b6d9b2e
This time, the behavior of Gradio's Cancel task appears to be wrong. In the worst case, it may be a problem with the Queue in general.
https://huggingface.co./spaces/zero-gpu-explorers/README/discussions/113#66fbc59085944df7944ff4aa
Hi @cbensimon !
Is there an example of how to show cold-start time to users, as mentioned here?
ZeroGPU initializations (coldstarts) can now be tracked and displayed (use progress=gr.Progress(track_tqdm=True))
In my ZeroGPU code, I assumed this meant adding it to the spaces.GPU decorator:
@spaces.GPU(duration=40, progress=gr.Progress(track_tqdm=True))
But I'm not seeing any visual indicator! There's no error thrown, but also no difference with that arg on or off, so not quite sure what I need to change! Thank you for your help!
(Code here if it helps: https://huggingface.co./spaces/WillHeld/diva-audio-chat/blob/main/app.py#L61)
I have heard that nest-asyncio, which is newly starting to be used in the spaces library, has quite a few problems around memory management.
I would like the library authors to find an alternative if possible.
I found that my model inference is much slower (about 5x) on ZeroGPU than on my local GPU (V100, 16GB). Is there a way to speed it up?
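Before comparing the two setups, it may help to check whether the slowdown sits in the first call (which can include start-up work) or in every call. The sketch below times the first call separately from later ones using only the standard library. time_calls and fake_inference are hypothetical names, and nothing here is specific to ZeroGPU or confirmed as the cause in this thread.

```python
import time


def time_calls(fn, *args, repeats=3):
    """Return (first_call_seconds, mean_seconds_of_later_calls)."""
    start = time.perf_counter()
    fn(*args)
    first = time.perf_counter() - start

    later = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        later.append(time.perf_counter() - start)
    return first, sum(later) / len(later)


def fake_inference(x):
    # Hypothetical stand-in for a model call. For real GPU work in PyTorch,
    # call torch.cuda.synchronize() before reading the clock, because CUDA
    # kernels are launched asynchronously.
    time.sleep(0.01)
    return x * 2


first, steady = time_calls(fake_inference, 21)
print(f"first call: {first:.3f}s, later calls (mean): {steady:.3f}s")
```

If only the first call is slow, the gap is more likely start-up cost than raw inference speed; if every call is slow, the comparison with the local GPU is more meaningful.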