GPU memory usage/requirement?
#1 by Bilibili
Thanks for this work!
Since the original StarCoder requires 60+ GB of GPU RAM for inference, I wonder how much the GPTQ version needs, and whether it could run inference on a V100-32G.
I'm totally new to GPTQ and not sure how to calculate the exact numbers, but it seems happy with 20-30 GB of CPU RAM, and only about 12 GB of my GPU memory is used.
Yes, 32 GB of VRAM is more than enough for nearly any model in GPTQ. This one needs around 12 GB.
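These figures line up with a back-of-the-envelope estimate of the weight memory alone. This is just a sketch: the ~15.5B parameter count for StarCoder is an assumption, and real usage adds overhead for activations and the KV cache on top of the weights.

```python
# Rough weight-memory estimate for a ~15.5B-parameter model at different precisions.
# Parameter count is an assumption for illustration; activations/KV cache add more on top.

def weight_gib(n_params: float, bits_per_param: float) -> float:
    """GiB occupied by the model weights alone."""
    return n_params * bits_per_param / 8 / 2**30

N = 15.5e9  # approximate StarCoder parameter count

print(f"fp32:  {weight_gib(N, 32):.1f} GiB")  # ~57.7 GiB -> matches the "60+ GB" figure
print(f"fp16:  {weight_gib(N, 16):.1f} GiB")  # ~28.9 GiB
print(f"4-bit: {weight_gib(N, 4):.1f} GiB")   # ~7.2 GiB weights; overhead pushes total toward ~12 GB
```

So the ~12 GB observed for the 4-bit GPTQ model is plausible: ~7 GiB of quantized weights plus runtime overhead, which fits comfortably on a V100-32G.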