GPU VRAM requirements

#1 by bbooth - opened

Has anyone gotten this to run on an NVIDIA GPU?

I keep running out of memory on an RTX 3090 with 24 GB.

Artificial Intelligence & Machine Learning Lab at TU Darmstadt org

I have not tried the 3090 yet, but it works fine on H100 and A100 chips. Maybe you can try quantization if the model doesn't fit.

Artificial Intelligence & Machine Learning Lab at TU Darmstadt org

But, as discussed here, inference on the 7B model should also work with 24 GB of VRAM.
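(Rough arithmetic: the 7B weights in fp16 take about 7 × 2 bytes ≈ 14 GB, which leaves roughly 10 GB of a 24 GB card for activations and the KV cache.)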

Thanks for the quick reply! I thought the 7B-hf model would fit on the 3090 (24 GB) using the supplied code, but I will try 8-bit quantization, along the lines of the sketch below.
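In case it helps others, here is a minimal sketch of 8-bit loading with transformers and bitsandbytes. The repo ID below is a placeholder for the actual 7B-hf checkpoint, and the model class may differ for this particular model:

```python
# Minimal 8-bit loading sketch, assuming transformers, accelerate, and
# bitsandbytes are installed. "org/model-7b-hf" is a placeholder repo ID.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "org/model-7b-hf"  # placeholder: replace with the actual checkpoint

quant_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # lets accelerate place the weights on the 24 GB GPU
)
```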

LukasHug changed discussion status to closed
