CUDA out of memory. Tried to allocate 564.00 MiB (GPU 0; 7.98 GiB total capacity; 7.52 GiB already allocated; 446.00 MiB free; 7.55 GiB reserved in total by PyTorch)
#25 opened by davisitoo
Is there any compilation argument that I can add to make it work with an 8GB GPU?
The model requires ~16GB of GPU memory to run comfortably, so it will not fit on an 8GB card as-is. You could have a look at FalconTune to run a 4-bit quantized version of the model.
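For reference, here is a minimal sketch of loading the model in 4-bit via the transformers + bitsandbytes path (not FalconTune itself); the model id `tiiuae/falcon-7b-instruct` and the quantization settings are assumptions for illustration, not a confirmed recipe from this thread:

```python
# Sketch: load Falcon in 4-bit so the weights fit in roughly 8 GB of VRAM.
# Assumes transformers (with accelerate) and bitsandbytes are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "tiiuae/falcon-7b-instruct"  # illustrative model id

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",        # let accelerate place layers on the GPU
    trust_remote_code=True,   # Falcon shipped custom modeling code at release
)

inputs = tokenizer("Falcon is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

4-bit weights cut the ~16GB fp16 footprint to roughly a quarter, leaving headroom on an 8GB GPU for activations and the KV cache.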
FalconLLM changed discussion status to closed