Quantization support
#12 opened by Kernel
Is there any way to get 8-bit quantization? BTW, what is the model size? 7B? I can't find this information.
Hi,
There is an ongoing effort to port Kosmos-2 directly into transformers. This repository (remote code) might need some more bug fixes later, including breaking changes. I would suggest waiting for the official support (where I will try to make it work with quantization, but I can't 100% guarantee that at this moment).
Regarding the model size, the paper says "The total number of trainable parameters amounts to approximately 1.6B", but I didn't verify it myself. The model file (PyTorch bin file) is 6.6 GB, however.
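A quick back-of-the-envelope check (my own assumption, not from the paper: weights stored in fp32 at 4 bytes per parameter) suggests the 6.6 GB file is roughly consistent with ~1.6B parameters, and also shows what 8-bit quantization would buy:

```python
# Assumption: checkpoint stores weights in fp32 (4 bytes/param).
params = 1.6e9  # ~1.6B trainable parameters, per the paper

fp32_gb = params * 4 / 1e9  # fp32 size in GB: close to the 6.6 GB file
int8_gb = params * 1 / 1e9  # 8-bit size in GB: roughly a 4x reduction

print(f"fp32: {fp32_gb:.1f} GB, int8: {int8_gb:.1f} GB")
```

The small gap to 6.6 GB is plausibly non-weight state in the checkpoint (buffers, metadata), so the 1.6B figure looks about right.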
I will close this issue. Feel free to reopen it once an official port is merged into transformers. Thank you.
ydshieh changed discussion status to closed