Please create a quantized version, preferably using bitsandbytes!
Really like the model, but I'd like to use it with bitsandbytes...
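For context, this is roughly what I have in mind — an untested sketch that loads the checkpoint with on-the-fly 4-bit bitsandbytes quantization through transformers (assuming it loads through the Qwen2-VL classes, as the base model does; not an official recipe):

```python
# Untested sketch: load allenai/olmOCR-7B-0225-preview with on-the-fly 4-bit
# bitsandbytes quantization via transformers.
import torch
from transformers import AutoProcessor, BitsAndBytesConfig, Qwen2VLForConditionalGeneration

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit NF4 weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16
)

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "allenai/olmOCR-7B-0225-preview",
    quantization_config=bnb_config,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("allenai/olmOCR-7B-0225-preview")
```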
Hey @ctranslate2-4you, check this out: https://huggingface.co./allenai/olmOCR-7B-0225-preview-GGUF
How do we run inference on an image + prompt pair using the GGUF?
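The rough shape I'd expect is something like the sketch below, but this is purely a guess on my side: it assumes the GGUF repo also ships a vision projector (mmproj) file and uses llama.cpp's multimodal CLI; the binary name, flags, and file names may differ by build.

```python
# Untested sketch: drive llama.cpp's multimodal CLI from Python for one image + prompt.
# Assumptions: the binary name (newer builds ship llama-mtmd-cli, older ones a
# model-specific CLI such as llama-qwen2vl-cli), the GGUF/mmproj file names, and
# whether the linked repo actually includes an mmproj file.
import subprocess

result = subprocess.run(
    [
        "llama-mtmd-cli",
        "-m", "olmOCR-7B-0225-preview-Q4_K_M.gguf",        # placeholder file name
        "--mmproj", "mmproj-olmOCR-7B-0225-preview.gguf",  # placeholder file name
        "--image", "page_1.png",
        "-p", "Return the plain text of this page.",
    ],
    capture_output=True,
    text=True,
)
print(result.stdout)
```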
Nice, but I'd prefer to use BNB for now. Do you plan on making one? Otherwise, I'd have to pull in a bunch of llama.cpp dependencies just for this.
No, not at the moment. If you do make one, could you push it to HF?
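If you go the transformers + bitsandbytes route, pushing it would look roughly like the sketch below (the repo id is a placeholder; serializing 4-bit weights needs reasonably recent transformers and bitsandbytes):

```python
# Placeholder sketch of quantize-and-push; the target repo id below is made up.
from transformers import AutoProcessor, BitsAndBytesConfig, Qwen2VLForConditionalGeneration

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "allenai/olmOCR-7B-0225-preview",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("allenai/olmOCR-7B-0225-preview")

# Upload the quantized weights and the processor config to the Hub.
model.push_to_hub("your-username/olmOCR-7B-0225-preview-bnb-4bit")
processor.push_to_hub("your-username/olmOCR-7B-0225-preview-bnb-4bit")
```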