ggml versions of Flan-Open-Llama-3b

Use with llama.cpp

Support is now merged to master branch.

K-quants

There are now more quantization types in llama.cpp, some lower than 4 bits. Currently these are not well supported because of technical reasons. If you want to use them, you have to build llama.cpp (from build 829 (ff5d58f)) with the LLAMA_QKK_64 Make or CMake variable enabled (see PR #2001). Then you can quantize the F16 or maybe Q8_0 version to what you want.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The model has no library tag.