ggml versions of Flan-Open-Llama-3b
- Announcement: Tweet by @EnricoShippole ("open-source")
- Model: conceptofmind/Flan-Open-Llama-3b
- Base Model: openlm-research/open_llama_3b (OpenLLaMA: An Open Reproduction of LLaMA, Apache 2.0)
- Dataset: FLAN (Apache 2.0)
- llama.cpp: build 607 (ffb06a3) or later
- Type: instruct
Use with llama.cpp
Support is now merged into the master branch.
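For example, a minimal run with the prebuilt `main` binary might look like the following; the model filename is a placeholder for whichever quantized ggml file you downloaded from this repo:

```sh
# Clone and build llama.cpp (build 607 or later)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Run the instruct model with a prompt
# (flan-open-llama-3b-q4_0.bin is a placeholder filename)
./main -m ./models/flan-open-llama-3b-q4_0.bin \
       -p "Translate to German: How are you today?" \
       -n 128 --temp 0.7
```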
K-quants
There are now more quantization types in llama.cpp, some lower than 4 bits.
Currently these are not well supported because of technical reasons.
If you want to use them, you have to build llama.cpp (from build 829 (ff5d58f)) with the LLAMA_QKK_64
Make or CMake variable enabled (see PR #2001).
Then you can quantize the F16 or maybe Q8_0 version to what you want.
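A minimal sketch of that workflow, assuming a hypothetical F16 file named flan-open-llama-3b-f16.bin:

```sh
# Build with 64-element k-quant super-blocks enabled (PR #2001)
make clean
make LLAMA_QKK_64=1
# or, with CMake:
# cmake -B build -DLLAMA_QKK_64=ON && cmake --build build

# Quantize the F16 file to a k-quant type, e.g. Q4_K_M
./quantize ./models/flan-open-llama-3b-f16.bin \
           ./models/flan-open-llama-3b-q4_k_m.bin Q4_K_M
```

Note that files produced with LLAMA_QKK_64 use a different block layout, so they should only be loaded with a llama.cpp binary built with the same setting.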