---
library_name: transformers
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
---
# Model Card for Llama3-8B-1.58-100B-tokens-GGUF
The **Llama3-8B-1.58-100B-tokens-q2b0** model is a quantized version of [HF1BitLLM/Llama3-8B-1.58-100B-tokens](https://huggingface.co./HF1BitLLM/Llama3-8B-1.58-100B-tokens), produced with the **q2b0 quantization method** from Candle. This enables extreme compression while maintaining strong performance across a range of NLP tasks.
## Model Details
### Model Sources
- **Repository:** [HF1BitLLM/Llama3-8B-1.58-100B-tokens](https://huggingface.co./HF1BitLLM/Llama3-8B-1.58-100B-tokens)
- **Quantization PR:** [Candle q2b0 Quantization](https://github.com/huggingface/candle/pull/2683)
## Quantization Details
The model was quantized with Candle's **q2b0** method, which significantly reduces model size while preserving performance. For more details on this quantization technique, see [Candle PR #2683](https://github.com/huggingface/candle/pull/2683).
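The q2b0 kernels themselves live in the Candle PR linked above. As a rough illustration of the underlying idea (not the Candle implementation), "1.58-bit" weights are ternary, taking values in {-1, 0, +1} plus a scale, and can be produced with BitNet-style absmean rounding. A minimal NumPy sketch, purely illustrative:

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Quantize weights to {-1, 0, +1} with a per-tensor scale
    (BitNet b1.58-style absmean quantization; illustrative only,
    not the actual q2b0 kernel from the Candle PR)."""
    scale = float(np.abs(w).mean()) + eps        # per-tensor absmean scale
    q = np.clip(np.round(w / scale), -1, 1)      # ternary codes in {-1, 0, 1}
    return q.astype(np.int8), scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct approximate float weights from ternary codes."""
    return q.astype(np.float32) * scale

# Example: quantize a random weight matrix and reconstruct it.
w = np.random.randn(4, 8).astype(np.float32)
q, s = absmean_ternary_quantize(w)
w_hat = dequantize(q, s)
print(sorted(np.unique(q).tolist()))
```

Each ternary code needs only 2 bits of storage (log2(3) ≈ 1.58 effective bits), which is where the extreme compression comes from; the real kernel additionally packs codes into blocks and performs matmuls directly on the packed representation.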
## Training Details
For details on the dataset and training process, refer to the original [HF1BitLLM/Llama3-8B-1.58-100B-tokens](https://huggingface.co./HF1BitLLM/Llama3-8B-1.58-100B-tokens) model card.