---
library_name: transformers
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
---

# Model Card for Llama3-8B-1.58-100B-tokens-GGUF

### Llama3-8B-1.58-100B-tokens-GGUF

**Llama3-8B-1.58-100B-tokens-q2b0** is a quantized version of [HF1BitLLM/Llama3-8B-1.58-100B-tokens](https://huggingface.co./HF1BitLLM/Llama3-8B-1.58-100B-tokens), produced with the **q2b0 quantization method** from Candle. This enables extreme compression while maintaining strong performance across various NLP tasks.

## Model Details

### Model Sources

- **Repository:** [HF1BitLLM/Llama3-8B-1.58-100B-tokens](https://huggingface.co./HF1BitLLM/Llama3-8B-1.58-100B-tokens)
- **Quantization PR:** [Candle q2b0 Quantization](https://github.com/huggingface/candle/pull/2683)

## Quantization Details

The model has been quantized using the **q2b0** method from Candle, which reduces the model size significantly while preserving performance. For more details on this quantization technique, see [Candle PR #2683](https://github.com/huggingface/candle/pull/2683).

## Training Details

For details on the dataset and training process, see the original [Llama3-8B-1.58-100B-tokens](https://huggingface.co./HF1BitLLM/Llama3-8B-1.58-100B-tokens) model card.
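
## Usage

The GGUF file can be loaded with Candle's quantized-Llama support. Below is a minimal Rust sketch, assuming a Candle build that includes the q2b0 kernels from PR #2683; the GGUF file name is a placeholder for the actual file in this repository.

```rust
// Minimal sketch: parse the GGUF container and build the quantized Llama
// weights with Candle. Assumes candle-core and candle-transformers from a
// build that includes the q2b0 support of PR #2683.
use candle_core::quantized::gguf_file;
use candle_core::Device;
use candle_transformers::models::quantized_llama::ModelWeights;

fn main() -> anyhow::Result<()> {
    // Placeholder file name; replace with the actual GGUF file from this repo.
    let path = "llama3-8b-1.58-100b-tokens-q2b0.gguf";
    let mut file = std::fs::File::open(path)?;

    // Read the GGUF header, metadata key/value pairs, and per-tensor info
    // (name, shape, quantization type).
    let content = gguf_file::Content::read(&mut file)?;
    for (name, info) in content.tensor_infos.iter().take(5) {
        println!("{name}: {:?} ({:?})", info.shape, info.ggml_dtype);
    }

    // Build the quantized Llama weights on the chosen device; quantized
    // tensors are dequantized on the fly during the forward pass.
    let device = Device::Cpu;
    let model = ModelWeights::from_gguf(content, &mut file, &device)?;
    let _ = model;
    Ok(())
}
```

From there, generation follows Candle's quantized-llama example: tokenize a prompt with the Llama 3 tokenizer and feed tokens through `ModelWeights::forward` one position at a time.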