---
library_name: transformers
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
---

# Model Card for Llama3-8B-1.58-100B-tokens-GGUF

### Llama3-8B-1.58-100B-tokens-GGUF

**Llama3-8B-1.58-100B-tokens-q2b0** is a quantized version of [HF1BitLLM/Llama3-8B-1.58-100B-tokens](https://huggingface.co./HF1BitLLM/Llama3-8B-1.58-100B-tokens), produced with the **q2b0 quantization method** from Candle. This enables extreme compression while maintaining strong performance across various NLP tasks.

## Model Details

### Model Sources

- **Repository:** [HF1BitLLM/Llama3-8B-1.58-100B-tokens](https://huggingface.co./HF1BitLLM/Llama3-8B-1.58-100B-tokens)
- **Quantization PR:** [Candle q2b0 Quantization](https://github.com/huggingface/candle/pull/2683)

## Quantization Details

The model has been quantized using the **q2b0** method from Candle, which reduces the model size significantly while preserving performance. For more details on this quantization technique, see [Candle PR #2683](https://github.com/huggingface/candle/pull/2683).

## Training Details

For details on the dataset and training process, see the original [Llama3-8B-1.58-100B-tokens](https://huggingface.co./HF1BitLLM/Llama3-8B-1.58-100B-tokens) model card.
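
## Usage

The GGUF file can be loaded with Candle's quantized-Llama support. Below is a minimal Rust sketch, assuming a Candle build that includes the q2b0 kernels from PR #2683; the GGUF file name is a placeholder for the actual file in this repository.

```rust
// Minimal sketch: parse the GGUF container and build the quantized Llama
// weights with Candle. Assumes candle-core and candle-transformers from a
// build that includes the q2b0 support of PR #2683.
use candle_core::quantized::gguf_file;
use candle_core::Device;
use candle_transformers::models::quantized_llama::ModelWeights;

fn main() -> anyhow::Result<()> {
    // Placeholder file name; replace with the actual GGUF file from this repo.
    let path = "llama3-8b-1.58-100b-tokens-q2b0.gguf";
    let mut file = std::fs::File::open(path)?;

    // Read the GGUF header, metadata key/value pairs, and per-tensor info
    // (name, shape, quantization type).
    let content = gguf_file::Content::read(&mut file)?;
    for (name, info) in content.tensor_infos.iter().take(5) {
        println!("{name}: {:?} ({:?})", info.shape, info.ggml_dtype);
    }

    // Build the quantized Llama weights on the chosen device; quantized
    // tensors are dequantized on the fly during the forward pass.
    let device = Device::Cpu;
    let model = ModelWeights::from_gguf(content, &mut file, &device)?;
    let _ = model;
    Ok(())
}
```

From there, generation follows Candle's quantized-llama example: tokenize a prompt with the Llama 3 tokenizer and feed tokens through `ModelWeights::forward` one position at a time.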