---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
language:
- en
- de
- fr
- it
- pt
- hi
- es
- th
library_name: transformers
license: llama3.1
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
model-index:
- name: Meta-Llama-3.1-8B-Instruct-NF4
results: []
---
# Model Card for Meta-Llama-3.1-8B-Instruct-NF4
This is a quantized version of `Llama 3.1 8B Instruct`, quantized to **4-bit** (NF4) using `bitsandbytes` and `accelerate`.
- **Developed by:** Farid Saud @ DSRS
- **License:** llama3.1
- **Base Model:** meta-llama/Meta-Llama-3.1-8B-Instruct
## Use this model
Use a pipeline as a high-level helper:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="fsaudm/Meta-Llama-3.1-8B-Instruct-NF4")
pipe(messages)
```
Or load the model directly:
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("fsaudm/Meta-Llama-3.1-8B-Instruct-NF4")
model = AutoModelForCausalLM.from_pretrained("fsaudm/Meta-Llama-3.1-8B-Instruct-NF4")
```
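To get a rough sense of the memory savings: weights drop from 2 bytes per parameter (bf16) to about 0.5 bytes (4 bits). A back-of-envelope estimate, using an approximate parameter count for Llama 3.1 8B and ignoring quantization constants, activations, and the KV cache:

```python
# Back-of-envelope weight-memory estimate (illustrative only; real usage
# adds quantization constants, activations, and KV cache).
params = 8_030_261_248               # approximate Llama 3.1 8B parameter count
bf16_gb = params * 2 / 1024**3       # 2 bytes per parameter in bf16
nf4_gb = params * 0.5 / 1024**3      # 4 bits = 0.5 bytes per parameter
print(f"bf16 weights: ~{bf16_gb:.1f} GiB, NF4 weights: ~{nf4_gb:.1f} GiB")
```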
More information on the base model can be found in the original [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co./meta-llama/Meta-Llama-3.1-8B-Instruct) model card.