---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
language:
- en
- de
- fr
- it
- pt
- hi
- es
- th
library_name: transformers
license: llama3.1
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
model-index:
- name: Meta-Llama-3.1-8B-Instruct-NF4
  results: []
---

# Meta-Llama-3.1-8B-Instruct-NF4


This is a **4-bit** (NF4) quantized version of `Llama 3.1 8B Instruct`, produced with `bitsandbytes` and `accelerate`. A sketch of the quantization setup is shown below the model details.

- **Developed by:** Farid Saud @ DSRS
- **License:** llama3.1
- **Base Model:** meta-llama/Meta-Llama-3.1-8B-Instruct
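
The following sketch illustrates how a 4-bit NF4 quantization of the base model can be produced with `BitsAndBytesConfig`. It shows the general approach rather than the exact script used to create this checkpoint; in particular, the `bnb_4bit_compute_dtype` choice is an assumption.

```python
# Sketch: quantizing the base model to 4-bit NF4 with bitsandbytes + accelerate
# (illustrative only; not necessarily the exact settings used for this repo)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype (assumption)
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

# The quantized weights can then be saved locally or pushed to the Hub,
# e.g. with model.save_pretrained(...) or model.push_to_hub(...).
```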

## Use this model


Use a pipeline as a high-level helper:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="fsaudm/Meta-Llama-3.1-8B-Instruct-NF4")
pipe(messages)
```



Or load the tokenizer and model directly:
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("fsaudm/Meta-Llama-3.1-8B-Instruct-NF4")
model = AutoModelForCausalLM.from_pretrained("fsaudm/Meta-Llama-3.1-8B-Instruct-NF4")
```
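
A minimal end-to-end generation example using the Llama 3.1 chat template is sketched below. It assumes a CUDA-capable GPU with `bitsandbytes` and `accelerate` installed; `max_new_tokens=128` is an arbitrary choice.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "fsaudm/Meta-Llama-3.1-8B-Instruct-NF4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a chat prompt with the model's chat template and generate a reply
messages = [{"role": "user", "content": "Who are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```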

Information about the base model can be found in the original [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co./meta-llama/Meta-Llama-3.1-8B-Instruct) model card.