shanearora
committed on
Update README.md
README.md
CHANGED
@@ -9,9 +9,9 @@ language:
|
|
9 |
|
10 |
<img src="https://allenai.org/olmo/olmo-7b-animation.gif" alt="OLMo Logo" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
|
11 |
|
12 |
-
# Model Card for OLMo
|
13 |
|
14 |
-
OLMo
|
15 |
**This version is for direct use with HuggingFace Transformers** from v4.40 on.
|
16 |
|
17 |
OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
|
@@ -26,27 +26,22 @@ The core models released in this batch are the following:
|
|
26 |
| [OLMo 1B](https://huggingface.co/allenai/OLMo-1B) | 3 Trillion |16 | 2048 | 16 | 2048 |
|
27 |
| [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) | 2.5 Trillion | 32 | 4096 | 32 | 2048 |
|
28 |
| [OLMo 7B Twin 2T](https://huggingface.co/allenai/OLMo-7B-Twin-2T) | 2 Trillion | 32 | 4096 | 32 | 2048 |
|
29 |
-
| [OLMo
|
30 |
|
31 |
-
*Note: OLMo
|
32 |
-
|
33 |
-
|
34 |
-
[Coming soon] We are releasing many checkpoints for these models, for every 1000 training steps.
|
35 |
-
The naming convention is `step1000-tokens4B`.
|
36 |
|
37 |
To load a specific model revision with HuggingFace, simply add the argument `revision`:
|
38 |
```python
|
39 |
-
olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-
|
40 |
```
|
41 |
|
42 |
All revisions/branches are listed in the file `revisions.txt`.
|
43 |
Or, you can access all the revisions for the models via the following code snippet:
|
44 |
```python
|
45 |
from huggingface_hub import list_repo_refs
|
46 |
-
out = list_repo_refs("allenai/OLMo-
|
47 |
branches = [b.name for b in out.branches]
|
48 |
```
|
49 |
-
A few revisions were lost due to an error, but the vast majority are present.
|
50 |
|
51 |
### Model Description
|
52 |
|
@@ -75,13 +70,11 @@ A few revisions were lost due to an error, but the vast majority are present.
|
|
75 |
|
76 |
### Inference
|
77 |
|
78 |
-
|
79 |
-
|
80 |
-
Now, proceed as usual with HuggingFace:
|
81 |
```python
|
82 |
from transformers import AutoModelForCausalLM, AutoTokenizer
|
83 |
-
olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-
|
84 |
-
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-
|
85 |
message = ["Language modeling is "]
|
86 |
inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
|
87 |
# optional verifying cuda
|
@@ -94,20 +87,14 @@ print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
|
|
94 |
Alternatively, with the pipeline abstraction:
|
95 |
```python
|
96 |
from transformers import pipeline
|
97 |
-
olmo_pipe = pipeline("text-generation", model="allenai/OLMo-
|
98 |
print(olmo_pipe("Language modeling is "))
|
99 |
>> 'Language modeling is a branch of natural language processing that aims to...'
|
100 |
```
|
101 |
|
102 |
-
Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-
|
103 |
The quantized model is more sensitive to data types and CUDA device placement, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
|
104 |
|
105 |
-
Note, you may see the following error if `ai2-olmo` is not installed correctly, which is caused by internal Python check naming. We'll update the code soon to make this error clearer.
|
106 |
-
```bash
|
107 |
-
raise ImportError(
|
108 |
-
ImportError: This modeling file requires the following packages that were not found in your environment: hf_olmo. Run `pip install hf_olmo`
|
109 |
-
```
|
110 |
-
|
111 |
### Fine-tuning
|
112 |
Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or from one of the many intermediate checkpoints. Two recipes for tuning are available.
|
113 |
1. Fine-tune with the OLMo repository:
|
@@ -225,7 +212,7 @@ Optimizer settings comparison with peer models.
|
|
225 |
|
226 |
|
227 |
|
228 |
-
## Environmental Impact
|
229 |
|
230 |
OLMo 7B variants were trained either on MI250X GPUs at the LUMI supercomputer or on A100-40GB GPUs provided by MosaicML.
|
231 |
The table below summarizes the environmental impact. Further details are available in the paper.
|
@@ -233,7 +220,7 @@ A summary of the environmental impact. Further details are available in the pape
|
|
233 |
| | GPU Type | Power Consumption From GPUs | Carbon Intensity (kg CO₂e/kWh) | Carbon Emissions (tCO₂eq) |
|
234 |
|-----------|------------|-----------------------------|--------------------------------|---------------------------|
|
235 |
| OLMo 7B Twin | MI250X ([LUMI supercomputer](https://www.lumi-supercomputer.eu)) | 135 MWh | 0* | 0* |
|
236 |
-
| OLMo 7B | A100-40GB ([MosaicML](https://www.mosaicml.com)) | 104 MWh | 0.656 | 75.05 |
|
237 |
|
238 |
## Bias, Risks, and Limitations
|
239 |
|
|
|
9 |
|
10 |
<img src="https://allenai.org/olmo/olmo-7b-animation.gif" alt="OLMo Logo" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
|
11 |
|
12 |
+
# Model Card for OLMo 7B April 2024
|
13 |
|
14 |
+
OLMo 7B April 2024 is an updated version of the original [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) model with a 24-point increase in MMLU, among other evaluation improvements, resulting from an improved version of the Dolma dataset and staged training.
|
15 |
**This version is for direct use with HuggingFace Transformers** from v4.40 on.
|
16 |
|
17 |
OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
|
|
|
26 |
| [OLMo 1B](https://huggingface.co/allenai/OLMo-1B) | 3 Trillion |16 | 2048 | 16 | 2048 |
|
27 |
| [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) | 2.5 Trillion | 32 | 4096 | 32 | 2048 |
|
28 |
| [OLMo 7B Twin 2T](https://huggingface.co/allenai/OLMo-7B-Twin-2T) | 2 Trillion | 32 | 4096 | 32 | 2048 |
|
29 |
+
| [OLMo 7B April 2024](https://huggingface.co/allenai/OLMo-7B-0424-hf) | 2.05 Trillion | 32 | 4096 | 32 | 4096 |
|
30 |
|
31 |
+
*Note: OLMo 7B April 2024 also includes QKV clipping.*
|
|
|
|
|
|
|
|
|
32 |
|
33 |
To load a specific model revision with HuggingFace, simply add the argument `revision`:
|
34 |
```python
|
35 |
+
olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-0424-hf", revision="step1000-tokens4B")
|
36 |
```
|
37 |
|
38 |
All revisions/branches are listed in the file `revisions.txt`.
|
39 |
Or, you can access all the revisions for the models via the following code snippet:
|
40 |
```python
|
41 |
from huggingface_hub import list_repo_refs
|
42 |
+
out = list_repo_refs("allenai/OLMo-7B-0424-hf")
|
43 |
branches = [b.name for b in out.branches]
|
44 |
```
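
Revision names such as `step1000-tokens4B` appear to encode the training step and the token count in billions. The following is a minimal sketch for filtering and sorting such branch names. The helper names `parse_revision` and `sort_checkpoints`, the example branch `step2000-tokens8B`, and the assumption that every checkpoint branch matches this pattern are illustrative only and are not part of the OLMo tooling.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed pattern: "step<N>-tokens<M>B", as in "step1000-tokens4B".
_REVISION_RE = re.compile(r"^step(\d+)-tokens(\d+)B$")


class Revision(NamedTuple):
    name: str
    step: int
    tokens_billions: int


def parse_revision(name: str) -> Optional[Revision]:
    """Parse a checkpoint branch name; return None for branches such as "main"."""
    match = _REVISION_RE.match(name)
    if match is None:
        return None
    return Revision(name, int(match.group(1)), int(match.group(2)))


def sort_checkpoints(branches: List[str]) -> List[Revision]:
    """Keep only checkpoint branches and order them by training step."""
    parsed = (parse_revision(b) for b in branches)
    return sorted((r for r in parsed if r is not None), key=lambda r: r.step)


# Hypothetical branch names for illustration; in practice, pass the
# `branches` list produced by `list_repo_refs`.
example = ["main", "step2000-tokens8B", "step1000-tokens4B"]
print([r.name for r in sort_checkpoints(example)])
```

Sorting by the parsed step, rather than by the raw string, avoids lexicographic surprises such as `step10000` sorting before `step2000`.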
|
|
|
45 |
|
46 |
### Model Description
|
47 |
|
|
|
70 |
|
71 |
### Inference
|
72 |
|
73 |
+
Proceed as usual with HuggingFace:
|
|
|
|
|
74 |
```python
|
75 |
from transformers import AutoModelForCausalLM, AutoTokenizer
|
76 |
+
olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-0424-hf")
|
77 |
+
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-7B-0424-hf")
|
78 |
message = ["Language modeling is "]
|
79 |
inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
|
80 |
# optional verifying cuda
|
|
|
87 |
Alternatively, with the pipeline abstraction:
|
88 |
```python
|
89 |
from transformers import pipeline
|
90 |
+
olmo_pipe = pipeline("text-generation", model="allenai/OLMo-7B-0424-hf")
|
91 |
print(olmo_pipe("Language modeling is "))
|
92 |
>> 'Language modeling is a branch of natural language processing that aims to...'
|
93 |
```
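
In practice, the `text-generation` pipeline returns a list of dictionaries keyed by `generated_text` rather than a bare string. The sketch below shows one way to pull out the text. The helper `first_generated_text` is a hypothetical name, and `fake_outputs` is a stand-in for the real pipeline result so the example runs without downloading the model.

```python
from typing import Dict, List


def first_generated_text(outputs: List[Dict[str, str]]) -> str:
    """Return the text of the first generation from a text-generation pipeline result."""
    if not outputs:
        raise ValueError("pipeline returned no generations")
    return outputs[0]["generated_text"]


# Stand-in for `olmo_pipe("Language modeling is ")`; the text itself is
# copied from the example output shown in this README.
fake_outputs = [
    {"generated_text": "Language modeling is a branch of natural language processing that aims to..."}
]
print(first_generated_text(fake_outputs))
```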
|
94 |
|
95 |
+
Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-0424-hf", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
|
96 |
The quantized model is more sensitive to data types and CUDA device placement, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
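
Put together, the quantized path might look like the sketch below. It reuses the `from_pretrained` arguments and the `inputs.input_ids.to('cuda')` recommendation from the text. The wrapper name `generate_quantized` and the `max_new_tokens=100` setting are assumptions made for illustration, and running it requires a CUDA GPU with `bitsandbytes` installed.

```python
def generate_quantized(prompt: str, model_id: str = "allenai/OLMo-7B-0424-hf") -> str:
    """Load the model in 8-bit and generate a continuation for `prompt`."""
    # Imported lazily so this module can be imported on machines without a GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Same arguments as in the text above; requires `bitsandbytes`.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, load_in_8bit=True
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    inputs = tokenizer([prompt], return_tensors="pt", return_token_type_ids=False)
    # Move the token ids to the GPU explicitly, as recommended for the quantized model.
    input_ids = inputs.input_ids.to("cuda")

    # max_new_tokens=100 is an illustrative choice, not a recommended setting.
    response = model.generate(input_ids, max_new_tokens=100)
    return tokenizer.batch_decode(response, skip_special_tokens=True)[0]
```

Calling `generate_quantized("Language modeling is ")` downloads the weights on first use, so expect the first call to be slow.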
|
97 |
|
|
|
|
|
|
|
|
|
|
|
|
|
98 |
### Fine-tuning
|
99 |
Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or from one of the many intermediate checkpoints. Two recipes for tuning are available.
|
100 |
1. Fine-tune with the OLMo repository:
|
|
|
212 |
|
213 |
|
214 |
|
215 |
+
<!-- ## Environmental Impact
|
216 |
|
217 |
OLMo 7B variants were trained either on MI250X GPUs at the LUMI supercomputer or on A100-40GB GPUs provided by MosaicML.
|
218 |
The table below summarizes the environmental impact. Further details are available in the paper.
|
|
|
220 |
| | GPU Type | Power Consumption From GPUs | Carbon Intensity (kg CO₂e/kWh) | Carbon Emissions (tCO₂eq) |
|
221 |
|-----------|------------|-----------------------------|--------------------------------|---------------------------|
|
222 |
| OLMo 7B Twin | MI250X ([LUMI supercomputer](https://www.lumi-supercomputer.eu)) | 135 MWh | 0* | 0* |
|
223 |
+
| OLMo 7B | A100-40GB ([MosaicML](https://www.mosaicml.com)) | 104 MWh | 0.656 | 75.05 | -->
|
224 |
|
225 |
## Bias, Risks, and Limitations
|
226 |
|