Safetensors · English · olmo2
amanrangapur committed (verified)
Commit bb8557c · 1 Parent(s): 5dbe404

Update README.md

Files changed (1)
  1. README.md +15 -11
README.md CHANGED
@@ -13,22 +13,18 @@ language:
 
 # Model Card for OLMo2 13B
 
-OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
-The OLMo models are trained on the [Dolma](https://huggingface.co/datasets/allenai/dolma) dataset.
-We release all code, checkpoints, logs (coming soon), and details involved in training these models.
+OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
+These models are trained on the Dolma dataset. We are releasing all code, checkpoints, logs (coming soon), and associated training details.
+The core models released in this batch include the following:
 
-
-
-The core models released in this batch are the following:
 | Size | Training Tokens | Layers | Hidden Size | Attention Heads | Context Length |
 |------|--------|---------|-------------|-----------------|----------------|
 | [OLMo2-7B July 2024](https://huggingface.co/allenai/OLMo-7B-0724-hf) | 4 Trillion | 32 | 4096 | 32 | 4096 |
 | [OLMo2- 13B July 2024](https://huggingface.co/allenai/OLMo-1B-0724-hf) | 5 Trillion | 40 | 5120 | 42 | 4096 |
 
-
 ## Inference
 
-Proceed as usual with HuggingFace:
+You can use OLMo with the standard HuggingFace transformers library:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo2-13B-1124")
@@ -43,8 +39,16 @@ print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
 >> 'Language modeling is the first step to build natural language generation...'
 ```
 
-Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo2-13B-1124", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
-The quantized model is more sensitive to typing / cuda, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
+For faster performance, you can quantize the model using the following method:
+```python
+AutoModelForCausalLM.from_pretrained("allenai/OLMo2-13B-1124",
+    torch_dtype=torch.float16,
+    load_in_8bit=True)  # Requires bitsandbytes package
+```
+The quantized model is more sensitive to data types and CUDA operations. To avoid potential issues, it's recommended to pass the inputs directly to CUDA using:
+```python
+inputs.input_ids.to('cuda')
+```
 
 We have released checkpoints for these models, for every 1000 training steps.
 The naming convention is `stepXXX-tokensYYYB`.
@@ -122,7 +126,7 @@ Core model results for OLMo 7B models are found below.
 
 And for 13B models:
 
-| task | random | [StableLM 2 1.6b](https://huggingface.co/stabilityai/stablelm-2-1_6b)\* | [Pythia 1B](https://huggingface.co/EleutherAI/pythia-1b) | [TinyLlama 1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T) | [OLMo 1.0 1B](https://huggingface.co/allenai/OLMo-1B-hf) | **OLMo 1B July 2024** |
+| Task | Random | [StableLM 2 1.6b](https://huggingface.co/stabilityai/stablelm-2-1_6b)\* | [Pythia 1B](https://huggingface.co/EleutherAI/pythia-1b) | [TinyLlama 1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T) | [OLMo 1.0 1B](https://huggingface.co/allenai/OLMo-1B-hf) | **OLMo 1B July 2024** |
 | ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------ | ----------------- | --------- | -------------------------------------- | ------- | ------ |
 | arc_challenge | 25 | 43.81 | 33.11 | 34.78 | 34.45 | 36.5 |
 | arc_easy | 25 | 63.68 | 50.18 | 53.16 | 58.07 | 55.3 |
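The hunks above show only the beginning and end of the README's inference snippet. For reference, a minimal end-to-end sketch of the usage it describes follows; the prompt and generation settings are illustrative assumptions rather than lines taken from the commit.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the repository id used in the diff.
olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo2-13B-1124")
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo2-13B-1124")

# Tokenize a prompt and sample a continuation (settings here are illustrative).
message = ["Language modeling is "]
inputs = tokenizer(message, return_tensors="pt", return_token_type_ids=False)
response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)

print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
# >> 'Language modeling is the first step to build natural language generation...'
```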
 
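The added lines describe loading the model in 8-bit with `bitsandbytes` and passing the input ids to CUDA. Below is a sketch that puts those two pieces together, assuming a CUDA device and the `bitsandbytes` package are available; the generation settings are again illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 8-bit load as described in the changed lines; requires the bitsandbytes
# package and a CUDA-capable GPU.
olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo2-13B-1124",
    torch_dtype=torch.float16,
    load_in_8bit=True,
)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo2-13B-1124")

# Pass the input ids to CUDA explicitly, as the README recommends for the
# quantized model.
inputs = tokenizer(["Language modeling is "], return_tensors="pt")
input_ids = inputs.input_ids.to("cuda")
response = olmo.generate(input_ids=input_ids, max_new_tokens=100, do_sample=True)

print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```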
 
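The README also notes checkpoints for every 1000 training steps, named `stepXXX-tokensYYYB`. These can be loaded through the `revision` argument of `from_pretrained`; the branch name below is a hypothetical example of that convention, not a verified checkpoint.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical revision following the stepXXX-tokensYYYB convention;
# check the repository's branch list for the checkpoints that actually exist.
checkpoint = "step1000-tokens5B"

olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo2-13B-1124", revision=checkpoint)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo2-13B-1124", revision=checkpoint)
```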
 