Update README.md
README.md CHANGED
@@ -33,23 +33,6 @@ DCLM-Baseline-7B is a 7 billion parameter language model trained on the DCLM-Bas
 - **Dataset:** https://huggingface.co/datasets/mlfoundations/dclm-baseline-1.0
 - **Paper:** [DataComp-LM: In search of the next generation of training sets for language models](https://arxiv.org/abs/2406.11794)
 
-## Uses
-
-### Inference
-
-To use the model for inference:
-
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-
-model = AutoModelForCausalLM.from_pretrained("datacomp/dclm-baseline-7b")
-tokenizer = AutoTokenizer.from_pretrained("datacomp/dclm-baseline-7b")
-
-prompt = "Language modeling is"
-inputs = tokenizer(prompt, return_tensors="pt")
-outputs = model.generate(**inputs, max_new_tokens=50)
-print(tokenizer.decode(outputs[0]))
-```
 
 ### Training Details
 