Update README.md
README.md
@@ -35,8 +35,8 @@ base_model:
 # Model Card for Teuken-7B-instruct-v0.4
 
 
-Teuken-7B-base-v0.4 is a 7B parameter multilingual large language model (LLM) pre-trained with 4T tokens within the research project OpenGPT-X.
-Teuken-7B-instruct-v0.4 is an instruction-tuned version of Teuken-7B-base-v0.4.
+[Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4) is a 7B parameter multilingual large language model (LLM) pre-trained with 4T tokens within the research project OpenGPT-X.
+Teuken-7B-instruct-v0.4 is an instruction-tuned version of [Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4).
 
 
 ### Model Description
@@ -69,7 +69,7 @@ The model is not intended for use in math and coding tasks.
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
-Teuken-7B-instruct-v0.4 is an instruction-tuned version of Teuken-7B-base-v0.4 that is not completely free from biases and hallucinations.
+Teuken-7B-instruct-v0.4 is an instruction-tuned version of [Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4) that is not completely free from biases and hallucinations.
 
 ## How to Get Started with the Model
 
@@ -135,7 +135,7 @@ This example demonstrates how to load the model and tokenizer, prepare input, ge
 
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
-Teuken-7B-base-v0.4 was pre-trained on 4 trillion tokens of data from publicly available sources.
+[Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4) was pre-trained on 4 trillion tokens of data from publicly available sources.
 The pretraining data has a cutoff of September 2023.
 More information is available in our [preprint](http://arxiv.org/abs/2410.08800).
 
@@ -177,7 +177,7 @@ More information is available in our [preprint](http://arxiv.org/abs/2410.08800
 ### Training Procedure
 
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-Instruction fined tuned version of Teuken-7B-base-v0.4.
+Instruction fine-tuned version of [Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4).
 
 
 #### Training Hyperparameters
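The `@@ -135,7 +135,7 @@` hunk header quotes the card's "How to Get Started with the Model" example, which loads the model and tokenizer, prepares input, and generates text. That example is not part of this diff, so the sketch below only illustrates what that flow typically looks like with the standard `transformers` API. The repository id, the `User` role name, and the `"EN"` chat-template key are assumptions for illustration, not values confirmed by this commit.

```python
# Minimal sketch of the usage flow the hunk header refers to.
# The repository id, role name, and "EN" chat-template key are assumptions;
# check the model card on Hugging Face for the exact, supported values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openGPT-X/Teuken-7B-instruct-v0.4"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    use_fast=False,
    trust_remote_code=True,  # typically needed if the checkpoint ships custom code
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).eval()

# Prepare a chat-style prompt and generate a completion.
messages = [{"role": "User", "content": "What is the OpenGPT-X project?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    chat_template="EN",  # assumed per-language template name
    add_generation_prompt=True,
    return_tensors="pt",
)
output = model.generate(input_ids, max_new_tokens=256, do_sample=True, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```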