Update README.md
README.md
CHANGED
@@ -17,6 +17,8 @@ model-index:
 
 # Suzume
 
+[[Paper](https://arxiv.org/abs/2405.12612)] [[Dataset](https://huggingface.co/datasets/lightblue/tagengo-gpt4)]
+
 This Suzume 8B, a Japanese finetune of Llama 3.
 
 Llama 3 has exhibited excellent performance on many English language benchmarks.
@@ -186,3 +188,22 @@ The following hyperparameters were used during training:
 - Pytorch 2.2.1+cu121
 - Datasets 2.18.0
 - Tokenizers 0.15.0
+
+# How to cite
+
+Please cite [this paper](https://arxiv.org/abs/2405.12612) when referencing this model.
+
+```tex
+@misc{devine2024tagengo,
+      title={Tagengo: A Multilingual Chat Dataset},
+      author={Peter Devine},
+      year={2024},
+      eprint={2405.12612},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
+
+# Developer
+
+Peter Devine - ([ptrdvn](https://huggingface.co/ptrdvn))