Update README.md
README.md
CHANGED
@@ -17,6 +17,8 @@ model-index:
 
 # Suzume
 
+[[Paper](https://arxiv.org/abs/2405.12612)] [[Dataset](https://huggingface.co/datasets/lightblue/tagengo-gpt4)]
+
 This Suzume 8B, a Japanese finetune of Llama 3.
 
 Llama 3 has exhibited excellent performance on many English language benchmarks.
@@ -157,3 +159,22 @@ The following hyperparameters were used during training:
 - Pytorch 2.2.1+cu121
 - Datasets 2.18.0
 - Tokenizers 0.15.0
+
+# How to cite
+
+Please cite [this paper](https://arxiv.org/abs/2405.12612) when referencing this model.
+
+```tex
+@misc{devine2024tagengo,
+title={Tagengo: A Multilingual Chat Dataset},
+author={Peter Devine},
+year={2024},
+eprint={2405.12612},
+archivePrefix={arXiv},
+primaryClass={cs.CL}
+}
+```
+
+# Developer
+
+Peter Devine - ([ptrdvn](https://huggingface.co/ptrdvn))