Create README.md
README.md
ADDED
@@ -0,0 +1,47 @@
---
inference: false
language:
- ja
- en
- de
- is
- zh
- cs
---
# webbigdata/ALMA-7B-Ja

**ALMA** (**A**dvanced **L**anguage **M**odel-based tr**A**nslator) is an LLM-based translation model that adopts a new translation model paradigm: it begins with fine-tuning on monolingual data and is further optimized using high-quality parallel data. This two-step fine-tuning process ensures strong translation performance.
Please find more details in the ALMA [paper](https://arxiv.org/abs/2309.11674).
```
@misc{xu2023paradigm,
      title={A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models},
      author={Haoran Xu and Young Jin Kim and Amr Sharaf and Hany Hassan Awadalla},
      year={2023},
      eprint={2309.11674},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

Original ALMA model: [ALMA-7B](https://huggingface.co/haoranxu/ALMA-7B) (26.95GB)

ALMA-7B-Ja (13.3GB) is a machine translation model that uses ALMA's learning method to translate Japanese to English.

Like the original model, this model has also been verified to translate between the following language pairs. If you mainly need these pairs, however, the original model is the better choice.

- German and English
- Chinese and English
- Icelandic and English
- Czech and English

[Sample code for free Colab](https://github.com/webbigdata-jp/python_sample/blob/main/ALMA_7B_Ja_Free_Colab_sample.ipynb)
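As a rough illustration of how the model might be called outside the notebook, the sketch below builds a translation prompt and runs generation with Hugging Face `transformers`. The prompt template follows the one documented for the original ALMA models and is assumed, not confirmed by this card, to apply to ALMA-7B-Ja as well. `LANGUAGE_NAMES`, `build_prompt`, and `translate` are hypothetical names introduced for illustration, the generation settings are placeholders, and `device_map="auto"` assumes the `accelerate` package is installed.

```python
# Language codes from this card's metadata, mapped to the names used in prompts.
LANGUAGE_NAMES = {
    "ja": "Japanese",
    "en": "English",
    "de": "German",
    "is": "Icelandic",
    "zh": "Chinese",
    "cs": "Czech",
}


def build_prompt(text: str, src: str = "ja", tgt: str = "en") -> str:
    """Build an ALMA-style prompt (assumed template, taken from upstream ALMA)."""
    src_name = LANGUAGE_NAMES[src]
    tgt_name = LANGUAGE_NAMES[tgt]
    return (
        f"Translate this from {src_name} to {tgt_name}:\n"
        f"{src_name}: {text}\n"
        f"{tgt_name}:"
    )


def translate(
    text: str,
    src: str = "ja",
    tgt: str = "en",
    model_id: str = "webbigdata/ALMA-7B-Ja",
    max_new_tokens: int = 128,
) -> str:
    """Translate one sentence. Loads the model on every call, so this is a sketch only."""
    # Heavy dependencies are imported lazily so build_prompt stays usable without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # assumption: half precision fits the target GPU
        device_map="auto",          # assumption: requires the accelerate package
    )

    inputs = tokenizer(build_prompt(text, src, tgt), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=5)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True).strip()
```

Calling `translate("今日はいい天気ですね。")` would then return the English output, provided the weights can be downloaded and fit in memory. In real use the model and tokenizer would be loaded once and reused across calls.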

A GPTQ-quantized version is also available. It reduces model size (3.9GB) and memory usage, although translation performance is probably lower, and its translation ability for languages other than Japanese and English has deteriorated significantly.
[webbigdata/ALMA-7B-Ja-GPTQ-Ja-En](https://huggingface.co/webbigdata/ALMA-7B-Ja-GPTQ-Ja-En)

## About this work
- **This work was done by:** [webbigdata](https://webbigdata.jp/).