---
inference: false
language:
- ja
- en
- de
- is
- zh
- cs
---
# webbigdata/ALMA-7B-Ja

ALMA-7B-Ja (13.3GB) is a machine translation model that uses ALMA's training method to translate Japanese to English.  
The [original ALMA-7B (26.95GB)](https://huggingface.co./haoranxu/ALMA-7B) supports translation between English and Russian (ru). This model supports translation between Japanese (ja) and English instead of Russian.

Like the original model, this model has been verified to translate between the following language pairs as well. If you mainly need these pairs, however, the original [ALMA-13B model](https://huggingface.co./haoranxu/ALMA-13B) is the better choice.  

- German (de) and English (en)
- Chinese (zh) and English (en)
- Icelandic (is) and English (en)
- Czech (cs) and English (en)

Translating from English (en→xx) BLEU/COMET

| Models | de | cs | is | zh | ru/jp | Avg. |
|--------|----|----|----|----|-------|------|
| NLLB-54B | 34.50/86.45 | 37.60/90.15 | 24.15/81.76 | 27.38/78.91 | 30.96/87.92 | 30.92/85.04 |
| GPT-3.5-D | 31.80/85.61 | 31.30/88.57 | 15.90/76.28 | 38.30/85.76 | 27.50/86.74 | 28.96/84.59 |
| ALMA-7B(Original) | 30.31/85.59 | 29.88/89.10 | 25.71/85.52 | 36.87/85.11 | 27.13/86.98 | 29.89/86.49 |
| ALMA-7B-Ja(Ours) | 23.70/82.04 | 18.58/81.36 | 12.20/71.59 | 29.06/82.45 | 14.82/85.40 | 19.67/80.57 |

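Each cell in the table holds a BLEU/COMET pair, and the Avg. column is the plain mean over the five language columns. The following is a minimal sketch of that arithmetic, checked against the ALMA-7B-Ja row of the en→xx table; the function names are illustrative and are not part of any ALMA tooling.

```python
def parse_cell(cell: str) -> tuple[float, float]:
    """Split a 'BLEU/COMET' cell such as '23.70/82.04' into two floats."""
    bleu, comet = cell.split("/")
    return float(bleu), float(comet)


def average_scores(cells: list[str]) -> tuple[float, float]:
    """Return mean BLEU and mean COMET over the cells, rounded to 2 places."""
    pairs = [parse_cell(c) for c in cells]
    n = len(pairs)
    mean_bleu = sum(b for b, _ in pairs) / n
    mean_comet = sum(c for _, c in pairs) / n
    return round(mean_bleu, 2), round(mean_comet, 2)


# ALMA-7B-Ja row of the en→xx table, in column order: de, cs, is, zh, ja
alma_7b_ja_en_to_xx = [
    "23.70/82.04", "18.58/81.36", "12.20/71.59", "29.06/82.45", "14.82/85.40",
]
print(average_scores(alma_7b_ja_en_to_xx))  # (19.67, 80.57)
```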
Translating to English (xx→en) BLEU/COMET

| Models | de | cs | is | zh | ru/jp | Avg. |
|--------|----|----|----|----|-------|------|
| NLLB-54B | 26.89/78.94 | 39.11/80.13 | 23.09/71.66 | 16.56/70.70 | 39.11/81.88 | 28.95/76.66 |
| GPT-3.5-D | 30.90/84.79 | 44.50/86.16 | 31.90/82.13 | 25.00/81.62 | 38.50/84.80 | 34.16/83.90 |
| ALMA-7B(Original) | 30.26/84.00 | 43.91/85.86 | 35.97/86.03 | 23.75/79.85 | 39.37/84.58 | 34.55/84.02 |
| ALMA-7B-Ja(Ours) | 26.41/83.13 | 34.39/83.50 | 24.77/81.12 | 20.60/78.54 | 15.57/78.61 | 24.35/81.76 |

[Sample Code For Free Colab](https://github.com/webbigdata-jp/python_sample/blob/main/ALMA_7B_Ja_Free_Colab_sample.ipynb)  

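For readers who prefer a local script over the notebook, here is a minimal sketch of how the model might be called with the `transformers` library. It is not taken from the Colab sample: the prompt template is assumed from the upstream ALMA project, the generation settings (`num_beams=5`, `max_new_tokens=100`) are illustrative, and `device_map="auto"` assumes the `accelerate` package is installed. The helper names `build_prompt`, `load_model`, and `translate` are hypothetical.

```python
def build_prompt(text: str, src: str = "Japanese", tgt: str = "English") -> str:
    """Build an ALMA-style prompt (template assumed from the upstream ALMA project)."""
    return f"Translate this from {src} to {tgt}:\n{src}: {text}\n{tgt}:"


def load_model(model_id: str = "webbigdata/ALMA-7B-Ja"):
    """Load tokenizer and model in fp16. Downloads the full weights on first use."""
    # Heavy imports stay local so build_prompt works without them installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    return tokenizer, model


def translate(text: str, tokenizer, model,
              src: str = "Japanese", tgt: str = "English") -> str:
    """Translate one sentence and return only the newly generated text."""
    import torch

    prompt = build_prompt(text, src, tgt)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, num_beams=5, max_new_tokens=100, do_sample=False
        )
    # Drop the prompt tokens so that only the translation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


if __name__ == "__main__":
    # Printing the prompt needs no GPU and no model download.
    print(build_prompt("今日はいい天気ですね。"))
```

To run an actual translation, call `tokenizer, model = load_model()` once and then `translate("今日はいい天気ですね。", tokenizer, model)`. Loading once and reusing the objects avoids reloading the weights for every sentence.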


## Other Versions




### webbigdata-ALMA-7B-Ja-gguf

mmnga made a llama.cpp (gguf) version: [webbigdata-ALMA-7B-Ja-gguf](https://huggingface.co./mmnga/webbigdata-ALMA-7B-Ja-gguf). Thank you!  
llama.cpp is a tool used primarily on Macs, and gguf is its latest model file format. It can run without a GPU.  


### ALMA-7B-Ja-GPTQ-Ja-En
GPTQ is a quantization method that reduces the size of a model. ALMA-7B-Ja-GPTQ-Ja-En is a GPTQ-quantized version with a smaller model size (3.9GB) and lower memory usage.  
However, translation quality is probably lower, and translation ability for languages other than Japanese and English has deteriorated significantly.  

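The model sizes quoted in this card follow roughly from parameter count times bits per weight. The sketch below is a back-of-the-envelope estimate only. It assumes the LLaMA-2-7B parameter count for the base model and 4-bit weights for the GPTQ version, neither of which is stated in this card, and it ignores quantization metadata and any layers kept in higher precision, so real files come out somewhat different.

```python
def estimate_size_gb(num_params: int, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10**9 bytes), ignoring file overhead."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: parameter count of the LLaMA-2-7B base model.
PARAMS_7B = 6_738_415_616

for label, bits in [("fp32", 32), ("fp16", 16), ("4-bit", 4)]:
    print(f"{label}: {estimate_size_gb(PARAMS_7B, bits):.2f} GB")
# fp32: 26.95 GB
# fp16: 13.48 GB
# 4-bit: 3.37 GB
```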
[Sample Code For Free Colab webbigdata/ALMA-7B-Ja-GPTQ-Ja-En](https://huggingface.co./webbigdata/ALMA-7B-Ja-GPTQ-Ja-En)  

If you want to translate an entire file at once, try the Colab notebook below.  
[ALMA_7B_Ja_GPTQ_Ja_En_batch_translation_sample](https://github.com/webbigdata-jp/python_sample/blob/main/ALMA_7B_Ja_GPTQ_Ja_En_batch_translation_sample.ipynb)




**ALMA** (**A**dvanced **L**anguage **M**odel-based tr**A**nslator) is an LLM-based translation model, which adopts a new translation model paradigm: it begins with fine-tuning on monolingual data and is further optimized using high-quality parallel data. This two-step fine-tuning process ensures strong translation performance. 
Please find more details in their [paper](https://arxiv.org/abs/2309.11674).
```
@misc{xu2023paradigm,
      title={A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models}, 
      author={Haoran Xu and Young Jin Kim and Amr Sharaf and Hany Hassan Awadalla},
      year={2023},
      eprint={2309.11674},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```


## About this work
- **This work was done by:** [webbigdata](https://webbigdata.jp/).