utrobinmv committed on
Commit
2e59e14
1 Parent(s): 8cfcb8f

add readme

Files changed (1)
  1. README.md +102 -0
README.md ADDED
@@ -0,0 +1,102 @@
+ ---
+ language:
+ - ru
+ - zh
+ - en
+ tags:
+ - translation
+ license: apache-2.0
+ datasets:
+ - ccmatrix
+ metrics:
+ - sacrebleu
+ widget:
+ - example_title: translate zh-ru
+   text: >
+     translate to ru: 开发的目的是为用户提供个人同步翻译。
+ - example_title: translate ru-en
+   text: >
+     translate to en: Цель разработки — предоставить пользователям личного синхронного переводчика.
+ - example_title: translate en-ru
+   text: >
+     translate to ru: The purpose of the development is to provide users with a personal synchronized interpreter.
+ - example_title: translate en-zh
+   text: >
+     translate to zh: The purpose of the development is to provide users with a personal synchronized interpreter.
+ - example_title: translate zh-en
+   text: >
+     translate to en: 开发的目的是为用户提供个人同步解释器。
+ - example_title: translate ru-zh
+   text: >
+     translate to zh: Цель разработки — предоставить пользователям личного синхронного переводчика.
+ ---
+
+ # T5 English, Russian and Chinese multilingual machine translation
+
+ This model is a conventional T5 transformer trained in multitask mode to translate into a requested target language. It is fine-tuned for machine translation in six directions: ru-zh, zh-ru, en-zh, zh-en, en-ru, ru-en.
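The six directions listed above are simply every ordered pair of the three supported languages. A small sketch that enumerates them; the names `LANGUAGES` and `PAIRS` are illustrative and not part of the model or of `transformers`:

```python
from itertools import permutations

# Language codes taken from the model card.
LANGUAGES = ["ru", "zh", "en"]

# Every ordered (source, target) pair of two distinct languages.
PAIRS = [f"{src}-{tgt}" for src, tgt in permutations(LANGUAGES, 2)]

print(len(PAIRS), PAIRS)
```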
+
+ The model translates directly between any pair of Russian, Chinese and English. To select the target language, prepend the prefix 'translate to <lang>:' to the input text. The source language does not need to be specified, and the source text may mix several languages.
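As a minimal sketch of the prefix convention described above, the helper below builds the input string for a chosen target language. The function name `build_prompt` and the `TARGET_LANGS` set are assumptions made for illustration; they are not part of the model or of `transformers`. The set only restricts the target to the three codes this model card lists.

```python
# Target-language codes listed in this model card
# (assumed here to be the only valid values for the prefix).
TARGET_LANGS = {"ru", "zh", "en"}


def build_prompt(text: str, target_lang: str) -> str:
    """Prepend the 'translate to <lang>: ' prefix to the source text."""
    if target_lang not in TARGET_LANGS:
        raise ValueError(f"unsupported target language: {target_lang!r}")
    return f"translate to {target_lang}: {text}"


print(build_prompt("Hello, world!", "ru"))
# translate to ru: Hello, world!
```

Validating the code up front is a design choice of this sketch: a mistyped code fails immediately instead of being passed to the model as an unknown prefix.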
+
+ Example: translating Russian to Chinese
+
+ ```python
+ from transformers import T5ForConditionalGeneration, T5Tokenizer
+
+ device = 'cuda'  # use 'cpu' to run the translation on the CPU
+
+ model_name = 'utrobinmv/t5_translate_en_ru_zh_large_1024_v2'
+ model = T5ForConditionalGeneration.from_pretrained(model_name)
+ model.eval()  # inference mode (disables dropout)
+ model.to(device)
+ tokenizer = T5Tokenizer.from_pretrained(model_name)
+
+ prefix = 'translate to zh: '
+ src_text = prefix + "Съешь ещё этих мягких французских булок."
+
+ # translate Russian to Chinese: tokenize the prefixed source text
+ inputs = tokenizer(src_text, return_tensors="pt")
+
+ generated_tokens = model.generate(**inputs.to(device))
+
+ result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
+ print(result)
+ # 再吃这些法国的甜蜜的面包。
+ ```
+
+
+
+ Example: translating Chinese to Russian
+
+ ```python
+ from transformers import T5ForConditionalGeneration, T5Tokenizer
+
+ device = 'cuda'  # use 'cpu' to run the translation on the CPU
+
+ model_name = 'utrobinmv/t5_translate_en_ru_zh_large_1024_v2'
+ model = T5ForConditionalGeneration.from_pretrained(model_name)
+ model.eval()  # inference mode (disables dropout)
+ model.to(device)
+ tokenizer = T5Tokenizer.from_pretrained(model_name)
+
+ prefix = 'translate to ru: '
+ src_text = prefix + "再吃这些法国的甜蜜的面包。"
+
+ # translate Chinese to Russian: tokenize the prefixed source text
+ inputs = tokenizer(src_text, return_tensors="pt")
+
+ generated_tokens = model.generate(**inputs.to(device))
+
+ result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
+ print(result)
+ # Съешьте этот сладкий хлеб из Франции.
+ ```
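The two examples above translate one sentence each. As a hedged sketch, the same steps can be wrapped in a function that translates several texts in one call. The name `translate_batch` and its parameters are illustrative, `padding=True` is needed so that inputs of different lengths fit in one tensor, and the `max_new_tokens=256` limit is an arbitrary assumption rather than a value taken from this model card.

```python
from typing import List, Sequence


def translate_batch(
    texts: Sequence[str],
    target_lang: str,
    model,
    tokenizer,
    device: str = "cpu",
    max_new_tokens: int = 256,  # assumed limit, tune for your inputs
) -> List[str]:
    """Translate several texts into target_lang ('ru', 'zh' or 'en')."""
    # Prepend the target-language prefix to every source text.
    prompts = [f"translate to {target_lang}: {text}" for text in texts]
    # Pad to the longest prompt so the batch forms a single tensor.
    inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(device)
    generated = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

With the `model`, `tokenizer` and `device` objects created as in the examples above, a call such as `translate_batch(["Hello.", "Good morning."], "ru", model, tokenizer, device)` would return one translated string per input.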
+
+
+
+
+
+
+
+ ## Languages covered
+
+ Russian (ru_RU), Chinese (zh_CN), English (en_US)