jq committed
Commit 94db1f6 · verified · 1 Parent(s): 2c23ded

Update README.md

Files changed (1)
  1. README.md +40 -71
README.md CHANGED
@@ -1,78 +1,47 @@
  ---
- tags:
- - generated_from_trainer
- base_model: jq/nllb-1.3B-many-to-many-step-2k
- datasets:
- - generator
  model-index:
- - name: nllb-1.3B-many-to-many-pronouncorrection-charaug
  results: []
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
-
- # nllb-1.3B-many-to-many-pronouncorrection-charaug
-
- This model is a fine-tuned version of [jq/nllb-1.3B-many-to-many-step-2k](https://huggingface.co/jq/nllb-1.3B-many-to-many-step-2k) on the generator dataset.
- It achieves the following results on the evaluation set:
- - Loss: 1.2075
- - Bleu Ach Eng: 28.371
- - Bleu Lgg Eng: 30.45
- - Bleu Lug Eng: 41.978
- - Bleu Nyn Eng: 32.296
- - Bleu Teo Eng: 30.422
- - Bleu Eng Ach: 20.972
- - Bleu Eng Lgg: 22.362
- - Bleu Eng Lug: 30.359
- - Bleu Eng Nyn: 15.305
- - Bleu Eng Teo: 21.391
- - Bleu Mean: 27.391
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0003
- - train_batch_size: 25
- - eval_batch_size: 25
- - seed: 42
- - gradient_accumulation_steps: 120
- - total_train_batch_size: 3000
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - training_steps: 1500
- - mixed_precision_training: Native AMP
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss | Bleu Ach Eng | Bleu Lgg Eng | Bleu Lug Eng | Bleu Nyn Eng | Bleu Teo Eng | Bleu Eng Ach | Bleu Eng Lgg | Bleu Eng Lug | Bleu Eng Nyn | Bleu Eng Teo | Bleu Mean |
- |:-------------:|:------:|:----:|:---------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:---------:|
- | No log | 0.0667 | 100 | 1.1541 | 29.033 | 31.47 | 41.596 | 34.169 | 32.442 | 19.677 | 19.657 | 27.889 | 14.554 | 19.143 | 26.963 |
- | No log | 1.0301 | 200 | 1.1570 | 27.473 | 31.853 | 41.934 | 32.575 | 31.606 | 20.25 | 20.634 | 28.592 | 13.672 | 19.997 | 26.859 |
- | No log | 1.0968 | 300 | 1.1288 | 29.086 | 33.257 | 43.387 | 33.678 | 33.579 | 20.377 | 20.91 | 28.906 | 14.992 | 21.013 | 27.919 |
- | No log | 2.0603 | 400 | 1.1620 | 28.122 | 31.46 | 42.491 | 33.304 | 32.331 | 20.282 | 21.604 | 29.577 | 14.961 | 20.94 | 27.507 |
- | 0.7273 | 3.0237 | 500 | 1.1661 | 28.311 | 32.122 | 42.825 | 32.333 | 32.415 | 19.799 | 22.287 | 29.558 | 15.708 | 21.948 | 27.731 |
- | 0.7273 | 3.0904 | 600 | 1.1652 | 28.593 | 30.62 | 41.964 | 33.383 | 32.08 | 21.142 | 21.8 | 30.215 | 14.717 | 21.744 | 27.626 |
- | 0.7273 | 4.0538 | 700 | 1.2075 | 28.371 | 30.45 | 41.978 | 32.296 | 30.422 | 20.972 | 22.362 | 30.359 | 15.305 | 21.391 | 27.391 |
-
-
- ### Framework versions
-
- - Transformers 4.40.1
- - Pytorch 2.2.0
- - Datasets 2.19.0
- - Tokenizers 0.19.1

  ---
+ base_model: facebook/nllb-200-1.3B
  model-index:
+ - name: translate-nllb-1.3b-salt
  results: []
+ datasets:
+ - Sunbird/salt
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
+ # Model details
+
+ This machine translation model translates single sentences to and from any of the following languages:
+
+ | ISO 639-3 | Language name |
+ | --- | --- |
+ | eng | English |
+ | ach | Acholi |
+ | lgg | Lugbara |
+ | lug | Luganda |
+ | nyn | Runyankole |
+ | teo | Ateso |
+
+ It was trained on the [SALT](http://huggingface.co/datasets/Sunbird/salt) dataset and a variety of
+ additional external data resources, including back-translated news articles, FLORES-200, MT560 and LAFAND-MT.
+ The base model was [facebook/nllb-200-1.3B](https://huggingface.co/facebook/nllb-200-1.3B),
+ with tokens adapted to add support for languages not originally included.
+
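+ The card does not document the exact adaptation procedure; a minimal sketch of one common
+ approach with `transformers` (assuming plain ISO 639-3 codes are added as new special
+ tokens for the languages NLLB-200 lacks, which should be verified against this
+ repository's tokenizer) is:
+
+ ```python
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+ base = "facebook/nllb-200-1.3B"
+ tokenizer = AutoTokenizer.from_pretrained(base)
+ model = AutoModelForSeq2SeqLM.from_pretrained(base)
+
+ # Hypothetical language tokens for the SALT languages that NLLB-200
+ # does not already cover; the actual tokens used here may differ.
+ new_tokens = ["ach", "lgg", "teo"]
+ tokenizer.add_tokens(new_tokens, special_tokens=True)
+
+ # Grow the embedding matrix so the new tokens get trainable vectors.
+ model.resize_token_embeddings(len(tokenizer))
+ ```
+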
+ # Usage
+
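+ A minimal inference sketch with `transformers`, assuming the NLLB-style interface of the
+ base model and the language codes from the table above (the model id below is a
+ placeholder for this repository's full id; verify both before use):
+
+ ```python
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+ model_id = "translate-nllb-1.3b-salt"  # placeholder: use this repo's full id
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
+
+ # NLLB-style translation: tag the source language on the input, then
+ # force the decoder to begin with the target-language token.
+ tokenizer.src_lang = "lug"  # assumed source-language token
+ inputs = tokenizer("Nsanyuse okukulaba.", return_tensors="pt")
+ out = model.generate(
+     **inputs,
+     forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng"),
+     max_new_tokens=100,
+ )
+ print(tokenizer.batch_decode(out, skip_special_tokens=True)[0])
+ ```
+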
+ # Evaluation metrics
+
+ Results on salt-dev:
+
+ | Source language | Target language | BLEU |
+ | --- | --- | --- |
+ | ach | eng | 28.371 |
+ | lgg | eng | 30.45 |
+ | lug | eng | 41.978 |
+ | nyn | eng | 32.296 |
+ | teo | eng | 30.422 |
+ | eng | ach | 20.972 |
+ | eng | lgg | 22.362 |
+ | eng | lug | 30.359 |
+ | eng | nyn | 15.305 |
+ | eng | teo | 21.391 |
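+
+ A toy sketch of how scores like these are computed (assuming `sacrebleu` with default
+ settings, which is typical for BLEU reporting; the exact evaluation setup for salt-dev
+ is not documented in this card):
+
+ ```python
+ import sacrebleu
+
+ # Corpus-level BLEU over detokenized model outputs and references.
+ hypotheses = ["I am happy to see you."]   # model translations
+ references = [["I am glad to see you."]]  # one reference stream
+ print(sacrebleu.corpus_bleu(hypotheses, references).score)
+ ```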