SaiedAlshahrani committed · Commit 32f8988 · Parent(s): 995dd4e

Update README.md

README.md CHANGED
@@ -29,7 +29,6 @@ It achieves the following results on the evaluation set:

- Pseudo-Perplexity: 115.80
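For context, pseudo-perplexity for a masked language model is conventionally computed (following Salazar et al., 2020) by masking each token in turn, scoring it with the model, and exponentiating the negative mean pseudo-log-likelihood over the evaluation text; assuming that convention is what is reported here:

$$\mathrm{PPPL}(W) = \exp\!\left(-\frac{1}{|W|}\sum_{t=1}^{|W|}\log P_{\mathrm{MLM}}\!\left(w_t \mid W_{\setminus t}\right)\right)$$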
## Model description

We trained this Egyptian Arabic Wikipedia Masked Language Model (arzRoBERTa<sub>BASE</sub>) to evaluate its performance on the Fill-Mask evaluation task with the Masked Arab States Dataset ([MASD](https://huggingface.co/datasets/SaiedAlshahrani/MASD)) and to measure the *impact* of **template-based translation** on the Egyptian Arabic Wikipedia edition.
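As a quick, hedged illustration of the Fill-Mask setup (not code from the paper), the model can be queried with the Transformers fill-mask pipeline; the model ID below is a placeholder rather than this repository's confirmed name, and the example sentence is ours:

```python
# Minimal fill-mask sketch; "SaiedAlshahrani/arzRoBERTa" is a placeholder model ID,
# not necessarily this repository's exact name.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="SaiedAlshahrani/arzRoBERTa")

# Ask the model to fill the masked token ("The capital of Egypt is <mask>.").
predictions = fill_mask("عاصمة مصر هي <mask>.")
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```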
@@ -52,22 +51,18 @@ For more details about the experiment, please **read** and **cite** our paper:

## Intended uses & limitations

We do **not** recommend using this model because it was trained *only* on Egyptian Arabic Wikipedia articles, which are known for template-based translation from English that produces limited, shallow, and unrepresentative articles, <u>unless</u> you fine-tune the model on a large, organic, and representative Egyptian dataset.
## Training and evaluation data

We trained this model on the Egyptian Arabic Wikipedia articles ([SaiedAlshahrani/Egyptian_Arabic_Wikipedia_20230101](https://huggingface.co/datasets/SaiedAlshahrani/Egyptian_Arabic_Wikipedia_20230101)) without using any validation or evaluation data (only training data) due to a lack of computational power.
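For reference, the training corpus linked above can be loaded with the `datasets` library; a minimal sketch that only inspects the dataset, since its split and column names are not described in this card:

```python
# Load the Egyptian Arabic Wikipedia dump used for training and inspect its structure.
from datasets import load_dataset

wiki = load_dataset("SaiedAlshahrani/Egyptian_Arabic_Wikipedia_20230101")
print(wiki)  # shows the available splits, columns, and row counts
```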
## Training procedure

We trained this model on the Paperspace GPU cloud service, using a machine with 8 CPUs, 45 GB of RAM, and an A6000 GPU with 48 GB of GPU memory.
### Training hyperparameters

The following hyperparameters were used during training:
@@ -79,7 +74,6 @@ The following hyperparameters were used during training:
- lr_scheduler_type: linear
- num_epochs: 5
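As a rough sketch of how the listed hyperparameters map onto `transformers.TrainingArguments` (several hyperparameters are elided from this diff, and every value below that is not listed above is a placeholder):

```python
# Hedged sketch only: lr_scheduler_type and num_train_epochs come from this card;
# everything else is a placeholder, not the setting actually used.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="arzRoBERTa-base",  # placeholder output directory
    lr_scheduler_type="linear",    # from this card
    num_train_epochs=5,            # from this card
)
```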
### Training results

| Epoch | Step | Training Loss |
@@ -94,14 +88,12 @@ The following hyperparameters were used during training:
|:--------------:|:------------------------:|:----------------------:|:-------------------------:|:----------:|:--------:|
| 14677.117400 | 248.119000 | 0.970000 | 120746231839334400.000000 | 0.908513 | 5.000000 |
### Evaluation results

This arzRoBERTa<sub>BASE</sub> model has been evaluated on the Masked Arab States Dataset ([SaiedAlshahrani/MASD](https://huggingface.co/datasets/SaiedAlshahrani/MASD)).

| K=10  | K=50   | K=100 |
|:-----:|:------:|:-----:|
| 8.12% | 25.62% | 35%   |
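As an illustration of how a K=10/50/100 score like the above could be obtained (an assumption about the protocol, not a description of the paper's exact code; the model ID and MASD prompt handling are placeholders):

```python
# Hedged sketch of a top-K fill-mask check: count a prediction as correct if the
# gold token appears among the model's top-K candidates for the masked slot.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="SaiedAlshahrani/arzRoBERTa")  # placeholder model ID

def hit_at_k(masked_sentence: str, gold_token: str, k: int) -> bool:
    predictions = fill_mask(masked_sentence, top_k=k)
    return gold_token in [p["token_str"].strip() for p in predictions]
```

Averaging such checks over the MASD prompts for K=10, 50, and 100 would give percentages of the kind shown above, assuming that is the intended metric.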
### Framework versions

- Datasets 2.9.0