Update README.md
README.md
CHANGED
@@ -58,7 +58,7 @@ To train models on the corpus, we first employ the conventional 80-10-10 MLM obj
 with [MASK] 80% of the time, with Random subwords (from the entire vocabulary) 10% of the time, and leave the remaining 10% unchanged (Same).
 
 To integrate entity-level cross-lingual knowledge into the model, we propose Entity Prediction objectives, where we only mask subwords belonging
-to an entity. By predicting the masked entities in
+to an entity. By predicting the masked entities in EntityCS sentences, we expect the model to capture the semantics of the same entity in different
 languages.
 Two different masking strategies are proposed for predicting entities: Whole Entity Prediction (`WEP`) and Partial Entity Prediction (`PEP`).
 
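For illustration, the conventional 80-10-10 rule referenced above can be written in a few lines of Python. This is a minimal sketch, not code from the EntityCS repository; `MASK_ID`, `VOCAB_SIZE` and the 15% selection rate are placeholder assumptions.

```python
import random

# Illustrative 80-10-10 MLM masking; the constants below are placeholder
# assumptions, not values taken from the EntityCS code base.
MASK_ID = 4
VOCAB_SIZE = 250_002          # e.g. an XLM-R-sized vocabulary
SELECT_PROB = 0.15            # fraction of subwords chosen for prediction
IGNORE_INDEX = -100           # label value ignored by the loss

def mask_tokens(token_ids):
    """Return (inputs, labels) where selected positions follow the 80/10/10 rule."""
    inputs, labels = [], []
    for tok in token_ids:
        if random.random() < SELECT_PROB:
            labels.append(tok)                              # position is predicted
            r = random.random()
            if r < 0.8:
                inputs.append(MASK_ID)                      # 80%: [MASK]
            elif r < 0.9:
                inputs.append(random.randrange(VOCAB_SIZE)) # 10%: Random subword
            else:
                inputs.append(tok)                          # 10%: Same (unchanged)
        else:
            inputs.append(tok)
            labels.append(IGNORE_INDEX)                     # not predicted
    return inputs, labels
```

The Entity Prediction objectives in the next hunk restrict this selection to subwords that belong to entities.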
@@ -78,8 +78,7 @@ setting, PEP<sub>MS</sub>, we remove the 10% Random subwords substitution, i.e.
 subwords and 10% Same subwords from the masking candidates. In the third setting, PEP<sub>M</sub>, we
 further remove the 10% Same subwords prediction, essentially predicting only the masked subwords.
 
-Prior work has proven it is effective to combine
-Entity Prediction with MLM for cross-lingual transfer ([Jiang et al., 2020](https://aclanthology.org/2020.emnlp-main.479/)), therefore we investigate the
+Prior work has proven it is effective to combine Entity Prediction with MLM for cross-lingual transfer ([Jiang et al., 2020](https://aclanthology.org/2020.emnlp-main.479/)), therefore we investigate the
 combination of the Entity Prediction objectives together with MLM on non-entity subwords. Specifically, when combined with MLM, we lower the
 entity masking probability (p) to 50% to roughly keep the same overall masking percentage.
 This results into the following objectives: WEP + MLM, PEP<sub>MRS</sub> + MLM, PEP<sub>MS</sub> + MLM, PEP<sub>M</sub> + MLM
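The entity-level objectives can be sketched in the same style. The snippet below only illustrates restricting masking to entity subwords; the `(start, end)` span format, the `inner_rate` used for the PEP-like branch and the default `p=0.5` are assumptions made for this example, not the repository's implementation.

```python
import random

# Illustrative entity-level masking in the spirit of WEP / PEP_M described above.
# Span format, MASK_ID, inner_rate and p are assumptions for this sketch.
MASK_ID = 4
IGNORE_INDEX = -100   # label value ignored by the loss

def entity_mask(token_ids, entity_spans, p=0.5, whole_entity=True, inner_rate=0.8):
    """Mask only subwords that belong to entities.

    whole_entity=True  -> WEP-like: every subword of a selected entity is masked.
    whole_entity=False -> PEP_M-like: roughly `inner_rate` of its subwords are masked.
    p is the entity selection probability (lowered to 0.5 when combined with MLM).
    """
    inputs = list(token_ids)
    labels = [IGNORE_INDEX] * len(token_ids)
    for start, end in entity_spans:           # end index is exclusive
        if random.random() >= p:
            continue                           # entity not selected this time
        for i in range(start, end):
            if whole_entity or random.random() < inner_rate:
                labels[i] = token_ids[i]       # predict the original subword
                inputs[i] = MASK_ID            # replace with [MASK]; no Random/Same
    return inputs, labels
```

For example, `entity_mask([5, 6, 7, 8, 9], [(1, 4)])` never touches positions 0 and 4, since only the entity span's subwords are masking candidates.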
@@ -112,7 +111,7 @@ For results on each downstream task, please refer to the [paper](https://aclanth
 
 ## How to Get Started with the Model
 
-Use the code below to get started with
+Use the code below to get started with training: https://github.com/huawei-noah/noah-research/tree/master/NLP/EntityCS
 
 ## Citation
 