Update README.md
README.md
CHANGED
@@ -23,13 +23,16 @@ input_text = ""
 input_ids = tokenizer.encode(input_text, return_tensors='pt')
 output = model.generate(input_ids, max_length=100)
 print(tokenizer.decode(output[0], skip_special_tokens=True))
+
 ```
+### Data
+The data utilized comprises 1 million sentences sourced from Wikipedia.
 
 ### Limitations and bias
 We have yet to thoroughly investigate the potential bias inherent in this model. Regarding transparency, it's important to note that the model is primarily trained on data from the Unicode Burmese(Myanmar) language.
 
-
-
+### References and Citations
+Jiang, Shengyi & Huang, Xiuwen & Cai, Xiaonan & Lin, Nankai. (2021). Pre-trained Models and Evaluation Data for the Myanmar Language. 10.1007/978-3-030-92310-5_52.
 
 
 