simbolo-ai/Myanmarsar-GPT


Simbolo committed on Jan 7, 2024

Commit 6341760 · 1 Parent(s): 38eaf08

Update README.md

Files changed (1)

README.md +5 -2


@@ -23,13 +23,16 @@ input_text = ""
 input_ids = tokenizer.encode(input_text, return_tensors='pt')
 output = model.generate(input_ids, max_length=100)
 print(tokenizer.decode(output[0], skip_special_tokens=True))
 ```
 ### Limitations and bias
 We have yet to thoroughly investigate the potential bias inherent in this model. Regarding transparency, it's important to note that the model is primarily trained on data from the Unicode Burmese(Myanmar) language.

 input_ids = tokenizer.encode(input_text, return_tensors='pt')
 output = model.generate(input_ids, max_length=100)
 print(tokenizer.decode(output[0], skip_special_tokens=True))
 ```
+### Data
+The data utilized comprises 1 million sentences sourced from Wikipedia.
 ### Limitations and bias
 We have yet to thoroughly investigate the potential bias inherent in this model. Regarding transparency, it's important to note that the model is primarily trained on data from the Unicode Burmese(Myanmar) language.
+### References and Citations
+Jiang, Shengyi & Huang, Xiuwen & Cai, Xiaonan & Lin, Nankai. (2021). Pre-trained Models and Evaluation Data for the Myanmar Language. 10.1007/978-3-030-92310-5_52.
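
The hunk shows only the tail of the README's usage snippet (encode, generate, decode); the lines that load the tokenizer and model fall outside the diff. A minimal sketch of the whole flow follows. It assumes the checkpoint loads through the generic `AutoTokenizer` and `AutoModelForCausalLM` classes under the repo id `simbolo-ai/Myanmarsar-GPT`; the helper names `generate_text` and `load_and_generate` are hypothetical and not part of the README.

```python
def generate_text(tokenizer, model, prompt, max_length=100):
    """Encode a prompt, generate a continuation, and decode it to a string.

    Mirrors the three lines visible in the diff; `tokenizer` and `model`
    are passed in so the steps can be exercised independently of loading.
    """
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output = model.generate(input_ids, max_length=max_length)
    return tokenizer.decode(output[0], skip_special_tokens=True)


def load_and_generate(prompt, repo_id="simbolo-ai/Myanmarsar-GPT", max_length=100):
    """Load the checkpoint from the Hub and generate text for `prompt`.

    Assumption: the model is a causal LM that the Auto classes can load.
    The import is local so the helper above stays usable without
    transformers installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return generate_text(tokenizer, model, prompt, max_length=max_length)
```

Calling `print(load_and_generate("မြန်မာ"))` would download the weights and print a continuation; the prompt here is only a placeholder. Note that `max_length` counts the prompt tokens as well as the generated ones.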