# Bengali GPT-2

Bengali GPT-2 demo, built as part of the [Hugging Face JAX/Flax community week](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/). It also features a model [fine-tuned](https://huggingface.co/khalidsaifullaah/bengali-lyricist-gpt2) on Bengali song lyrics.

# Model Description

The OpenAI GPT-2 model was proposed in the paper [Language Models are Unsupervised Multitask Learners](https://paperswithcode.com/paper/language-models-are-unsupervised-multitask). The original GPT-2 is a causal (unidirectional) transformer pretrained with a language-modeling objective on a very large corpus (~40 GB of text). This model has the same configuration but has been pretrained on the Bengali portion of the mC4 (multilingual C4) dataset. The Flax model was trained using the code provided by the Hugging Face team [here](https://github.com/huggingface/transformers/tree/master/examples/flax/language-modeling).
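
Since the checkpoint was trained in Flax, it can also be loaded directly with the standard Flax GPT-2 classes in `transformers`. A minimal sketch (the Bengali prompt is only an illustrative example, not from the original docs):

```python
# Minimal sketch: load the Flax checkpoint and run a forward pass.
# The prompt below is an illustrative example.
from transformers import AutoTokenizer, FlaxGPT2LMHeadModel

tokenizer = AutoTokenizer.from_pretrained("flax-community/gpt2-bengali")
model = FlaxGPT2LMHeadModel.from_pretrained("flax-community/gpt2-bengali")

inputs = tokenizer("আমি বাংলায় গান গাই", return_tensors="np")
logits = model(**inputs).logits  # next-token scores: (batch, seq_len, vocab_size)
print(logits.shape)
```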

# Training Details

Overall result:

```Eval loss: 1.45, Eval perplexity: 3.141```

* Data: [mC4-bn](https://huggingface.co/datasets/mc4)
* Training steps: 250k
* Model: 🤗 [flax-community/gpt2-bengali](https://huggingface.co/flax-community/gpt2-bengali)
* Demo: https://huggingface.co/spaces/flax-community/Gpt2-bengali

# Usage

There are several ways to use the model. For example, the `text-generation` pipeline can be used directly to generate sentences:

```python
from transformers import pipeline

# Load the pretrained Bengali GPT-2 model and its tokenizer
gpt2_bengali = pipeline('text-generation', model="flax-community/gpt2-bengali", tokenizer="flax-community/gpt2-bengali")
```
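
Continuing from the block above, the pipeline can then be called with a Bengali prompt; the prompt and generation settings below are illustrative, not from the original demo:

```python
# Sample two continuations of an example Bengali prompt.
# max_length and num_return_sequences are illustrative choices.
outputs = gpt2_bengali("বাংলাদেশের রাজধানী", max_length=30, do_sample=True, num_return_sequences=2)
for sample in outputs:
    print(sample["generated_text"])
```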

Similarly, the model fine-tuned on Bangla songs can be used as follows:

```python
from transformers import pipeline

# Load the GPT-2 model fine-tuned on Bengali song lyrics
singer = pipeline('text-generation', model="khalidsaifullaah/bengali-lyricist-gpt2", tokenizer="khalidsaifullaah/bengali-lyricist-gpt2")
```

For other tasks the model needs to be fine-tuned on custom datasets; details can be found in the Hugging Face [documentation](https://huggingface.co/transformers/training.html).
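
As a rough illustration of that workflow, the sketch below fine-tunes the checkpoint on a plain-text corpus with the `Trainer` API; the corpus file name and hyperparameters are assumptions for the example, not the project's actual setup:

```python
# A minimal fine-tuning sketch, not the project's actual training script.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("flax-community/gpt2-bengali")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("flax-community/gpt2-bengali")

# Load a custom corpus (one example per line; the file name is hypothetical).
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-bengali-finetuned", num_train_epochs=1),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```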

# Contributors

* Khalid Saifullah
* Tasmiah Tahsin Mayeesha
* Ritobrata Ghosh
* Ibrahim Musa
* M Saiful Bari