---
language:
- en
- hi
- bn
- ta
- as
- gu
- kn
- ks
- ml
- mr
- ne
- or
- pa
- sa
- sd
- te
- ur
license: apache-2.0
---

## MuRIL - Unofficial

Multilingual Representations for Indian Languages: Google open-sourced this BERT model, pre-trained on 17 Indian languages and their transliterated counterparts.

The model was trained on a self-supervised masked language modeling task, with whole-word masking and a maximum of 80 predictions. It was trained for 1000K steps, with a batch size of 4096 and a maximum sequence length of 512.

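As a rough illustration of that masked language modeling setup at inference time, the sketch below queries the checkpoint through the Transformers fill-mask pipeline. The model id and the example sentence are placeholders, not part of the original upload.

```python
from transformers import pipeline

# Placeholder id: substitute the actual name of this repository.
MODEL_ID = "your-username/muril-base"

# MuRIL was pre-trained with whole-word masking, so it can score
# candidate tokens for a [MASK] slot directly.
fill_mask = pipeline("fill-mask", model=MODEL_ID)

for pred in fill_mask("भारत एक [MASK] देश है।"):
    print(pred["token_str"], round(pred["score"], 4))
```
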
Details: https://tfhub.dev/google/MuRIL/1

License: Apache 2.0

### About this upload

I ported the TFHub .pb model to .h5 and then to pytorch_model.bin for compatibility with Transformers.

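The sketch below shows one way the ported weights could be loaded and used to produce contextual embeddings. It is a minimal example under the assumption of a placeholder repository id; the standard Auto classes should pick up the BERT architecture from the converted checkpoint.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Placeholder id: replace with this repository's actual name.
MODEL_ID = "your-username/muril-base"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

# Encode a sentence (Hindi here) and run it through the ported
# pytorch_model.bin weights to get token-level embeddings.
inputs = tokenizer("यह एक उदाहरण वाक्य है।", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```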