---
language:
- en
- hi
- multilingual
tags:
- generated_from_trainer
license: cc-by-sa-4.0
---
# muril-en-hi-codemixed
muril-en-hi-codemixed is a masked language model based on the [MuRIL](https://huggingface.co/google/muril-base-cased) multilingual model.
It replaces the tokenizer, the vocabulary, and the embedding layer of the MuRIL model.
The tokenizer and vocabulary are the same as in the [roberta-en-hi-codemixed](https://huggingface.co/cjvt/roberta-en-hi-codemixed) model.
The new embedding weights were initialized from the MuRIL embeddings.
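
This card does not say how the new embedding weights were derived from the MuRIL embeddings. The sketch below illustrates one common scheme, purely as an assumption: a token present in both vocabularies copies its old vector, and any other token takes the mean of the old vectors of its subword pieces. The function names, the `old_tokenize` callback, and the fallback to the global mean are all hypothetical and are not taken from the authors' code.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of a non-empty sequence of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def init_new_embeddings(
    new_vocab: Sequence[str],
    old_vocab: Dict[str, int],
    old_embeddings: Sequence[Vector],
    old_tokenize: Callable[[str], List[str]],
) -> List[Vector]:
    """Build one embedding row per new-vocabulary token (hypothetical scheme).

    ASSUMPTION: this is an illustrative initialization, not the documented
    procedure used for muril-en-hi-codemixed.
    """
    # Fallback for tokens with no overlap at all: mean of every old vector.
    fallback = mean_vector(old_embeddings)
    new_embeddings: List[Vector] = []
    for token in new_vocab:
        if token in old_vocab:
            # Shared token: copy the old embedding row unchanged.
            new_embeddings.append(list(old_embeddings[old_vocab[token]]))
            continue
        # Otherwise split the token with the old tokenizer and average
        # the rows of the pieces the old vocabulary knows.
        pieces = [p for p in old_tokenize(token) if p in old_vocab]
        if pieces:
            rows = [old_embeddings[old_vocab[p]] for p in pieces]
            new_embeddings.append(mean_vector(rows))
        else:
            new_embeddings.append(list(fallback))
    return new_embeddings
```

With real models, the resulting rows would be loaded into the embedding matrix after resizing it to the new vocabulary size; that step is omitted here.
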
The resulting muril-en-hi-codemixed model was further pre-trained for two epochs on the same code-mixed English and Hindi corpora
as the [roberta-en-hi-codemixed](https://huggingface.co/cjvt/roberta-en-hi-codemixed) model.
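
Because this is a masked language model, a natural way to try it is the `fill-mask` pipeline from the `transformers` library. The following is a minimal sketch rather than an official usage recipe. The repository id `cjvt/muril-en-hi-codemixed` is an assumption, inferred from the location of roberta-en-hi-codemixed, and the helpers `mask_word` and `predict_masked` are hypothetical names. The mask token is read from the tokenizer instead of being hard-coded, since this card does not state it.

```python
from typing import List, Tuple

# ASSUMPTION: repository id guessed by analogy with cjvt/roberta-en-hi-codemixed.
MODEL_ID = "cjvt/muril-en-hi-codemixed"


def mask_word(sentence: str, target: str, mask_token: str) -> str:
    """Replace the first occurrence of `target` in `sentence` with `mask_token`."""
    if target not in sentence:
        raise ValueError(f"{target!r} does not occur in the sentence")
    return sentence.replace(target, mask_token, 1)


def predict_masked(
    sentence: str, target: str, model_id: str = MODEL_ID, top_k: int = 5
) -> List[Tuple[str, float]]:
    """Mask `target` and return the model's top candidate tokens with scores."""
    # Imported lazily so that mask_word stays usable without transformers.
    from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)

    # Use the tokenizer's own mask token rather than guessing its spelling.
    masked = mask_word(sentence, target, tokenizer.mask_token)
    predictions = fill_mask(masked, top_k=top_k)
    return [(p["token_str"], p["score"]) for p in predictions]
```

Calling `predict_masked("yeh movie bahut acchi thi", "acchi")` downloads the model on first use and returns up to five candidate tokens with their scores; the sentence is only an illustrative code-mixed input.
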