kelechi committed
Commit d39b301
1 Parent(s): 51f514b

updated model card

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -15,7 +15,7 @@ language:
 - multilingual
 
 ---
-# AfriBERTa_small
+# afriberta_small
 ## Model description
 AfriBERTa small is a pretrained multilingual language model with around 97 million parameters.
 The model has 4 layers, 6 attention heads, 768 hidden units and 3072 feed forward size.
@@ -33,13 +33,13 @@ For example, assuming we want to finetune this model on a token classification t
 >>> from transformers import AutoTokenizer, AutoModelForTokenClassification
 >>> model = AutoModelForTokenClassification.from_pretrained("castorini/afriberta_small")
 >>> tokenizer = AutoTokenizer.from_pretrained("castorini/afriberta_small")
-# we have to manually set the model max length because it is an imported sentencepiece model which hugginface does not properly support right now
+# we have to manually set the model max length because it is an imported, trained SentencePiece model, which Hugging Face does not properly support right now
 >>> tokenizer.model_max_length = 512
 ```
 
 #### Limitations and bias
-This model is possibly limited by its training dataset which are majorly obtained from news articles from a specific span of time.
-Thus, it may not generalize well.
+- This model is possibly limited by its training dataset, which consists mainly of news articles from a specific span of time. Thus, it may not generalize well.
+- This model was trained on very little data (less than 1 GB), so it may not have seen enough data to learn very complex linguistic relations.
 
 ## Training data
 The model was trained on an aggregation of datasets from the BBC news website and Common Crawl.
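
For reference, below is a minimal, self-contained sketch of the usage the updated card describes. It only restates the snippet shown in the diff above plus an illustrative forward pass; the example sentence and the `truncation` argument are assumptions added for demonstration and are not part of the model card.

```python
# Minimal, runnable sketch of the usage snippet from the updated model card.
from transformers import AutoTokenizer, AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained("castorini/afriberta_small")
tokenizer = AutoTokenizer.from_pretrained("castorini/afriberta_small")

# Set the max length manually: the tokenizer is an imported, trained
# SentencePiece model, which Hugging Face does not fully support yet
# (this is the workaround the commit documents).
tokenizer.model_max_length = 512

# Illustrative forward pass (assumed example, not from the card): the token
# classification head is randomly initialized until the model is fine-tuned.
inputs = tokenizer("Ó wá sí ilé ìwé", return_tensors="pt", truncation=True)
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch_size, sequence_length, num_labels)
```

Without the manual `model_max_length`, the tokenizer falls back to a very large default, so truncation never triggers and inputs longer than the model's position embeddings can fail at inference or training time.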