indiejoseph committed
Commit 6284427 (1 parent: ad9a733)

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -25,9 +25,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 # bart-base-cantonese
 
-This model is a continue pre-train version of [fnlp/bart-base-chinese](https://huggingface.co/fnlp/bart-base-chinese) on filtered Cantonese common crawl dataset with 950M tokens.
+This model is a continue pre-train version of [fnlp/bart-base-chinese](https://huggingface.co/fnlp/bart-base-chinese) on filtered Cantonese common crawl dataset with 472M tokens.
 
-This tokenizer has extended the Bert tokenizer from fnlp/bart-base-chinese with 500 more Chinese characters commonly found in Cantonese
+This tokenizer has extended the Bert tokenizer from fnlp/bart-base-chinese with 100 more Chinese characters commonly found in Cantonese
 
 ## Intended uses & limitations
 