UMCU committed on
Commit 79f6d33 · verified · 1 Parent(s): 27c2cc8

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -17,8 +17,8 @@ Continued, on-premise, pre-training of [MedRoBERTa.nl](https://huggingface.co/CL
 
  # Data statistics
 
- * Number of tokens: 1.47B
- * Number of documents: 5.8M
+ * Number of tokens: 1.47B, of which 1B from UMCU EHRs
+ * Number of documents: 5.8M, of which 3.5M UMCU EHRs
  * Average number of tokens per document: 253
  * Median number of tokens per document: 124
 