Update README.md
README.md
CHANGED
@@ -17,8 +17,8 @@ Continued, on-premise, pre-training of [MedRoBERTa.nl](https://huggingface.co/CL
 
 # Data statistics
 
-* Number of tokens: 1.47B
-* Number of documents: 5.8M
+* Number of tokens: 1.47B, of which 1B from UMCU EHRs
+* Number of documents: 5.8M, of which 3.5M UMCU EHRs
 * Average number of tokens per document: 253
 * Median number of tokens per document: 124
 