Milan Straka committed
Commit 20ab64f
1 Parent(s): 41cfe08
Describe that we dropped the pooler in v1.1.

README.md CHANGED
@@ -14,9 +14,10 @@ tags:
 ## Version History
 
 - **version 1.1**: Version 1.1 was released in Jan 2024, with a change to the
-tokenizer; the model parameters were mostly kept the same, but
-were enlarged (by copying suitable rows) to correspond to
-tokenizer
+tokenizer described below; the model parameters were mostly kept the same, but
+(a) the embeddings were enlarged (by copying suitable rows) to correspond to
+the updated tokenizer, (b) the pooler was dropped (originally it was only
+randomly initialized).
 
 The tokenizer in the initial release (a) contained a hole (51959 did not
 correspond to any token), and (b) mapped several tokens (unseen during training
@@ -29,8 +30,9 @@ tags:
 mapping all tokens to a unique ID. That also required increasing the
 vocabulary size and embeddings weights (by replicating the embedding of the
 `[UNK]` token). Without finetuning, version 1.1 and version 1.0 gives exactly
-the same
+the same embeddings on any input (apart from the pooler missing in v1.1),
+and the tokens in version 1.0 that mapped to a different ID than the `[UNK]`
+token map to the same ID in version 1.1.
 
 However, the sizes of the embeddings (and LM head weights and biases) are
 different, so the weights of the version 1.1 are not compatible with the
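The tokenizer problems named in the diff (a hole at an unused ID and several tokens sharing the `[UNK]` ID) can be checked mechanically. The sketch below is illustrative only: the repository id is a placeholder, and whether the collisions surface this way depends on how the tokenizer files encode the token-to-id mapping.

```python
from collections import Counter

from transformers import AutoTokenizer

MODEL_ID = "org/model"  # placeholder: repository id of the checkpoint to inspect
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

vocab = tokenizer.get_vocab()   # maps token string -> integer id
ids = list(vocab.values())

# A "hole" is an id within the nominal range that no token maps to.
holes = sorted(set(range(max(ids) + 1)) - set(ids))

# A collision is an id shared by several distinct token strings
# (in version 1.0, some tokens unseen during training shared the `[UNK]` id).
collisions = {i: n for i, n in Counter(ids).items() if n > 1}

print("unused ids (holes):", holes)
print("ids used by more than one token:", collisions)
```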
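The "enlarged (by copying suitable rows)" step can likewise be illustrated with a short sketch. This is not the script used to produce version 1.1; the repository id, the target vocabulary size, and the output directory are placeholders, and the real change also covered the LM head weights and biases and the hole in the vocabulary. It only shows the general technique: grow the embedding matrix and fill the new rows with copies of the `[UNK]` embedding so that previously colliding tokens keep their old representation.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_ID = "org/model"      # placeholder: the version 1.0 checkpoint
NEW_VOCAB_SIZE = 52_000     # placeholder: vocabulary size of the updated tokenizer

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)

old_vocab_size = model.get_input_embeddings().weight.shape[0]
unk_id = tokenizer.unk_token_id

# Grow the input embeddings (and the tied LM head) to the new vocabulary size;
# the rows added by resize_token_embeddings are randomly initialized.
model.resize_token_embeddings(NEW_VOCAB_SIZE)

# Overwrite every newly added row with a copy of the `[UNK]` embedding, so tokens
# that previously collapsed onto `[UNK]` still produce the same outputs.
with torch.no_grad():
    embeddings = model.get_input_embeddings().weight
    embeddings[old_vocab_size:] = embeddings[unk_id]

model.save_pretrained("enlarged-model-sketch")  # placeholder output directory
```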
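Because the two versions have differently sized embeddings (and LM head), a checkpoint only works together with the tokenizer and configuration from the same version. Below is a minimal sketch of pinning one version when loading, and of instantiating the encoder without the pooler that version 1.1 no longer ships. The repository id and revision name are placeholders (the real branch, tag, or commit hash has to be taken from the repository), and `add_pooling_layer` is a keyword accepted by BERT/RoBERTa-style models in `transformers`.

```python
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "org/model"   # placeholder repository id
REVISION = "main"        # placeholder git revision (branch, tag, or commit hash)

# Load the tokenizer and the weights from the same revision, so the vocabulary
# size matches the shapes of the embedding matrix and LM head.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision=REVISION)
model = AutoModel.from_pretrained(MODEL_ID, revision=REVISION,
                                  add_pooling_layer=False)

# Without a pooler, the useful output is the per-token hidden states.
inputs = tokenizer("Hello world", return_tensors="pt")
hidden_states = model(**inputs).last_hidden_state
print(hidden_states.shape)
```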