Tags: Fill-Mask · Transformers · Safetensors · Japanese · English · modernbert · Inference Endpoints
hpprc committed (verified) · commit ca9f86e · 1 parent: a28761f

Update README.md

Files changed (1): README.md (+1 −0)
README.md CHANGED
@@ -181,6 +181,7 @@ For datasets with predefined `train`, `validation`, and `test` sets, we simply t
 The evaluation results are shown in the table.
 `#Param.` represents the number of parameters in both the input embedding layer and the Transformer layers, while `#Param. w/o Emb.` indicates the number of parameters in the Transformer layers only.
 
+According to our evaluation results, our ModernBERT-Ja-310M achieves **state-of-the-art** performance across the evaluation tasks, even when compared with much larger models.
 Despite being a long-context model capable of processing sequences of up to 8,192 tokens, our ModernBERT-Ja-310M also exhibited strong performance in short-sequence evaluations.
 
 ## Ethical Considerations
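For context, the model card this commit edits describes a Japanese fill-mask model. Below is a minimal usage sketch with the `transformers` pipeline; the Hub repository ID `sbintuitions/modernbert-ja-310m` is an assumption inferred from the model name, and ModernBERT architectures require a recent `transformers` release (4.48 or later).

```python
# Minimal fill-mask sketch for ModernBERT-Ja-310M.
# The Hub ID below is an assumption; adjust it to the actual repository.
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="sbintuitions/modernbert-ja-310m",  # assumed repository ID
)

# Build the prompt with the tokenizer's own mask token instead of hard-coding it.
text = f"日本の首都は{fill_mask.tokenizer.mask_token}です。"  # "The capital of Japan is <mask>."

# Print the top predicted fillers with their scores.
for pred in fill_mask(text):
    print(f"{pred['token_str']}\t{pred['score']:.4f}")
```

Using `fill_mask.tokenizer.mask_token` keeps the sketch correct regardless of which mask token string the checkpoint's tokenizer defines.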