hubert-tiny-wiki

This model was trained from scratch on the Wikipedia subset of Hungarian Webcorpus 2.0 using the masked language modeling (MLM) and sentence order prediction (SOP) tasks.
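For readers unfamiliar with the two objectives, the following is a toy sketch of how MLM and SOP training examples can be built. It is not the authors' data pipeline: the function names, the 15% masking rate, the `[MASK]` string, and the example sentences are all assumptions made for illustration, and the usual random/keep replacement variants of MLM are left out for brevity.

```python
import random

MASK = "[MASK]"  # assumed mask token string, for illustration only


def make_sop_pair(seg_a, seg_b, rng):
    """Return (first, second, label) for two consecutive segments.

    label 1 means the segments are in their original order,
    label 0 means they were swapped.
    """
    if rng.random() < 0.5:
        return seg_a, seg_b, 1
    return seg_b, seg_a, 0


def mask_tokens(tokens, rng, mask_prob=0.15):
    """Replace each token with MASK with probability mask_prob (assumed rate).

    Returns the masked sequence and the MLM labels: the original token at
    masked positions, None everywhere else.
    """
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


rng = random.Random(0)
seg_a = ["a", "macska", "alszik"]  # hypothetical tokenized segments
seg_b = ["a", "kutya", "ugat"]
first, second, order_label = make_sop_pair(seg_a, seg_b, rng)
masked, mlm_labels = mask_tokens(first + second, rng)
print(order_label, masked, mlm_labels)
```

The model would then be asked to recover the tokens recorded in `mlm_labels` and to predict `order_label` from the same input.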

Pre-Training Parameters:

First phase:

Training steps: 500,000
Sequence length: 128
Batch size: 1024

Second phase:

Training steps: 100,000
Sequence length: 512
Batch size: 384
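To get a feel for the scale of the two phases, the parameters above can be multiplied out. The sketch below assumes that the batch size counts sequences per step and that every sequence fills the full sequence length, so the token figures are upper bounds (padding would lower them). The `Phase` class is a hypothetical helper, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    steps: int
    seq_len: int
    batch_size: int

    @property
    def sequences(self) -> int:
        # Sequences processed over the whole phase.
        return self.steps * self.batch_size

    @property
    def token_positions(self) -> int:
        # Upper bound: assumes every sequence uses all seq_len positions.
        return self.sequences * self.seq_len


phases = [
    Phase("first", steps=500_000, seq_len=128, batch_size=1024),
    Phase("second", steps=100_000, seq_len=512, batch_size=384),
]

for p in phases:
    print(f"{p.name}: {p.sequences:,} sequences, {p.token_positions:,} token positions")
print(f"total: {sum(p.token_positions for p in phases):,} token positions")
# first: 512,000,000 sequences, 65,536,000,000 token positions
# second: 38,400,000 sequences, 19,660,800,000 token positions
# total: 85,196,800,000 token positions
```

Under these assumptions, the second phase accounts for a little under a quarter of all token positions, even though it runs for only one sixth of the total steps.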

Framework versions

Transformers 4.21.3
TensorFlow 2.10.0
Datasets 2.4.0
Tokenizers 0.12.1
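When reproducing results, it can help to check a local environment against the versions listed above. The following is a minimal sketch using only the standard library. The lowercase PyPI distribution names are assumed, and `installed_version` and `compare_versions` are hypothetical helper names.

```python
from importlib import metadata

# Versions listed in this model card; distribution names are assumed.
EXPECTED = {
    "transformers": "4.21.3",
    "tensorflow": "2.10.0",
    "datasets": "2.4.0",
    "tokenizers": "0.12.1",
}


def installed_version(name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected, lookup=installed_version):
    """Map each package name to (expected, installed, matches)."""
    report = {}
    for name, want in expected.items():
        have = lookup(name)
        report[name] = (want, have, have == want)
    return report


for name, (want, have, ok) in compare_versions(EXPECTED).items():
    status = "ok" if ok else "differs"
    print(f"{name}: expected {want}, installed {have or 'not installed'} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the check only reports differences from the versions recorded here.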

Acknowledgement

Dataset used to train SzegedAI/hubertusz-tiny-wiki
