# Easy German GPT2 Model
A language model for German Easy Language ("leichte Sprache"), based on the German GPT-2 model.
## Model Details
Initialized with the weights of the German GPT-2 model, then fine-tuned for one epoch on "leichte Sprache" corpora consisting of:
- encyclopedia-like data
- news-like data
Hyperparameters used for fine-tuning (a minimal training sketch follows the list):

Tokenizer:
- max_length: 1024 (but trained with dynamic length, using the collator function's `pad_to_multiple_of=8`)
- stride: 64
- return_overflowing_tokens=True
Training arguments:
- num_train_epochs=1
- learning_rate=1e-3
- weight_decay=0.01
- per_device_train_batch_size=4
- gradient_accumulation_steps=4
- warmup_steps=200
- fp16=True
→ 25,112 training items in total, trained on a Google Colab GPU (~30 min)
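For reference, here is a minimal fine-tuning sketch with these settings, built on the Hugging Face `Trainer`. The base checkpoint `dbmdz/german-gpt2` and the `raw_dataset` variable are assumptions standing in for the actual base model and "leichte Sprache" corpus:

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Assumption: dbmdz/german-gpt2 stands in for the German GPT-2 checkpoint.
base_model = "dbmdz/german-gpt2"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base_model)

def tokenize(batch):
    # Long texts are split into overlapping 1024-token chunks.
    out = tokenizer(
        batch["text"],
        max_length=1024,
        truncation=True,
        stride=64,
        return_overflowing_tokens=True,
    )
    out.pop("overflow_to_sample_mapping", None)  # not needed for training
    return out

# `raw_dataset` is a placeholder for the "leichte Sprache" corpus.
tokenized = raw_dataset.map(
    tokenize, batched=True, remove_columns=raw_dataset.column_names
)

# Dynamic padding to a multiple of 8 keeps fp16 tensor cores efficient.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False, pad_to_multiple_of=8)

args = TrainingArguments(
    output_dir="easy-german-gpt2",
    num_train_epochs=1,
    learning_rate=1e-3,
    weight_decay=0.01,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    warmup_steps=200,
    fp16=True,  # requires a CUDA GPU
)

Trainer(
    model=model, args=args, train_dataset=tokenized, data_collator=collator
).train()
```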
## Evaluation results
The perplexity values are calculated on an unseen dataset containing manually aligned standard German and "leichte Sprache" texts.
The sliding-window method described in this tutorial was used for the calculation, with the following values:
- max_length = 512
- stride = 256
For comparison, running the modified function on this example yields a perplexity score of 18.2551.
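A minimal sketch of that sliding-window perplexity calculation, adapted from the Hugging Face perplexity tutorial; the checkpoint ID and the `text` argument are placeholders for the evaluated model and test corpus:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
# Placeholder checkpoint; swap in the model to evaluate.
model = AutoModelForCausalLM.from_pretrained("dbmdz/german-gpt2").to(device)
tokenizer = AutoTokenizer.from_pretrained("dbmdz/german-gpt2")

def perplexity(text, max_length=512, stride=256):
    encodings = tokenizer(text, return_tensors="pt")
    seq_len = encodings.input_ids.size(1)

    nlls = []
    prev_end_loc = 0
    for begin_loc in range(0, seq_len, stride):
        end_loc = min(begin_loc + max_length, seq_len)
        trg_len = end_loc - prev_end_loc  # tokens not scored in an earlier window
        input_ids = encodings.input_ids[:, begin_loc:end_loc].to(device)
        target_ids = input_ids.clone()
        target_ids[:, :-trg_len] = -100  # mask pure-context tokens from the loss

        with torch.no_grad():
            nlls.append(model(input_ids, labels=target_ids).loss)

        prev_end_loc = end_loc
        if end_loc == seq_len:
            break

    return torch.exp(torch.stack(nlls).mean())
```

With a stride of 256 and windows of 512, each pass rescores only the tokens the previous window has not covered, so every token is predicted with at least 256 tokens of preceding context (except at the very start of the text).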
| Model | Perplexity "leichte Sprache" (Easy MDR News) | Perplexity standard German (Standard MDR News) |
|---|---|---|
| German GPT-2 model | 23.8257 | 24.0301 |
| Our model | 17.3053 | 48.6314 |
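To try the model, a text-generation pipeline call along these lines should work; the Hub ID below is hypothetical and must be replaced with this model's actual identifier:

```python
from transformers import pipeline

# Hypothetical Hub ID; replace with this model's actual identifier.
generator = pipeline("text-generation", model="<user>/easy-german-gpt2")
print(generator("Der Bundestag ist", max_new_tokens=40)[0]["generated_text"])
```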