# Easy German GPT2 Model
A language model for German Easy Language ("leichte Sprache"), based on the German GPT-2 model.
## Model Details
Initialized with the weights of the German GPT-2 model, then fine-tuned for one epoch on "leichte Sprache" corpora consisting of:
- encyclopedia-like data
- news-like data
Hyperparameters used for fine-tuning (a minimal training sketch follows the list):

Tokenizer:
- max_length: 1024 (but trained with dynamic length, using the collator function's `pad_to_multiple_of=8`)
- stride: 64
- return_overflowing_tokens=True
Training arguments:
- num_train_epochs=1
- learning_rate=1e-3
- weight_decay=0.01
- per_device_train_batch_size=4
- gradient_accumulation_steps=4
- warmup_steps=200
- fp16=True
→ 25,112 training items in total, trained on a Google Colab GPU (~30 min)
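For reference, here is a minimal fine-tuning sketch with these settings, built on the Hugging Face `Trainer`. The base checkpoint `dbmdz/german-gpt2` and the `raw_dataset` variable are assumptions standing in for the actual base model and "leichte Sprache" corpus:

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Assumption: dbmdz/german-gpt2 stands in for the German GPT-2 checkpoint.
base_model = "dbmdz/german-gpt2"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base_model)

def tokenize(batch):
    # Long texts are split into overlapping 1024-token chunks.
    out = tokenizer(
        batch["text"],
        max_length=1024,
        truncation=True,
        stride=64,
        return_overflowing_tokens=True,
    )
    out.pop("overflow_to_sample_mapping", None)  # not needed for training
    return out

# `raw_dataset` is a placeholder for the "leichte Sprache" corpus.
tokenized = raw_dataset.map(
    tokenize, batched=True, remove_columns=raw_dataset.column_names
)

# Dynamic padding to a multiple of 8 keeps fp16 tensor cores efficient.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False, pad_to_multiple_of=8)

args = TrainingArguments(
    output_dir="easy-german-gpt2",
    num_train_epochs=1,
    learning_rate=1e-3,
    weight_decay=0.01,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    warmup_steps=200,
    fp16=True,  # requires a CUDA GPU
)

Trainer(
    model=model, args=args, train_dataset=tokenized, data_collator=collator
).train()
```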
## Evaluation results
The perplexity values are calculated on an unseen dataset containing manually aligned standard German and "leichte Sprache" texts.
The sliding-window method described in this tutorial was used for the calculation, with the following values:
- max_length = 512
- stride = 256
For comparison, running the modified function on this example yields a perplexity score of 18.2551.
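A minimal sketch of that sliding-window perplexity calculation, adapted from the Hugging Face perplexity tutorial; the checkpoint ID and the `text` argument are placeholders for the evaluated model and test corpus:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
# Placeholder checkpoint; swap in the model to evaluate.
model = AutoModelForCausalLM.from_pretrained("dbmdz/german-gpt2").to(device)
tokenizer = AutoTokenizer.from_pretrained("dbmdz/german-gpt2")

def perplexity(text, max_length=512, stride=256):
    encodings = tokenizer(text, return_tensors="pt")
    seq_len = encodings.input_ids.size(1)

    nlls = []
    prev_end_loc = 0
    for begin_loc in range(0, seq_len, stride):
        end_loc = min(begin_loc + max_length, seq_len)
        trg_len = end_loc - prev_end_loc  # tokens not scored in an earlier window
        input_ids = encodings.input_ids[:, begin_loc:end_loc].to(device)
        target_ids = input_ids.clone()
        target_ids[:, :-trg_len] = -100  # mask pure-context tokens from the loss

        with torch.no_grad():
            nlls.append(model(input_ids, labels=target_ids).loss)

        prev_end_loc = end_loc
        if end_loc == seq_len:
            break

    return torch.exp(torch.stack(nlls).mean())
```

With a stride of 256 and windows of 512, each pass rescores only the tokens the previous window has not covered, so every token is predicted with at least 256 tokens of preceding context (except at the very start of the text).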
| Model | Perplexity "leichte Sprache" (Easy MDR News) | Perplexity standard German (Standard MDR News) |
|---|---|---|
| German GPT-2 model | 23.8257 | 24.0301 |
| Our model | 17.3053 | 48.6314 |
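To try the model, a text-generation pipeline call along these lines should work; the Hub ID below is hypothetical and must be replaced with this model's actual identifier:

```python
from transformers import pipeline

# Hypothetical Hub ID; replace with this model's actual identifier.
generator = pipeline("text-generation", model="<user>/easy-german-gpt2")
print(generator("Der Bundestag ist", max_new_tokens=40)[0]["generated_text"])
```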