---
tags:
- flair
- token-classification
- sequence-tagger-model
language: de
---

# REDEWIEDERGABE Tagger: reported STWR

This model is part of an ensemble of binary taggers that recognize speech, thought and writing representation (STWR) in German texts. The ensemble is used in [LLpro](https://github.com/cophi-wue/LLpro). The taggers automatically detect and annotate the following four types of STWR:

| STWR type                      | Example                                                                     | Translation                                                 |
|--------------------------------|-----------------------------------------------------------------------------|-------------------------------------------------------------|
| direct                         | Dann sagte er: **"Ich habe Hunger."**                                       | Then he said: **"I'm hungry."**                             |
| free indirect ('erlebte Rede') | Er war ratlos. **Woher sollte er denn hier bloß ein Mittagessen bekommen?** | He was at a loss. **Where should he ever find lunch here?** |
| indirect                       | Sie fragte, **wo das Essen sei.**                                           | She asked **where the food was.**                           |
| reported (**this tagger**)     | **Sie sprachen über das Mittagessen.**                                      | **They talked about lunch.**                                |

The ensemble is trained on the [REDEWIEDERGABE corpus](https://github.com/redewiedergabe/corpus) ([annotation guidelines](http://redewiedergabe.de/richtlinien/richtlinien.html)). Each tagger is fine-tuned from the domain-adapted [lkonle/fiction-gbert-large](https://huggingface.co./lkonle/fiction-gbert-large) ([training code](https://github.com/cophi-wue/LLpro/blob/main/contrib/train_redewiedergabe.py)).

**F1-Scores:**

| STWR type                  | F1-Score  |
|----------------------------|-----------|
| direct                     | 90.76     |
| indirect                   | 79.16     |
| free indirect              | 58.00     |
| **reported (this tagger)** | **70.47** |

----

**Demo Usage:**

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# One input text containing an example of each STWR type.
sentence = Sentence('Sie sprachen über das Mittagessen. Sie fragte, wo das Essen sei. Woher sollte er das wissen? Dann sagte er: "Ich habe Hunger."')

# Run each binary tagger of the ensemble in turn and print the tokens it marks.
rwtypes = ['direct', 'indirect', 'freeindirect', 'reported']
for rwtype in rwtypes:
    model = SequenceTagger.load(f'aehrm/redewiedergabe-{rwtype}')
    model.predict(sentence)
    print(rwtype, [ x.data_point.text for x in sentence.get_labels() ])
# >>> direct ['"', 'Ich', 'habe', 'Hunger', '.', '"']
# >>> indirect ['wo', 'das', 'Essen', 'sei', '.']
# >>> freeindirect ['Woher', 'sollte', 'er', 'das', 'wissen', '?']
# >>> reported ['Sie', 'sprachen', 'über', 'das', 'Mittagessen', '.', 'Woher', 'sollte', 'er', 'das', 'wissen', '?']
```

**Cite**:

Please cite the following paper when using this model.

```
@inproceedings{ehrmanntraut-et-al-llpro-2023,
    address = {Ingolstadt, Germany},
    title = {{LLpro}: A Literary Language Processing Pipeline for {German} Narrative Text},
    booktitle = {Proceedings of the 10th Conference on Natural Language Processing ({KONVENS} 2022)},
    publisher = {{KONVENS} 2023 Organizers},
    author = {Ehrmanntraut, Anton and Konle, Leonard and Jannidis, Fotis},
    year = {2023},
}
```
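
**Grouping tagged tokens into passages (sketch):**

The demo prints the marked tokens as one flat list, so two separate reported passages appear to run together. The following is a minimal sketch of how consecutive marked tokens could be merged into contiguous spans. It assumes that a per-token boolean flag has already been derived from the tagger output. The helper name `group_tagged_spans` is hypothetical and not part of flair or LLpro, and the token list and flagged positions are written by hand to mirror the `reported` output in the demo rather than produced by running the model.

```python
from typing import List, Tuple


def group_tagged_spans(tokens: List[str], flags: List[bool]) -> List[Tuple[int, int, str]]:
    """Merge runs of consecutive flagged tokens into (start, end, text) spans.

    `end` is exclusive. Hypothetical helper, not part of flair or LLpro.
    """
    if len(tokens) != len(flags):
        raise ValueError("tokens and flags must have the same length")
    spans = []
    start = None  # index where the currently open run began, if any
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i  # a new run begins
        elif not flagged and start is not None:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None  # the run ended at the previous token
    if start is not None:
        # the last run extends to the end of the token list
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Hand-written tokens for the first three sentences of the demo text.
tokens = ['Sie', 'sprachen', 'über', 'das', 'Mittagessen', '.',
          'Sie', 'fragte', ',', 'wo', 'das', 'Essen', 'sei', '.',
          'Woher', 'sollte', 'er', 'das', 'wissen', '?']
# Assumed positions of the tokens marked as "reported" in the demo output.
reported_positions = {0, 1, 2, 3, 4, 5, 14, 15, 16, 17, 18, 19}
flags = [i in reported_positions for i in range(len(tokens))]

spans = group_tagged_spans(tokens, flags)
for start, end, text in spans:
    print(start, end, text)
```

Joining tokens with single spaces is a simplification; where exact surface text matters, character offsets from the tokenizer would be the better basis.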