---
title: README
emoji: 📚
colorFrom: indigo
colorTo: purple
sdk: static
pinned: false
---
# hmBERT
Historical Multilingual Language Models for Named Entity Recognition. The following languages are covered by hmBERT:
* English (British Library Corpus - Books)
* German (Europeana Newspaper)
* French (Europeana Newspaper)
* Finnish (Europeana Newspaper)
* Swedish (Europeana Newspaper)
More details can be found in [our GitHub repository](https://github.com/dbmdz/clef-hipe) and in our
[hmBERT paper](https://ceur-ws.org/Vol-3180/paper-87.pdf).
# Leaderboard
We evaluate our pretrained language models on various datasets from HIPE-2020, HIPE-2022 and Europeana.
The following table gives an overview of the datasets used:
| Language | Datasets |
|----------|------------------------------------------------------------------|
| English | [AjMC] - [TopRes19th] |
| German | [AjMC] - [NewsEye] - [HIPE-2020] |
| French | [AjMC] - [ICDAR-Europeana] - [LeTemps] - [NewsEye] - [HIPE-2020] |
| Finnish | [NewsEye] |
| Swedish | [NewsEye] |
| Dutch | [ICDAR-Europeana] |
[AjMC]: https://github.com/hipe-eval/HIPE-2022-data/blob/main/documentation/README-ajmc.md
[NewsEye]: https://github.com/hipe-eval/HIPE-2022-data/blob/main/documentation/README-newseye.md
[TopRes19th]: https://github.com/hipe-eval/HIPE-2022-data/blob/main/documentation/README-topres19th.md
[ICDAR-Europeana]: https://github.com/stefan-it/historic-domain-adaptation-icdar
[LeTemps]: https://github.com/hipe-eval/HIPE-2022-data/blob/main/documentation/README-letemps.md
[HIPE-2020]: https://github.com/hipe-eval/HIPE-2022-data/blob/main/documentation/README-hipe2020.md
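When scripting evaluation runs over these datasets, the overview table can be written down as a plain mapping. The following is only a minimal sketch: the names `DATASETS`, `languages_for` and `evaluation_pairs` are hypothetical helpers made up for illustration and are not part of the hmBERT code base; the entries simply mirror the table above.

```python
from __future__ import annotations

# Language -> datasets, copied from the overview table above.
# Illustrative only; this mapping is not shipped with hmBERT.
DATASETS: dict[str, list[str]] = {
    "English": ["AjMC", "TopRes19th"],
    "German": ["AjMC", "NewsEye", "HIPE-2020"],
    "French": ["AjMC", "ICDAR-Europeana", "LeTemps", "NewsEye", "HIPE-2020"],
    "Finnish": ["NewsEye"],
    "Swedish": ["NewsEye"],
    "Dutch": ["ICDAR-Europeana"],
}


def languages_for(dataset: str) -> list[str]:
    """Return the languages listed for a dataset, in table order."""
    return [lang for lang, names in DATASETS.items() if dataset in names]


def evaluation_pairs() -> list[tuple[str, str]]:
    """Flatten the table into (language, dataset) pairs."""
    return [(lang, name) for lang, names in DATASETS.items() for name in names]


if __name__ == "__main__":
    print(languages_for("NewsEye"))
    for language, dataset in evaluation_pairs():
        print(f"{language}: {dataset}")
```

A flat list of pairs like this is convenient for looping over one fine-tuning or evaluation job per language and dataset combination.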
All results can be found in the [`hmLeaderboard`](https://huggingface.co/spaces/hmbench/hmLeaderboard).
# Acknowledgements
We thank [Luisa März](https://github.com/LuisaMaerz), [Katharina Schmid](https://github.com/schmika) and
[Erion Çano](https://github.com/erionc) for their fruitful discussions about Historical Language Models.
Research supported with Cloud TPUs from Google's [TPU Research Cloud](https://sites.research.google/trc/about/) (TRC).
Many thanks for providing access to the TPUs ❤️