---
title: README
emoji: 📊
colorFrom: green
colorTo: red
sdk: static
pinned: false
license: afl-3.0
---
Useful Hugging Face (HF) resources and contributors for Dutch NLP:
## Individuals
* [Pieter Delobelle](https://huggingface.co./pdelobelle), [homepage](https://pieter.ai/) and [git](https://github.com/ipieter)
* [Bram Vanroy](https://huggingface.co./BramVanroy) and [homepage](https://bramvanroy.github.io/)
* [Robin Smits](https://huggingface.co./robinsmits) and [git](https://github.com/robinsmits)
* [Janneke van de Zwaan](https://huggingface.co./jvdzwaan/ocrpostcorrection-task-1) and [git](https://github.com/jvdzwaan)
* [Yeb Havinga](https://huggingface.co./yhavinga) and [git](https://github.com/yhavinga)
* [Wietse de Vries](https://huggingface.co./wietsedv) and [git](https://github.com/wietsedv)
* [François Remy](https://huggingface.co./FremyCompany), [homepage](http://fremycompany.com) and [git](https://github.com/FremyCompany)
* [Maarten Grootendorst](https://huggingface.co./MaartenGr), [homepage](https://www.maartengrootendorst.com/) and [git](https://github.com/MaartenGr)
* [Piek Vossen](https://vossen.info/)
## Organisations
* [University Medical Center Utrecht](https://github.com/umcu)
* [NLPtown](https://huggingface.co./nlptown) and [homepage](http://nlp.town/)
* [doc2query](https://huggingface.co./doc2query)
* [LT3, Language and Translation Technology Team, Ghent University](https://huggingface.co./LT3) and [homepage](https://lt3.ugent.be/)
* [Textgain](https://huggingface.co./textgain) and [homepage](https://www.textgain.com/)
* [ML6](https://huggingface.co./ml6team), [homepage](https://ml6.eu/) and [git](https://github.com/ml6team)
* [CLiPS](https://huggingface.co./clips), [homepage](https://www.uantwerpen.be/en/research-groups/clips/) and [git](https://github.com/clips)
* [DTAI Research Group, KU Leuven](https://huggingface.co./DTAI-KULeuven), [homepage](https://dtai.cs.kuleuven.be/) and [git](https://github.com/ML-KULeuven)
* [GroNLP](https://huggingface.co./GroNLP) and [homepage](https://www.rug.nl/research/clcg/research/cl/)
* [CLTL](https://huggingface.co./CLTL), [homepage](http://cltl.nl) and [git](https://github.com/CLTL)
* [Netherlands Forensic Institute](https://huggingface.co./NetherlandsForensicInstitute), [homepage](https://forensicinstitute.nl/) and [git](https://github.com/NetherlandsForensicInstitute)
* [Integraal Kankercentrum Nederland (IKNL)](https://github.com/iknl)
* [Erasmus Medical Informatics](https://github.com/mi-erasmusmc)
## NLP libraries for (Dutch) clinical NLP
* [clinlp](https://github.com/umcu/clinlp)
## Encoder models
* [RobBERT v2](https://huggingface.co./pdelobelle/robbert-v2-dutch-base)
* [BERTje](https://huggingface.co./GroNLP/bert-base-dutch-cased)
* [belabBERT](https://huggingface.co./jwouts/belabBERT_115k)
* [MedRoBERTa.nl](https://huggingface.co./CLTL/MedRoBERTa.nl)
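
The encoder models listed above can be tried with a fill-mask pipeline. The following is a minimal sketch, not a recommended setup: it assumes the `transformers` library (with a backend such as PyTorch) is installed and that the Hub is reachable. The helper names `build_masked_sentence` and `fill_mask_demo` and the example sentence are hypothetical, introduced here only for illustration; the model IDs are taken from the links above.

```python
# Model IDs copied from the list above.
ENCODER_MODELS = {
    "RobBERT v2": "pdelobelle/robbert-v2-dutch-base",
    "BERTje": "GroNLP/bert-base-dutch-cased",
}


def build_masked_sentence(template: str, mask_token: str) -> str:
    """Insert the model-specific mask token into a template (hypothetical helper).

    Different tokenizers use different mask tokens, so the token is passed in
    instead of being hard-coded.
    """
    return template.replace("{mask}", mask_token)


def fill_mask_demo(model_id: str,
                   template: str = "Er staat een {mask} in mijn tuin.",
                   top_k: int = 3):
    """Return the top_k (token, score) predictions for the masked position.

    Assumption: `transformers` is installed; the model is downloaded from the
    Hub on first use. The import is done lazily so that the rest of this
    module can be used without the library.
    """
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    sentence = build_masked_sentence(template, fill.tokenizer.mask_token)
    predictions = fill(sentence, top_k=top_k)
    return [(p["token_str"].strip(), p["score"]) for p in predictions]
```

Calling `fill_mask_demo(ENCODER_MODELS["BERTje"])` should return a short list of candidate words with their scores; the actual predictions depend on the model and are not shown here.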
## Decoder models
* [GPT-2 on mC4](https://huggingface.co./yhavinga/gpt2-large-dutch), [GPT-2 finetuned on ](https://huggingface.co./GroNLP/gpt2-medium-dutch-embeddings)
* [GPT-neo on mC4](https://huggingface.co./yhavinga/gpt-neo-1.3B-dutch)
* [GEITje (based on Mistral)](https://github.com/Rijgersberg/GEITje)
## Neural translation models
* [NLLB200](https://huggingface.co./facebook/nllb-200-3.3B)
* [UL2, en-nl](https://huggingface.co./yhavinga/ul2-large-en-nl), [UL2, nl-en](https://huggingface.co./yhavinga/ul2-large-dutch-english)
* [OPUS MT, en-nl](https://huggingface.co./Helsinki-NLP/opus-mt-en-nl), [OPUS MT, nl-en](https://huggingface.co./Helsinki-NLP/opus-mt-nl-en), [OPUS MT Healthcare, nl-en](https://huggingface.co./FremyCompany/opus-mt-nl-en-healthcare)
* [Llama 2 MT, nl-en](https://huggingface.co./kaitchup/Llama-2-7b-mt-Dutch-to-English)
## Datasets
* [SoNaR](https://taalmaterialen.ivdnt.org/download/tstc-sonar-corpus/)
* [COW](https://rolandschaefer.net/archives/142)
* [mC4 cleaned](https://huggingface.co./datasets/yhavinga/mc4_nl_cleaned)
* [TwNC](https://research.utwente.nl/en/publications/twnc-a-multifaceted-dutch-news-corpus)
* [Gigacorpus](http://gigacorpus.nl/)