Stefan Schweter PRO
stefan-it
AI & ML interests
Flair Library, NER & PoS Tagging, LM Pretraining (mostly encoder-only), Historical Language Models
Recent Activity
upvoted an article about 10 hours ago
FineWeb2-C: Help Build Better Language Models in Your Language
upvoted a paper about 15 hours ago
GEITje 7B Ultra: A Conversational Model for Dutch
upvoted a paper about 20 hours ago
BabyHGRN: Exploring RNNs for Sample-Efficient Training of Language Models
Articles
Organizations
Posts
1
My latest project is the outcome of the last 2+ years working with TPUs from the amazing TPU Research Cloud (TRC) program and training Encoder-only LMs with the TensorFlow Model Garden library.
👉 Link: https://github.com/stefan-it/model-garden-lms
An overview of some features:
- Cheatsheet for setting up a TPU VM Pod (with all necessary dependencies) to pretrain LMs with TF Model Garden
- Conversion scripts that convert TF Model Garden weights to Hugging Face Transformers-compatible models
- Supported architectures include BERT, BERT with Token Dropping and TEAMS
I also released BERT-based models pretrained on the great Hugging Face FineWeb and FineWeb-Edu datasets (10BT subset). With more to come!
👉 Model Hub Link: https://huggingface.co/model-garden-lms
If you find these resources useful, please give them a like!
Made from Bavarian Oberland with ❤️ and 🥨.
Collections
14
My pretrained LMs on FineWeb datasets - part of my TensorFlow Model Garden LMs project
A Collection of Historical Multilingual Language Models
- dbmdz/bert-base-historic-multilingual-cased (Fill-Mask • Updated • 81 • 6)
- dbmdz/bert-base-historic-multilingual-64k-td-cased (Fill-Mask • Updated • 89 • 1)
- hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax (Text2Text Generation • Updated • 29)
- hmteams/teams-base-historic-multilingual-discriminator (Updated • 16)
models
1333
- stefan-it/span-marker-gelectra-large-germeval14 (Token Classification • Updated • 2.07k • 2)
- stefan-it/zeitungs-lm-v1 (Updated • 108 • 3)
- stefan-it/wav2vec2-large-xlsr-53-basque (Automatic Speech Recognition • Updated • 32)
- stefan-it/german-gpt2-larger (Text Generation • Updated • 486 • 8)
- stefan-it/xlstm-german-wikipedia (Text Generation • Updated • 184 • 7)
- stefan-it/flair-barner-wiki-coarse-gbert-large (Token Classification • Updated • 9 • 1)
- stefan-it/flair-clean-conll-5 (Token Classification • Updated • 14)
- stefan-it/flair-clean-conll-4 (Token Classification • Updated • 3)
- stefan-it/flair-clean-conll-3 (Token Classification • Updated • 8)
- stefan-it/flair-clean-conll-2 (Token Classification • Updated • 3)
datasets
12
- stefan-it/senti-anno (Viewer • Updated • 929 • 113)
- stefan-it/offenseval2020_tr (Viewer • Updated • 35.3k • 338)
- stefan-it/dewiki-20230701-nltk-corpus (Viewer • Updated • 39.4M • 64 • 2)
- stefan-it/germeval14_no_wikipedia (Preview • Updated • 68)
- stefan-it/histnero (Viewer • Updated • 217k • 55)
- stefan-it/HisGermaNER (Preview • Updated • 186 • 2)
- stefan-it/co-funer (Preview • Updated • 85)
- stefan-it/german-dbmdz-bert-corpus (Viewer • Updated • 52.8M • 108 • 2)
- stefan-it/span-marker-base-model-detection (Viewer • Updated • 28 • 48)
- stefan-it/flair-base-model-detection (Viewer • Updated • 52 • 39 • 1)