German Wikipedia LMs

non-profit

AI & ML interests

language modeling

Recent Activity

stefan-it updated a model 24 days ago
gwlms/deberta-tokenizer
stefan-it updated a model 28 days ago
gwlms/roberta-tokenizer
stefan-it updated a dataset 7 months ago
gwlms/dewiki-20230701-flair-corpus

gwlms's activity

stefan-it
posted an update 16 days ago
My latest project is the outcome of the last 2+ years working with TPUs from the amazing TPU Research Cloud (TRC) program and training Encoder-only LMs with the TensorFlow Model Garden library.

👉 Link: https://github.com/stefan-it/model-garden-lms

An overview of some features:

- Cheatsheet for setting up a TPU VM Pod (with all necessary dependencies) to pretrain LMs with TF Model Garden
- Conversion scripts that convert TF Model Garden weights to Hugging Face Transformers-compatible models
- Supported architectures include BERT, BERT with Token Dropping and TEAMS

I also released BERT-based models pretrained on the great Hugging Face FineWeb and FineWeb-Edu datasets (10BT subset). With more to come!

👉 Model Hub Link: https://huggingface.co/model-garden-lms

If you find these resources useful, please give them a like!

Made from Bavarian Oberland with ❤️ and 🥨.
stefan-it
updated a Space about 1 year ago