A small version of DeBERTa trained on the cleaned variant of Google's C4 dataset. For details on the model's size, see config.json.
The model was trained for 100K steps with a batch size of 2,048 and a sequence length of 512, for a total of roughly 104B tokens (100,000 × 2,048 × 512 ≈ 1.05 × 10¹¹).
The vocabulary and tokenizer are the same as those of microsoft/deberta-base.
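
As a minimal usage sketch with the standard `transformers` Auto classes (the `model_id` below is a placeholder, since this card does not state the repository ID; the tokenizer can be loaded from either repo because it matches microsoft/deberta-base):

```python
from transformers import AutoTokenizer, AutoModel

# Placeholder -- substitute this repository's actual model ID.
model_id = "<this-model-id>"

# The tokenizer is identical to microsoft/deberta-base, so either source works.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")
model = AutoModel.from_pretrained(model_id)

# Encode a sample sentence and run a forward pass.
inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model(**inputs)

# Hidden size follows the values in config.json.
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```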