This is a RoBERTa-base model trained from scratch in Spanish.

The training dataset is mc4 subsampling documents to a total of about 50 million examples. Sampling is biased towards average perplexity values (defining perplexity boundaries based on quartiles), discarding more often documents with very large values (Q4, poor quality) of very small values (Q1, short, repetitive texts).

This model has been trained for 180.000 steps (early stopped from 250k intended steps).

Please see our main card for more information.

This is part of the Flax/Jax Community Week, organised by HuggingFace and TPU usage sponsored by Google.

Team members

Downloads last month
32
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.