Topical is a small language model specialized in topic extraction. Given a document, Pleias-Topic-Deduction returns a main topic that can be used for further downstream tasks (annotation, embedding indexation).

Like other models from the PleIAs Bad Data Toolbox, Topical has been deliberately trained on 70,000 documents extracted from Common Corpus with a wide range of digitization artifacts.

Topical is a lightweight model (70 million parameters) that is especially suited for classification at scale on large corpora.

Example
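A minimal usage sketch with the Hugging Face transformers library. The `text2text-generation` task, the `generated_text` output field, and the `extract_topic` helper are assumptions for illustration, based on Topical being a finetune of the seq2seq model google-t5/t5-small; the card itself does not document the exact calling convention.

```python
def extract_topic(document: str, max_new_tokens: int = 32) -> str:
    """Return the main topic Topical predicts for a document (sketch)."""
    # Imported lazily so the helper can be defined without loading the model.
    from transformers import pipeline

    # Topical is a T5-style seq2seq model, so the text2text pipeline applies.
    extractor = pipeline("text2text-generation", model="PleIAs/Topical")
    output = extractor(document, max_new_tokens=max_new_tokens)
    return output[0]["generated_text"].strip()

if __name__ == "__main__":
    # Deliberately noisy OCR-like input, the kind of text Topical was trained on.
    noisy_ocr = "Tlie parliarnent debated tlie new railway act and tlie coal tariffs."
    print(extract_topic(noisy_ocr))
```

Because the model is small (T5-small scale), this pipeline can be batched over a large corpus on CPU if needed.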

Model size: 60.5M parameters (F32, Safetensors)

Base model: google-t5/t5-small (Topical is finetuned from it)