LaBSE-Malach-Multilabel
A multilabel text classification model fine-tuned on an English subset (Malach ASR) of the Visual History Archive. It is based on LaBSE pretrained weights but uses the general Hugging Face framework rather than sentence-transformers. Input text segments averaged ~350 words.
Given an input string, the model predicts probabilities for 1063 keyword IDs from the VHA ontology, sorted by probability. When the predictions are encoded as a binary vector, probabilities >= 0.5 are typically treated as "True".
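The ranking and thresholding step can be sketched in plain Python. This is a minimal illustration, not the model's actual inference code: it assumes the classifier emits one raw logit per keyword and that probabilities come from an independent sigmoid per label, which is the usual multilabel setup but is not confirmed by this card. The function name `rank_and_binarize`, the toy logits, and the use of list positions as stand-ins for keyword IDs are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def rank_and_binarize(logits, threshold=0.5):
    """Turn per-label logits into a ranked list and a binary vector.

    Assumption: one logit per keyword, scored independently (multilabel).
    Returns (ranked, binary), where ranked is a list of
    (label_index, probability) pairs sorted by descending probability and
    binary[i] is True when label i has probability >= threshold.
    """
    probs = [sigmoid(z) for z in logits]
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    binary = [p >= threshold for p in probs]
    return ranked, binary


# Toy logits for 4 labels; the real model would produce 1063 values.
example_logits = [2.0, -1.0, 0.0, -3.0]
ranked, binary = rank_and_binarize(example_logits)

for label_index, prob in ranked:
    print(f"label {label_index}: p={prob:.3f} -> {binary[label_index]}")
```

The 0.5 cutoff is the conventional default mentioned above; a different threshold can be passed when precision or recall matters more for a given use.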
The mapping from keyword IDs to labels will be added to the repository.