---
tags:
- autotrain
- text-classification
- lam
- metadata
language:
- it
widget:
- text: porta a due battenti.Figure:putti.Animali:aquila.Decorazioni
- text: Elemento di decorazione architettonica a rilievo
datasets:
- biglam/cultural_heritage_metadata_accuracy
co2_eq_emissions:
  emissions: 7.171395981202868
metrics:
- f1
- accuracy
- recall
pipeline_tag: text-classification
license: mit
library_name: transformers
---
# Model Card for Cultural Heritage Metadata Accuracy Detection model
This model is trained to detect the quality of Italian cultural heritage metadata, assigning a score of `high quality` or `low quality` to input text.
The model was trained on the [Annotated dataset to assess the accuracy of the textual description of cultural heritage records](https://huggingface.co./datasets/biglam/cultural_heritage_metadata_accuracy) dataset.
> The dataset contains more than 100K textual descriptions of cultural items from Cultura Italia, the Italian National Cultural aggregator. Each description is labeled either HIGH or LOW quality, according to its adherence to the standard cataloguing guidelines provided by Istituto Centrale per il Catalogo e la Documentazione (ICCD). More precisely, each description is labeled as HIGH quality if the object and subject of the item (for which the description is provided) are both described according to the ICCD guidelines, and as LOW quality in all other cases. Most of the dataset was manually annotated, with ~30K descriptions automatically labeled as LOW quality due to their length (less than 3 tokens) or their provenance from old (pre-2012), uncurated collections. The dataset was developed to support the training and testing of ML text classification approaches for automatically assessing the quality of textual descriptions in digital Cultural Heritage repositories.
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
This model may be useful for validating metadata quality. However, before using it, it would be sensible to check:
- how it performs on your data
- whether you agree with the quality ratings assigned in the original dataset.
The model will likely be most useful in a 'human in the loop' pipeline, where it surfaces metadata records that may benefit from additional human attention, rather than being used to make automatic decisions.
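A 'human in the loop' triage step of this kind could be sketched as follows. This is a minimal illustration, not part of the model's own tooling: `flag_for_review` and `stub_classifier` are hypothetical names, the label string `"low quality"` and the `min_score` threshold are assumptions that should be checked against the labels the model actually returns, and the stub stands in for a real Transformers pipeline so the sketch runs without downloading anything.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Prediction = Dict[str, object]


def flag_for_review(
    texts: Sequence[str],
    classify: Callable[[List[str]], List[Prediction]],
    low_label: str = "low quality",  # assumed label name; check the model's config
    min_score: float = 0.5,          # assumed threshold; tune on your own data
) -> List[Tuple[str, float]]:
    """Return (text, score) pairs predicted as low quality, most confident first.

    `classify` is any callable that takes a list of strings and returns one
    dict per input with "label" and "score" keys, which is the shape a
    Transformers text-classification pipeline returns.
    """
    predictions = classify(list(texts))
    flagged = [
        (text, float(pred["score"]))
        for text, pred in zip(texts, predictions)
        if pred["label"] == low_label and float(pred["score"]) >= min_score
    ]
    # Reviewers see the most confident "low quality" predictions first.
    flagged.sort(key=lambda item: item[1], reverse=True)
    return flagged


def stub_classifier(texts: List[str]) -> List[Prediction]:
    """Hypothetical stand-in for the model: very short texts count as low quality."""
    return [
        {"label": "low quality" if len(t.split()) < 3 else "high quality", "score": 0.9}
        for t in texts
    ]
```

In practice `stub_classifier` would be replaced by the pipeline object itself, and the flagged records would be passed to a cataloguer for review instead of being corrected or rejected automatically.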
# Model Trained Using AutoTrain
- Problem type: Multi-class Classification
- Model ID: 48840118272
- CO2 Emissions (in grams): 7.1714
## Validation Metrics
- Loss: 0.085
- Accuracy: 0.972
- Macro F1: 0.972
- Micro F1: 0.972
- Weighted F1: 0.972
- Macro Precision: 0.972
- Micro Precision: 0.972
- Weighted Precision: 0.972
- Macro Recall: 0.972
- Micro Recall: 0.972
- Weighted Recall: 0.972
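These figures were reported by AutoTrain on its own validation split. To see how the model performs on your own data, accuracy and macro F1 can be recomputed on a small hand-labelled sample. The sketch below uses only plain Python; the function names and the `HIGH`/`LOW` label strings in the usage note are illustrative assumptions, not the model's actual label names.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-label F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    labels = sorted(set(y_true) | set(y_pred))
    scores: List[float] = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denominator = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the label never occurs.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)
```

For example, with gold labels `["HIGH", "HIGH", "LOW", "LOW"]` and predictions `["HIGH", "LOW", "LOW", "LOW"]`, `accuracy` gives 0.75 and `macro_f1` gives 11/15 (about 0.733). A gap like this between the two metrics can indicate that one class is handled worse than the other.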
## Usage
You can use cURL to access this model:
```
$ curl -X POST -H "Authorization: Bearer YOUR_API_KEY" -H "Content-Type: application/json" -d '{"inputs": "Elemento di decorazione architettonica a rilievo"}' https://api-inference.huggingface.co/models/davanstrien/autotrain-cultural_heritage_metadata_accuracy-48840118272
```
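The same request can be built from Python using only the standard library. This is a sketch under assumptions: `build_request` and `classify_remote` are hypothetical helper names, the URL is copied from the cURL command above, and the shape of the JSON response is not specified here, so it is returned unparsed beyond JSON decoding.

```python
import json
import urllib.request

# Endpoint copied from the cURL example above.
API_URL = (
    "https://api-inference.huggingface.co/models/"
    "davanstrien/autotrain-cultural_heritage_metadata_accuracy-48840118272"
)


def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build the POST request that the cURL command sends, without sending it."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def classify_remote(text: str, api_key: str, timeout: float = 30.0):
    """Send the request and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(text, api_key), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `classify_remote("Elemento di decorazione architettonica a rilievo", "YOUR_API_KEY")` would then perform the same call as the cURL command.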
You can also use the model locally by leveraging a Transformers [pipeline](https://huggingface.co./docs/transformers/pipeline_tutorial):
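```
from transformers import pipeline

# Load the text-classification pipeline with this model's weights
pipe = pipeline('text-classification', model='biglam/cultural_heritage_metadata_accuracy')

# Returns a list with one dict per input, each holding a `label` and a `score`
print(pipe("Elemento di decorazione architettonica a rilievo"))
```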