File size: 1,577 Bytes
250d708 7b9720d d9dfe10 250d708 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 |
---
tags:
- autotrain
- vision
- image-classification
datasets:
- ernie-ai/autotrain-data-document-text-language-ar-en-zh
widget:
- src: https://huggingface.co./datasets/mishig/sample_images/resolve/main/tiger.jpg
example_title: Tiger
- src: https://huggingface.co./datasets/mishig/sample_images/resolve/main/teapot.jpg
example_title: Teapot
- src: https://huggingface.co./datasets/mishig/sample_images/resolve/main/palace.jpg
example_title: Palace
co2_eq_emissions:
emissions: 2.2266908460523576
---
# finetuned-MS-swin-doc-text-classifer
This model is a fine-tuned version of Microsoft’s Swin Transformer tiny-sized model [microsoft/swin-tiny-patch4-window7-224](https://huggingface.co./microsoft/swin-tiny-patch4-window7-224) on the ernie-ai/image-text-examples-ar-cn-latin-notext dataset.
It achieves the following results on the evaluation set:
- Loss: 0.267
- Accuracy: 0.882
## Model description
It is an image classificatin model fine-tuned to predict whether an images contains text and if that text is Latin script, Chinese or Arabic. It also classifies non-text images.
## Training and evaluation data
Dataset: [ernie-ai/image-text-examples-ar-cn-latin-notext]
# Model Trained Using AutoTrain
- Problem type: Multi-class Classification
- Model ID: 3338392240
- CO2 Emissions (in grams): 2.2267
## Validation Metrics
- Loss: 0.267
- Accuracy: 0.882
- Macro F1: 0.862
- Micro F1: 0.882
- Weighted F1: 0.880
- Macro Precision: 0.877
- Micro Precision: 0.882
- Weighted Precision: 0.883
- Macro Recall: 0.856
- Micro Recall: 0.882
- Weighted Recall: 0.882 |