---
tags:
- autotrain
- vision
- image-classification
datasets:
- ernie-ai/autotrain-data-document-text-language-ar-en-zh
widget:
- src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/tiger.jpg
  example_title: Tiger
- src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/teapot.jpg
  example_title: Teapot
- src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/palace.jpg
  example_title: Palace
co2_eq_emissions:
  emissions: 2.2266908460523576
---
# finetuned-MS-swin-doc-text-classifer

This model is a fine-tuned version of Microsoft’s Swin Transformer tiny-sized model [microsoft/swin-tiny-patch4-window7-224](https://huggingface.co/microsoft/swin-tiny-patch4-window7-224) on the ernie-ai/image-text-examples-ar-cn-latin-notext dataset.
It achieves the following results on the evaluation set:
- Loss: 0.267
- Accuracy: 0.882

## Model description

It is an image classification model fine-tuned to predict whether an image contains text and, if so, whether that text is in Latin, Chinese, or Arabic script. It also classifies images that contain no text.
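A minimal inference sketch, assuming the model is loaded through the `transformers` image-classification pipeline. The `model_id` argument is deliberately left for the caller to supply, since this card does not state the full repository id, and the helper names `top_prediction` and `classify` are hypothetical. The label strings returned depend on the model's configuration and are not listed here.

```python
from typing import Any, Dict, List


def top_prediction(predictions: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the highest-scoring entry from an image-classification result.

    Each entry is expected to be a dict with "label" and "score" keys,
    which is the shape the transformers pipeline returns for one image.
    """
    return max(predictions, key=lambda p: p["score"])


def classify(image_path_or_url: str, model_id: str) -> Dict[str, Any]:
    """Classify one image and return its most likely class.

    model_id is the Hub repository id of this model (assumption: supplied
    by the caller, because the card does not spell it out).
    """
    # Imported lazily so top_prediction stays usable without transformers.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return top_prediction(classifier(image_path_or_url))
```

Calling `classify("page.png", model_id=...)` with a local path or an image URL would return a single `{"label": ..., "score": ...}` dict. Building the pipeline once and reusing it is preferable when classifying many images.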

## Training and evaluation data

Dataset: ernie-ai/image-text-examples-ar-cn-latin-notext

# Model Trained Using AutoTrain

- Problem type: Multi-class Classification
- Model ID: 3338392240
- CO2 Emissions (in grams): 2.2267

## Validation Metrics

- Loss: 0.267
- Accuracy: 0.882
- Macro F1: 0.862
- Micro F1: 0.882
- Weighted F1: 0.880
- Macro Precision: 0.877
- Micro Precision: 0.882
- Weighted Precision: 0.883
- Macro Recall: 0.856
- Micro Recall: 0.882
- Weighted Recall: 0.882
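
The micro-averaged scores above all equal the accuracy (0.882). That is expected for single-label multi-class classification, where every misclassified image counts once as a false positive and once as a false negative. The sketch below shows how the three F1 averages differ, using a made-up confusion matrix rather than this model's evaluation data. The function name `f1_averages` is hypothetical.

```python
from typing import Dict, List


def f1_averages(confusion: List[List[int]]) -> Dict[str, float]:
    """Compute macro, micro, and weighted F1 from a confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class_f1, supports, tp_sum = [], [], 0

    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # true k, predicted other
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true other
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)
        supports.append(tp + fn)
        tp_sum += tp

    return {
        # Unweighted mean over classes: rare classes count as much as common ones.
        "macro": sum(per_class_f1) / n,
        # Pooled counts: equals accuracy when each sample has exactly one label.
        "micro": tp_sum / total,
        # Mean over classes weighted by the number of true samples per class.
        "weighted": sum(f * s for f, s in zip(per_class_f1, supports)) / total,
    }


# Toy two-class example (illustrative numbers only).
scores = f1_averages([[8, 2], [1, 9]])
```

A macro F1 lower than the micro F1, as reported for this model (0.862 versus 0.882), typically indicates that at least one less frequent class is predicted less accurately than the others.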