---
tags:
- autotrain
- vision
- image-classification
datasets:
- ernie-ai/autotrain-data-document-text-language-ar-en-zh
widget:
- src: https://huggingface.co./datasets/mishig/sample_images/resolve/main/tiger.jpg
  example_title: Tiger
- src: https://huggingface.co./datasets/mishig/sample_images/resolve/main/teapot.jpg
  example_title: Teapot
- src: https://huggingface.co./datasets/mishig/sample_images/resolve/main/palace.jpg
  example_title: Palace
co2_eq_emissions:
  emissions: 2.2266908460523576
---

# finetuned-MS-swin-doc-text-classifer

This model is a fine-tuned version of Microsoft’s Swin Transformer tiny-sized model [microsoft/swin-tiny-patch4-window7-224](https://huggingface.co./microsoft/swin-tiny-patch4-window7-224) on the ernie-ai/image-text-examples-ar-cn-latin-notext dataset.

It achieves the following results on the evaluation set:

- Loss: 0.267
- Accuracy: 0.882

## Model description

This is an image classification model fine-tuned to predict whether an image contains text and, if it does, whether that text is in Latin, Chinese, or Arabic script. It also classifies images that contain no text.

## Training and evaluation data

Dataset: [ernie-ai/image-text-examples-ar-cn-latin-notext]

# Model Trained Using AutoTrain

- Problem type: Multi-class Classification
- Model ID: 3338392240
- CO2 Emissions (in grams): 2.2267

## Validation Metrics

- Loss: 0.267
- Accuracy: 0.882
- Macro F1: 0.862
- Micro F1: 0.882
- Weighted F1: 0.880
- Macro Precision: 0.877
- Micro Precision: 0.882
- Weighted Precision: 0.883
- Macro Recall: 0.856
- Micro Recall: 0.882
- Weighted Recall: 0.882
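
The validation metrics report micro, macro, and weighted averages side by side. As a rough illustration of how these three averages relate, the sketch below computes them for F1 from a confusion matrix. The function name `f1_averages` and the matrix layout (rows are true classes, columns are predicted classes) are assumptions made for this example; this is not the code AutoTrain used to produce the numbers above.

```python
from typing import Dict, List


def f1_averages(confusion: List[List[int]]) -> Dict[str, float]:
    """Return micro, macro, and weighted F1 for a square confusion matrix.

    Assumed layout: confusion[i][j] counts the samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)

    per_class_f1 = []
    supports = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(confusion[k]) - tp                       # true k, predicted class differs
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)
        supports.append(sum(confusion[k]))

    # Micro: pool TP/FP/FN over all classes. With one label per sample,
    # pooled FP equals pooled FN, so micro F1 reduces to accuracy.
    correct = sum(confusion[k][k] for k in range(n))
    micro = correct / total

    # Macro: unweighted mean, so every class counts equally.
    macro = sum(per_class_f1) / n

    # Weighted: mean weighted by the number of true samples in each class.
    weighted = sum(f * s for f, s in zip(per_class_f1, supports)) / total

    return {"micro": micro, "macro": macro, "weighted": weighted}
```

The same pooling argument applies to precision and recall, which is why the micro-averaged values in the list all equal the accuracy (0.882). A macro score below the micro score, as seen here, generally suggests that at least one less frequent class is classified less reliably than the more frequent ones.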
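
## Usage

A minimal inference sketch using the `image-classification` pipeline from the `transformers` library. The repository id in `MODEL_ID` is a placeholder, and the helper names `top_prediction` and `classify_image` are made up for this example; the label strings the model returns are not listed in this card, so the code does not assume any.

```python
from typing import Dict, List

# Assumption: placeholder repository id. Replace it with the id under which
# this model is actually published on the Hub.
MODEL_ID = "your-namespace/finetuned-MS-swin-doc-text-classifer"


def top_prediction(predictions: List[Dict]) -> Dict:
    """Return the highest-scoring entry from image-classification pipeline output.

    The pipeline returns a list of {"label": ..., "score": ...} dicts per image.
    """
    return max(predictions, key=lambda p: p["score"])


def classify_image(image: str, model_id: str = MODEL_ID) -> Dict:
    """Classify one image (local path or URL) and return its most likely label."""
    # Imported lazily so top_prediction works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return top_prediction(classifier(image))
```

Calling `classify_image("page.jpg")` with a real repository id should return a single dict holding the predicted label and its score. When classifying many images, it is usually better to build the pipeline once and reuse it rather than recreating it on every call as this sketch does.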