metadata
tags:
  - autotrain
  - vision
  - image-classification
datasets:
  - ernie-ai/autotrain-data-document-text-language-ar-en-zh
widget:
  - src: >-
      https://huggingface.co/datasets/mishig/sample_images/resolve/main/tiger.jpg
    example_title: Tiger
  - src: >-
      https://huggingface.co/datasets/mishig/sample_images/resolve/main/teapot.jpg
    example_title: Teapot
  - src: >-
      https://huggingface.co/datasets/mishig/sample_images/resolve/main/palace.jpg
    example_title: Palace
co2_eq_emissions:
  emissions: 2.2266908460523576

finetuned-MS-swin-doc-text-classifer

This model is a fine-tuned version of Microsoft’s tiny Swin Transformer, microsoft/swin-tiny-patch4-window7-224, trained on the ernie-ai/image-text-examples-ar-cn-latin-notext dataset. It achieves the following results on the evaluation set:

  • Loss: 0.267
  • Accuracy: 0.882

Model description

It is an image classification model fine-tuned to predict whether an image contains text and, if it does, whether that text is in Latin, Chinese, or Arabic script. It also classifies images that contain no text.
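As a rough illustration of how the classifier described above might be called, here is a minimal sketch using the transformers image-classification pipeline. The repository ID is an assumption inferred from the card title and owner, and `top_prediction` and `classify` are hypothetical helper names; the label strings the model returns are not stated in this card, so check them on first use.

```python
from typing import Dict, List

# Assumed repository ID, inferred from the card title and owner; adjust if it differs.
MODEL_ID = "ernie-ai/finetuned-MS-swin-doc-text-classifer"


def top_prediction(predictions: List[Dict]) -> Dict:
    """Return the highest-scoring entry from pipeline-style output.

    The pipeline yields a list of {"label": str, "score": float} dicts.
    """
    return max(predictions, key=lambda p: p["score"])


def classify(image_path: str) -> Dict:
    """Classify one image (local path or URL) and return its most likely label."""
    # Imported lazily so top_prediction works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_ID)
    return top_prediction(classifier(image_path))
```

Calling `classify("scan.png")` with a hypothetical local file would download the model on first use and return a single label and score.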

Training and evaluation data

Dataset: ernie-ai/image-text-examples-ar-cn-latin-notext

Model Trained Using AutoTrain

  • Problem type: Multi-class Classification
  • Model ID: 3338392240
  • CO2 Emissions (in grams): 2.2267

Validation Metrics

  • Loss: 0.267
  • Accuracy: 0.882
  • Macro F1: 0.862
  • Micro F1: 0.882
  • Weighted F1: 0.880
  • Macro Precision: 0.877
  • Micro Precision: 0.882
  • Weighted Precision: 0.883
  • Macro Recall: 0.856
  • Micro Recall: 0.882
  • Weighted Recall: 0.882
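
In the metrics above, micro F1, micro precision, micro recall, and accuracy all equal 0.882. That is expected for single-label multi-class classification, where each misclassified sample counts as exactly one false positive and one false negative. The sketch below shows how the three averaging schemes could be computed from a confusion matrix. The function name and the matrix in the usage note are hypothetical and are not this model's actual results.

```python
from typing import Dict, List


def f1_averages(confusion: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and macro/micro/weighted F1 from a confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    Assumes a non-empty square matrix.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = [confusion[k][k] for k in range(n)]
    support = [sum(confusion[k]) for k in range(n)]  # TP + FN per class
    predicted = [sum(row[k] for row in confusion) for k in range(n)]  # TP + FP

    # Per-class F1 = 2TP / (2TP + FP + FN) = 2TP / (support + predicted).
    f1 = [
        2 * tp[k] / (support[k] + predicted[k]) if support[k] + predicted[k] else 0.0
        for k in range(n)
    ]

    # Micro averaging pools counts over classes; pooled FP equals pooled FN.
    pooled_tp = sum(tp)
    pooled_errors = total - pooled_tp
    micro_f1 = 2 * pooled_tp / (2 * pooled_tp + 2 * pooled_errors)

    return {
        "accuracy": pooled_tp / total,
        "micro_f1": micro_f1,
        "macro_f1": sum(f1) / n,  # unweighted mean over classes
        "weighted_f1": sum(f * s for f, s in zip(f1, support)) / total,
    }
```

For example, `f1_averages([[8, 1, 1, 0], [1, 7, 0, 2], [0, 1, 9, 0], [1, 0, 0, 9]])` returns equal values for `accuracy` and `micro_f1`. The macro and weighted averages differ from each other only when class supports are unequal, which is consistent with the small gap between Macro F1 (0.862) and Weighted F1 (0.880) reported here.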