---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- imagefolder
metrics:
- accuracy
model-index:
- name: vit-base-avengers-v1
  results:
  - task:
      name: Image Classification
      type: image-classification
    dataset:
      name: imagefolder
      type: imagefolder
      args: avengers-dataset
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8683385579937304
---


# vit-base-avengers-v1

This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) on a custom image dataset of Marvel characters, loaded with the `imagefolder` loader.
It achieves the following results on the evaluation set:
- Loss: 0.5324
- Accuracy: 0.8683

Refer to this [Medium article](https://medium.com/@dingusagar/marvel-character-classification-by-fine-tuning-vision-transformer-45c14a7d8719) for more information on how the model was trained.
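
For quick inference, a minimal sketch along the following lines should work, assuming the checkpoint is available on the Hugging Face Hub or in a local directory. The repository ID `your-username/vit-base-avengers-v1` and the image file name are placeholders, not values taken from this card; `top_prediction` is a hypothetical helper added for illustration.

```python
def classify(image_path, model_id="your-username/vit-base-avengers-v1"):
    """Return the image-classification pipeline's predictions for one image.

    model_id is a placeholder: replace it with the actual Hub repository
    (or a local directory) that holds this checkpoint.
    """
    # Imported lazily so top_prediction() can be used without transformers.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    # The pipeline returns a list of {"label": str, "score": float} dicts.
    return classifier(image_path)


def top_prediction(predictions):
    """Pick the highest-scoring entry from the pipeline output."""
    return max(predictions, key=lambda p: p["score"])


# Example usage (hypothetical image file):
#   best = top_prediction(classify("hulk.jpg"))
#   print(best["label"], best["score"])
```

The pipeline applies the preprocessing stored with the checkpoint, so no manual resizing or normalization should be needed.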


## Limitations
The training images were collected from Google Images using the following search terms, each of which represents one class:
Iron Man, Captain America, Thor, Spider Man, Docter Strage, Black Panther, Ant Man, Captain Marvel, Hulk, Black Widow, Hawkeye Avengers, Scarlet Witch, Vision Avengers, Bucky Barnes, Falcon Avengers, Loki

As a result, the model has mostly seen these superheroes in their suits or superhero outfits.
For example, an image of Hulk is classified correctly, but an image of Bruce Banner is not, simply because the model hasn't seen such images.
A little data augmentation would likely help.
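
As one possible starting point, the sketch below builds a set of random training-time transforms with torchvision. It is not the pipeline used for this model: the specific transforms, their parameters, and the `mean`/`std` defaults are all assumptions, and the normalization values should be read from the checkpoint's preprocessor config. Transforms like these only vary the existing images, so out-of-costume appearances would still need extra images.

```python
IMAGE_SIZE = 224  # input resolution of vit-base-patch16-224


def build_train_transforms(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)):
    """Compose random augmentations for training images.

    The mean/std defaults are assumptions; check the checkpoint's
    preprocessor config for the values it actually expects.
    """
    # Imported lazily so this module loads without torchvision installed.
    from torchvision import transforms

    return transforms.Compose([
        # Random crop covering 70-100% of the image, resized to 224x224.
        transforms.RandomResizedCrop(IMAGE_SIZE, scale=(0.7, 1.0)),
        transforms.RandomHorizontalFlip(),
        # Mild colour changes to reduce reliance on exact suit colours.
        transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std),
    ])
```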

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
- mixed_precision_training: Native AMP
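
The list above maps roughly onto `transformers.TrainingArguments` as sketched below. This is a reconstruction rather than the original training script: it assumes a single GPU (so that the per-device batch sizes equal the listed batch sizes), and `output_dir` is a placeholder.

```python
# Hyperparameters from the list above, keyed by TrainingArguments field names.
HYPERPARAMETERS = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 16,  # assumes training on a single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,
    "fp16": True,  # native AMP mixed precision; requires a CUDA GPU
}


def build_training_args(output_dir="vit-base-avengers-v1"):
    """Create TrainingArguments from the hyperparameters above.

    output_dir is a placeholder path, not taken from the original run.
    """
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```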

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 0.8183        | 1.27  | 100  | 1.0134          | 0.8464   |
| 0.2234        | 2.53  | 200  | 0.6146          | 0.8495   |
| 0.1206        | 3.8   | 300  | 0.5324          | 0.8683   |
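
The Accuracy column is the fraction of evaluation images whose highest-scoring class matches the reference label. A dependency-free sketch of that computation follows; the function name and the toy logits and labels are made up for illustration and are not taken from the evaluation set.

```python
def accuracy(logits, labels):
    """Fraction of rows whose argmax equals the reference label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    correct = 0
    for row, label in zip(logits, labels):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax
        correct += int(predicted == label)
    return correct / len(labels)


# Toy values for illustration only: three classes, four images.
toy_logits = [[2.0, 0.1, -1.0], [0.3, 0.2, 1.5], [0.0, 0.9, 0.1], [1.2, 1.1, 0.0]]
toy_labels = [0, 2, 1, 1]
print(accuracy(toy_logits, toy_labels))  # 3 of 4 correct -> 0.75
```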


### Framework versions

- Transformers 4.20.1
- PyTorch 1.11.0+cu113
- Datasets 2.3.2
- Tokenizers 0.12.1