---
library_name: transformers
metrics:
- accuracy
base_model:
- microsoft/swin-large-patch4-window12-384-in22k
license: apache-2.0
tags:
- vision
- image-classification
model-index:
- name: cub-200-bird-classifier-swin
  results:
  - task:
      name: Image Classification
      type: image-classification
    dataset:
      name: cub-200-subset
      type: cub-200-subset
      args: default
    metrics:
    - name: validation_accuracy
      type: accuracy
      value: 0.86530
    - name: test_accuracy
      type: accuracy
      value: 0.87950
---

# Model Card for cub-200-bird-classifier-swin

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

This model was created for the "Feather in Focus!" Kaggle competition of the Applied Machine Learning course in the Information Studies Master's at the University of Amsterdam.
The goal of the competition was to apply novel approaches to achieve the highest possible accuracy on a bird classification task with 200 classes.
We were given a labeled dataset of 3,926 images and an unlabeled test set of 4,000 images.
Out of 32 groups and 1,083 submissions, we achieved the highest accuracy on the test set, with a score of 0.87950.

- **Model type:** Swin Transformer image classifier (200 classes)
- **License:** apache-2.0
- **Finetuned from model:** microsoft/swin-large-patch4-window12-384-in22k

## Training Details

### Training Data

The training data consists of a subset of the CUB-200-2011 dataset (https://paperswithcode.com/dataset/cub-200-2011); how the subset was selected is unknown.

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing

Data augmentation was applied to the training data in a custom PyTorch dataset class. Because the dataset is small, augmented images did not replace the originals: images were duplicated and the copies were augmented.
The only augmentations applied were horizontal flips and rotations (10 degrees), chosen to match the relatively homogeneous dataset.

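
The duplicate-and-augment scheme described above can be sketched as follows. This is a minimal illustration, not the competition code: the class name `DuplicateAugmentDataset`, the helper `build_augmentation`, and the choice of random (rather than fixed) flips and rotations with `p=0.5` are all assumptions. The class is written as a plain map-style dataset (`__len__` and `__getitem__`), which is the protocol a PyTorch `DataLoader` expects.

```python
from typing import Any, Callable, Sequence, Tuple


def build_augmentation():
    """Horizontal flip plus rotation of up to 10 degrees (assumed random variants)."""
    # Imported lazily so the dataset class below can be used without torchvision.
    from torchvision import transforms

    return transforms.Compose([
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.RandomRotation(degrees=10),
    ])


class DuplicateAugmentDataset:
    """Map-style dataset: the N originals followed by N augmented copies."""

    def __init__(self, samples: Sequence[Tuple[Any, int]], augment: Callable[[Any], Any]):
        self.samples = samples    # (image, label) pairs
        self.augment = augment    # callable applied only to the duplicated half

    def __len__(self) -> int:
        return 2 * len(self.samples)

    def __getitem__(self, index: int) -> Tuple[Any, int]:
        n = len(self.samples)
        if not 0 <= index < 2 * n:
            raise IndexError(index)
        image, label = self.samples[index % n]
        if index >= n:            # second half: augmented duplicate
            image = self.augment(image)
        return image, label
```

With PIL images as samples, `DuplicateAugmentDataset(samples, build_augmentation())` would double the number of training examples while keeping every original image unchanged.
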
#### Training Hyperparameters

| Hyperparameter  | Value     |
|-----------------|-----------|
| Optimizer       | AdamW     |
| Learning Rate   | 1e-4      |
| Batch Size      | 32        |
| Epochs          | 2         |
| Weight Decay    | -         |
| Scheduler       | -         |
| Mixed Precision | Torch AMP |

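
As a rough sketch of how these settings fit together in a PyTorch loop: the function name `train`, the `HPARAMS` dictionary, and the cross-entropy loss are assumptions made for illustration, not details confirmed by this card. The sketch also assumes a Hugging Face image classifier whose output exposes `.logits`. Because no weight decay is listed, none is passed, so `AdamW` falls back to its library default; whether the original run did the same is not stated.

```python
# Values taken from the table above; weight decay and scheduler are not set.
HPARAMS = {"optimizer": "AdamW", "learning_rate": 1e-4, "batch_size": 32, "epochs": 2}


def train(model, loader, device="cuda"):
    """Fine-tune `model` with AdamW and Torch AMP.

    Assumes `loader` yields (pixel_values, labels) batches and that
    `model(pixel_values=...)` returns an object with a `.logits` attribute.
    """
    import torch  # imported lazily so HPARAMS can be read without PyTorch

    use_amp = str(device).startswith("cuda")  # mixed precision only on CUDA here
    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=HPARAMS["learning_rate"])
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
    loss_fn = torch.nn.CrossEntropyLoss()  # assumed loss for 200-way classification

    for _ in range(HPARAMS["epochs"]):
        for pixel_values, labels in loader:
            pixel_values, labels = pixel_values.to(device), labels.to(device)
            optimizer.zero_grad(set_to_none=True)
            # Forward pass runs in reduced precision when AMP is enabled.
            with torch.autocast(device_type="cuda" if use_amp else "cpu", enabled=use_amp):
                logits = model(pixel_values=pixel_values).logits
                loss = loss_fn(logits, labels)
            # The scaler guards against fp16 gradient underflow; it is a no-op when disabled.
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()
    return model
```

The batch size would be applied when building the loader, for example `DataLoader(dataset, batch_size=HPARAMS["batch_size"], shuffle=True)`; shuffling is an assumption here.
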
#### Speeds, Sizes, Times [optional]

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

[More Information Needed]

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

The testing data consists of a subset of the CUB-200-2011 dataset (https://paperswithcode.com/dataset/cub-200-2011); how the subset was selected is unknown.

#### Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

[More Information Needed]

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

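
The metric listed in this card's metadata is accuracy. For single-label classification this is the fraction of images whose predicted class matches the true class. A minimal sketch follows; the function name `accuracy` is chosen for illustration and is not taken from the competition code.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predicted class ids that equal the true class ids."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```
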