---
library_name: transformers
tags: []
---

# Model Card for Sardegna-ViT

This model is a fine-tuned image classifier based on [vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224).

It takes as input a Google Maps Street View image of a road and returns a walkability score from 0 (worst) to 4 (best).

# How to Use

Load the model with the following code:

```python
from transformers import AutoModelForImageClassification

# Load the fine-tuned checkpoint with a 5-class head (walkability scores 0 to 4).
model = AutoModelForImageClassification.from_pretrained("AEnigmista/Sardegna-ViT", num_labels=5, ignore_mismatched_sizes=True)
```
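
Once the model is loaded, a score can be obtained by preprocessing an image and taking the class with the highest logit. The following is a minimal sketch, not the author's pipeline: the helper names `score_from_logits` and `predict_walkability` are hypothetical, and it assumes that the image processor of the base checkpoint `google/vit-base-patch16-224` matches the preprocessing used during fine-tuning and that the class index equals the walkability score.

```python
def score_from_logits(logits):
    """Return the index of the largest logit, read here as the walkability score (0 to 4)."""
    return max(range(len(logits)), key=lambda i: logits[i])


def predict_walkability(image_path,
                        model_id="AEnigmista/Sardegna-ViT",
                        processor_id="google/vit-base-patch16-224"):
    """Predict a walkability score for one image file (hypothetical helper)."""
    # Heavy imports are kept local so the pure helper above works without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    # Assumption: the base checkpoint's processor is suitable for this model.
    processor = AutoImageProcessor.from_pretrained(processor_id)
    model = AutoModelForImageClassification.from_pretrained(
        model_id, num_labels=5, ignore_mismatched_sizes=True
    )
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, 5)
    return score_from_logits(logits[0].tolist())
```

For example, `predict_walkability("street.jpg")` would return an integer from 0 to 4, where `street.jpg` is a placeholder path to a Street View image.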

For more information on the code, please visit the [GitHub repository](https://github.com/MatteoMocci/Most-Walkability-AI).

# Training Hyper-parameters

This version was trained with the following hyper-parameters:

- fp16 = True
- batch size = 32
- epochs = 10
- learning rate = 1e-4
- optimizer = 'adamw_hf'
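
The values listed above map naturally onto the keyword arguments of `transformers.TrainingArguments`. The sketch below is an assumption about how they could be passed, not the author's training script: it treats the batch size as a per-device value, and `build_training_args` and the `output_dir` default are hypothetical. Depending on the installed `transformers` version, `adamw_hf` may be deprecated or unavailable, and `fp16=True` generally requires a GPU.

```python
# Hyper-parameters from this model card, expressed as TrainingArguments keywords.
TRAINING_KWARGS = {
    "fp16": True,                        # mixed-precision training
    "per_device_train_batch_size": 32,   # assumption: 32 is the per-device batch size
    "num_train_epochs": 10,
    "learning_rate": 1e-4,
    "optim": "adamw_hf",
}


def build_training_args(output_dir="./results"):
    """Build a TrainingArguments object from the values above (hypothetical helper)."""
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```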

# Metrics

The metrics used for evaluation are accuracy, recall, precision, MSE, the confusion matrix and a custom metric called one_out. The one_out_accuracy uses the confusion matrix to count how many predictions fall within 1 of the ground truth: a prediction of label 2 is counted as correct if the ground truth is 1, 2 or 3, and incorrect if it is 0 or 4. Since each label is a walkability score, this metric shows how many predictions are correct or close to the expected value and, therefore, how many are far off (for example, a street with a walkability score of 0 predicted as 4).
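
The one_out idea can be sketched in a few lines of plain Python. This is an illustrative version written from the description above, not the implementation in the repository; the function names and the convention that rows are ground-truth labels and columns are predictions are assumptions.

```python
def confusion_matrix(y_true, y_pred, num_labels=5):
    """Count (ground truth, prediction) pairs; rows are ground truth, columns are predictions."""
    matrix = [[0] * num_labels for _ in range(num_labels)]
    for truth, pred in zip(y_true, y_pred):
        matrix[truth][pred] += 1
    return matrix


def one_out_accuracy(matrix):
    """Fraction of predictions whose label is within 1 of the ground truth."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    close = sum(
        count
        for truth, row in enumerate(matrix)
        for pred, count in enumerate(row)
        if abs(truth - pred) <= 1  # the diagonal and its two neighbouring diagonals
    )
    return close / total


# Six made-up predictions: four are exact or off by one, two are further away.
y_true = [0, 1, 2, 3, 4, 0]
y_pred = [0, 2, 2, 1, 4, 4]
print(one_out_accuracy(confusion_matrix(y_true, y_pred)))  # 4 of 6 predictions are within 1
```

Working from the confusion matrix means the metric only sums the main diagonal and the two diagonals next to it, so it can be computed alongside plain accuracy at no extra cost.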