	---
	library_name: transformers
	tags: []
	---

	# Model Card for Sardegna-ViT

	This model is a fine-tuned image classifier based on [vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224).
	It takes as input a Google Maps Street View picture showing a road and returns a walkability score from 0 (worst) to 4 (best).

	# How to Use
	Load the model with the following code:
	```python
	from transformers import AutoModelForImageClassification
	model = AutoModelForImageClassification.from_pretrained("AEnigmista/Sardegna-ViT", num_labels=5, ignore_mismatched_sizes=True)
	```
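
Once loaded, the model can score a single image. The following is only a minimal sketch, not the author's pipeline: it assumes the preprocessing of the base checkpoint (`google/vit-base-patch16-224`) is appropriate for this model, and the helper names `logits_to_score` and `predict_walkability` are hypothetical.

```python
from typing import Sequence


def logits_to_score(logits: Sequence[float]) -> int:
    """Return the index of the largest logit, i.e. the predicted score (0-4)."""
    return max(range(len(logits)), key=lambda i: logits[i])


def predict_walkability(image_path: str) -> int:
    """Hypothetical helper: predict the walkability score of one image file."""
    # Heavy dependencies are imported here so logits_to_score works without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    # Assumption: the base checkpoint's image processor matches the training setup.
    processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")
    model = AutoModelForImageClassification.from_pretrained(
        "AEnigmista/Sardegna-ViT", num_labels=5, ignore_mismatched_sizes=True
    )
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, 5)
    return logits_to_score(logits[0].tolist())
```

Calling `predict_walkability("street.jpg")` (with a path of your own) downloads both checkpoints on first use and returns an integer between 0 and 4.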

	For more information on the code, please visit the [GitHub repo](https://github.com/MatteoMocci/Most-Walkability-AI).

	# Training Hyper-parameters
	This version was trained with the following hyper-parameters:
	- fp16 = True
	- batch size = 32
	- 10 epochs
	- learning rate = 1e-4
	- optimizer = 'adamw_hf'
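
As a rough illustration, the values above could map onto `transformers.TrainingArguments` as sketched below. This is an assumption about how the settings translate, not the author's training script: the card does not say whether the batch size is per device, and `build_training_arguments` and its `output_dir` argument are hypothetical.

```python
# Keyword names follow transformers.TrainingArguments; values come from the list above.
training_kwargs = {
    "fp16": True,                       # mixed precision; needs a supported accelerator such as a CUDA GPU
    "per_device_train_batch_size": 32,  # assumption: "batch size" means per-device batch size
    "num_train_epochs": 10,
    "learning_rate": 1e-4,
    "optim": "adamw_hf",                # newer transformers releases may warn about or reject this value
}


def build_training_arguments(output_dir: str):
    """Hypothetical helper: build TrainingArguments from the settings above."""
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **training_kwargs)
```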

	# Metrics
	The metrics used for evaluation are accuracy, recall, precision, MSE, the confusion matrix and a custom metric called one_out. The one_out accuracy uses
	the confusion matrix to count how many predictions fall within 1 of the ground truth (so a prediction of 2 counts as correct if the ground truth is 1, 2 or 3, and as incorrect if it is 0 or 4).
	Since each label is a walkability score, this metric shows how many predictions are exactly right or close to the expected value and, consequently,
	how many are far off (for example, a street with a walkability score of 0 predicted as 4).
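
The one_out idea can be sketched in a few lines of plain Python. This is an illustrative reconstruction from the description above, not the code from the repository; it assumes the confusion matrix is indexed as `confusion[true_label][predicted_label]`, and the example matrix is made up.

```python
from typing import Sequence


def one_out_accuracy(confusion: Sequence[Sequence[int]]) -> float:
    """Fraction of predictions whose label is within 1 of the ground truth.

    Assumes confusion[i][j] counts samples with true label i predicted as j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    close = sum(
        count
        for true_label, row in enumerate(confusion)
        for predicted_label, count in enumerate(row)
        if abs(true_label - predicted_label) <= 1
    )
    return close / total


# Made-up 5x5 confusion matrix for scores 0..4 (10 samples per true label).
example = [
    [8, 1, 1, 0, 0],
    [1, 7, 2, 0, 0],
    [0, 2, 6, 1, 1],
    [0, 0, 1, 8, 1],
    [1, 0, 0, 2, 7],
]
print(one_out_accuracy(example))  # 47 of 50 predictions are within 1 of the truth
```

In this example the exact accuracy is 36/50, while the one_out accuracy is 47/50, so only 3 predictions are off by two or more classes.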