---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: zero-shot-image-classification
widget:
- src: https://huggingface.co./lhaas/StreetCLIP/resolve/main/nagasaki.jpg
  candidate_labels: China, South Korea, Japan, Philippines, Taiwan, Vietnam, Cambodia
  example_title: Countries
- src: https://huggingface.co./lhaas/StreetCLIP/resolve/main/sanfrancisco.jpeg
  candidate_labels: San Jose, San Diego, Los Angeles, Las Vegas, San Francisco, Seattle
  example_title: Cities
- src: https://huggingface.co./lhaas/StreetCLIP/resolve/main/australia.jpeg
  candidate_labels: >-
    tropical climate, dry climate, temperate climate, continental climate,
    polar climate
  example_title: Climate
library_name: transformers
tags:
- geolocalization
- geolocation
- geographic
- street
- climate
- clip
- urban
- rural
---

# Model Card for StreetCLIP

## Model Details

### Model Description
- Developed by: Authors not disclosed
- Model type: CLIP
- Language: English
- License: Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0)
- Finetuned from model: openai/clip-vit-large-patch14-336

### Model Sources
- Paper: Preprint available soon.
- Demo: Currently in development.

## Uses

### Direct Use
[More Information Needed]
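
As a sketch of one direct use, the checkpoint can be queried through the `zero-shot-image-classification` pipeline declared in the metadata above; the image URL and candidate labels below are taken from the widget examples and are purely illustrative.

```python
from transformers import pipeline

# Zero-shot geolocalization: score a street-level image against candidate countries.
classifier = pipeline("zero-shot-image-classification", model="lhaas/StreetCLIP")

url = "https://huggingface.co./lhaas/StreetCLIP/resolve/main/nagasaki.jpg"
labels = ["China", "South Korea", "Japan", "Philippines", "Taiwan", "Vietnam", "Cambodia"]

# The pipeline returns one {"label", "score"} dict per candidate, best match first.
results = classifier(url, candidate_labels=labels)
print(results[0]["label"], results[0]["score"])
```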

### Downstream Use
[More Information Needed]

### Out-of-Scope Use
[More Information Needed]

## Bias, Risks, and Limitations
[More Information Needed]

### Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model
Use the code below to get started with the model.
```python
from PIL import Image
import requests

from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("lhaas/StreetCLIP")
processor = CLIPProcessor.from_pretrained("lhaas/StreetCLIP")

url = "https://huggingface.co./lhaas/StreetCLIP/resolve/main/sanfrancisco.jpeg"
image = Image.open(requests.get(url, stream=True).raw)

choices = ["San Jose", "San Diego", "Los Angeles", "Las Vegas", "San Francisco"]
inputs = processor(text=choices, images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # image-text similarity scores
probs = logits_per_image.softmax(dim=1)  # softmax over the candidate labels
```
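
To turn the probabilities into a prediction, pair them with the candidate labels; this is a minimal continuation of the snippet above.

```python
# Report the most likely candidate and the full probability distribution.
predicted = choices[probs.argmax(dim=1).item()]
print(f"Predicted city: {predicted}")
for label, p in zip(choices, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```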

## Training Details

### Training Data
[More Information Needed]

### Training Procedure

#### Preprocessing
[More Information Needed]

#### Speeds, Sizes, Times
[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
[More Information Needed]

#### Factors
[More Information Needed]

#### Metrics
[More Information Needed]

### Results
[More Information Needed]

#### Summary

## Model Examination
[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- Hardware Type: 4 NVIDIA A100 GPUs
- Hours used: 12
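
As an illustrative back-of-envelope estimate from the figures above (the per-GPU power draw and grid carbon intensity are assumptions, not reported values):

```python
# Rough CO2e sketch: 4 A100s for 12 hours; all rates below are assumed, not measured.
gpus = 4
hours = 12.0
watts_per_gpu = 400.0  # assumed average draw per A100 (close to its 400 W TDP)
kg_co2_per_kwh = 0.4   # assumed grid carbon intensity

energy_kwh = gpus * hours * watts_per_gpu / 1000.0  # 19.2 kWh
emissions_kg = energy_kwh * kg_co2_per_kwh          # ~7.7 kg CO2e
print(f"{energy_kwh:.1f} kWh, ~{emissions_kg:.1f} kg CO2e")
```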

## Example Image Attribution

[More Information Needed]

## Citation
BibTeX:
[More Information Needed]
APA:
[More Information Needed]