---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: zero-shot-image-classification
widget:
- src: https://huggingface.co./lhaas/StreetCLIP/resolve/main/nagasaki.jpg
  candidate_labels: China, South Korea, Japan, Philippines, Taiwan, Vietnam, Cambodia
  example_title: Countries
- src: https://huggingface.co./lhaas/StreetCLIP/resolve/main/sanfrancisco.jpeg
  candidate_labels: San Jose, San Diego, Los Angeles, Las Vegas, San Francisco, Seattle
  example_title: Cities
- src: https://huggingface.co./lhaas/StreetCLIP/resolve/main/australia.jpeg
  candidate_labels: tropical climate, dry climate, temperate climate, continental climate, polar climate
  example_title: Climate
library_name: transformers
tags:
- geolocalization
- geolocation
- geographic
- street
- climate
- clip
- urban
- rural
---

# Model Card for StreetCLIP

# Model Details

## Model Description

- **Developed by:** Authors not disclosed
- **Model type:** [CLIP](https://openai.com/blog/clip/)
- **Language:** English
- **License:** Creative Commons Attribution Non Commercial 4.0
- **Finetuned from model:** [openai/clip-vit-large-patch14-336](https://huggingface.co./openai/clip-vit-large-patch14-336)

## Model Sources

- **Paper:** Pre-print available soon.
- **Demo:** Currently in development.

# Uses

## Direct Use

[More Information Needed]

## Downstream Use

[More Information Needed]

## Out-of-Scope Use

[More Information Needed]

# Bias, Risks, and Limitations

[More Information Needed]

## Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from PIL import Image
import requests

from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("lhaas/StreetCLIP")
processor = CLIPProcessor.from_pretrained("lhaas/StreetCLIP")

url = "https://huggingface.co./lhaas/StreetCLIP/resolve/main/sanfrancisco.jpeg"
image = Image.open(requests.get(url, stream=True).raw)

choices = ["San Jose", "San Diego", "Los Angeles", "Las Vegas", "San Francisco"]
inputs = processor(text=choices, images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # image-text similarity scores
probs = logits_per_image.softmax(dim=1)  # softmax over the scores gives label probabilities
```

A self-contained variant of this example that prints the candidate labels ranked by probability is included under *Worked Example* at the end of this card.

# Training Details

## Training Data

[More Information Needed]

## Training Procedure

### Preprocessing

[More Information Needed]

### Speeds, Sizes, Times

[More Information Needed]

# Evaluation

## Testing Data, Factors & Metrics

### Testing Data

[More Information Needed]

### Factors

[More Information Needed]

### Metrics

[More Information Needed]

## Results

[More Information Needed]

### Summary

[More Information Needed]

# Model Examination

[More Information Needed]

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** 4 NVIDIA A100 GPUs
- **Hours used:** 12

# Example Image Attribution

[More Information Needed]

# Citation

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]
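# Worked Example: Ranking Predictions

As a usage note to complement the quick-start code above, the sketch below shows one way to turn StreetCLIP's zero-shot scores into a ranked list of candidate labels. The image URL and label set are taken from the quick-start example; the ranking loop itself is illustrative and not part of the released model code.

```python
import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("lhaas/StreetCLIP")
processor = CLIPProcessor.from_pretrained("lhaas/StreetCLIP")

# Image and candidate labels from the quick-start example above.
url = "https://huggingface.co./lhaas/StreetCLIP/resolve/main/sanfrancisco.jpeg"
image = Image.open(requests.get(url, stream=True).raw)
choices = ["San Jose", "San Diego", "Los Angeles", "Las Vegas", "San Francisco"]

inputs = processor(text=choices, images=image, return_tensors="pt", padding=True)

with torch.no_grad():  # inference only, no gradients needed
    logits_per_image = model(**inputs).logits_per_image
    probs = logits_per_image.softmax(dim=1)[0]  # probabilities for the single image

# Print the candidate labels from most to least likely.
for p, label in sorted(zip(probs.tolist(), choices), reverse=True):
    print(f"{label}: {p:.3f}")
```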