clip-italian / clip-italian-demo

vinid committed on Jul 25, 2021

Commit 8e64654 (1 parent: 9c82e81)

introduction updates

Files changed (4)

introduction.md +10 -1
static/img/gatto_cane.png +0 -0
static/img/image_to_text.png +0 -0
static/img/text_to_image.png +0 -0

introduction.md CHANGED

@@ -9,7 +9,7 @@ is built upon the pre-trained [Italian BERT](https://huggingface.co/dbmdz/bert-b
 In building this project we kept in mind the following principles:
-+ **Novel Contributions**: We created a dataset of ~1.4 million Italian image-text pairs (**that we will share with the community**) and, to the best of our knowledge, we trained the best Italian CLIP model currently in existence;
++ **Novel Contributions**: We created an impressive dataset of ~1.4 million Italian image-text pairs (**that we will share with the community**) and, to the best of our knowledge, we trained the best Italian CLIP model currently in existence;
 + **Scientific Validity**: Claims are easy, facts are hard. That's why validation is important to assess the real impact of a model. We thoroughly evaluated our models on two tasks and made the validation reproducible for everybody.
 + **Broader Outlook**: We always kept in mind which are the possible usages and limitations of this model.
@@ -25,9 +25,18 @@ In this demo, we present two tasks:
 compute the similarity between this string of text with respect to a set of images. The webapp is going to display the images that
 have the highest similarity with the text query.
+<img src="https://huggingface.co/spaces/clip-italian/clip-italian-demo/raw/main/static/img/text_to_image.png" alt="drawing" width="95%"/>
 + *Image to Text*: This task is essentially a zero-shot image classification task. The user is asked for an image and for a set of captions/labels and CLIP
 is going to compute the similarity between the image and each label. The webapp is going to display a probability distribution over the captions.
+<img src="https://huggingface.co/spaces/clip-italian/clip-italian-demo/raw/main/static/img/image_to_text.png" alt="drawing" width="95%"/>
++ *Localization*: This is one of our **very cool** features and, to the best of our knowledge, it is a novel contribution. We can use CLIP
+to find where "something" (like a "cat") is in an image. The location of the object is computed by masking different areas of the image and looking at how the similarity to the image description changes.
+<img src="https://huggingface.co/spaces/clip-italian/clip-italian-demo/raw/main/static/img/gatto_cane.png" alt="drawing" width="95%"/>
 + *Examples & Applications*: This page showcases some interesting results we got from the model; we believe that there are
 different applications that can start from here.
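
The Image to Text and Localization tasks described in this hunk can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the demo's actual code: the commit does not show the implementation, so the function names (`caption_probabilities`, `occlusion_heatmap`), the square-patch masking scheme, the `patch`/`stride`/`fill` parameters, and the `toy_score` stand-in for a CLIP similarity function are all assumptions made for illustration.

```python
import numpy as np


def caption_probabilities(similarities, temperature=1.0):
    """Turn image-caption similarity scores into a probability distribution (softmax)."""
    logits = np.asarray(similarities, dtype=float) / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()


def occlusion_heatmap(image, score_fn, patch=2, stride=2, fill=0.0):
    """Mask one square patch at a time and record how much the score drops.

    image: array of shape (H, W) or (H, W, C).
    score_fn: hypothetical callable returning the similarity between an
        image and a fixed text query (a real demo would call CLIP here).
    Returns an (H, W) array; larger values mark regions whose masking
    reduced the similarity the most.
    """
    image = np.asarray(image, dtype=float)
    height, width = image.shape[:2]
    baseline = score_fn(image)
    drop_sum = np.zeros((height, width))
    counts = np.zeros((height, width))
    for top in range(0, height, stride):
        for left in range(0, width, stride):
            masked = image.copy()
            masked[top:top + patch, left:left + patch] = fill
            drop = baseline - score_fn(masked)
            drop_sum[top:top + patch, left:left + patch] += drop
            counts[top:top + patch, left:left + patch] += 1
    # Average the drops over every patch that covered a given pixel.
    return drop_sum / np.maximum(counts, 1)


# Toy stand-in for CLIP (assumption): the "similarity" is the mean
# brightness of one fixed region, so only that region should light up.
def toy_score(img):
    return float(img[2:4, 2:4].mean())


toy_image = np.ones((6, 6))
heatmap = occlusion_heatmap(toy_image, toy_score, patch=2, stride=2)
probabilities = caption_probabilities([2.0, 1.0, 0.1])
```

With a real model, `score_fn` would embed the masked image and compare it with the embedding of the text query; smaller strides give a smoother map at the cost of more forward passes.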

static/img/gatto_cane.png ADDED

static/img/image_to_text.png ADDED

static/img/text_to_image.png ADDED