---
tags:
- image-to-text
- image-captioning
- endpoints-template
license: bsd-3-clause
library_name: generic
---

# Fork of [salesforce/BLIP](https://github.com/salesforce/BLIP) for an `image-captioning` task on 🤗 Inference Endpoints

This repository implements a `custom` task for `image-captioning` for 🤗 Inference Endpoints. The code for the customized pipeline is in [pipeline.py](https://huggingface.co/florentgbelidji/blip_captioning/blob/main/pipeline.py).

To deploy this model as an Inference Endpoint, you have to select `Custom` as the task so that the `pipeline.py` file is used. -> _double check that it is selected_

### Expected request payload

The image is sent as a base64-encoded string:

```json
{
  "image": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC...."
}
```

Below is an example of how to run a request using Python and `requests`.

## Run Request

1. Prepare an image.

```bash
wget https://huggingface.co/datasets/mishig/sample_images/resolve/main/palace.jpg
```

2. Run the request.

```python
import base64

import requests as r

ENDPOINT_URL = ""
HF_TOKEN = ""


def predict(path_to_image: str = None):
    # Raw bytes are not JSON serializable, so base64-encode the image first
    with open(path_to_image, "rb") as i:
        image = base64.b64encode(i.read()).decode("utf-8")

    payload = {
        "inputs": [image],
        "parameters": {
            "do_sample": True,
            "top_p": 0.9,
            "min_length": 5,
            "max_length": 20,
        },
    }
    response = r.post(
        ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload
    )
    return response.json()


prediction = predict(path_to_image="palace.jpg")
```

Example parameters depending on the decoding strategy:

1. Beam search

```
"parameters": {
    "num_beams": 5,
    "max_length": 20
}
```

2. Nucleus sampling

```
"parameters": {
    "num_beams": 1,
    "max_length": 20,
    "do_sample": True,
    "top_k": 50,
    "top_p": 0.95
}
```

3. Contrastive search

```
"parameters": {
    "penalty_alpha": 0.6,
    "top_k": 4,
    "max_length": 512
}
```

See the [generate()](https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/text_generation#transformers.GenerationMixin.generate) documentation for additional details.

Expected output:

```python
['buckingham palace with flower beds and red flowers']
```
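
To switch between the decoding strategies listed above without editing the request code each time, the parameter sets can be kept in one table. The following is a minimal sketch, not part of this repository: `DECODING_PRESETS` and `build_payload` are hypothetical names, and the sketch assumes the endpoint reads the `inputs` and `parameters` keys used in the request example and accepts base64-encoded image strings.

```python
import base64
import json

# Parameter sets copied from the decoding strategies listed in this README.
DECODING_PRESETS = {
    "beam_search": {"num_beams": 5, "max_length": 20},
    "nucleus_sampling": {
        "num_beams": 1,
        "max_length": 20,
        "do_sample": True,
        "top_k": 50,
        "top_p": 0.95,
    },
    "contrastive_search": {"penalty_alpha": 0.6, "top_k": 4, "max_length": 512},
}


def build_payload(image_bytes: bytes, strategy: str = "beam_search") -> dict:
    """Hypothetical helper: return a JSON-serializable request payload."""
    if strategy not in DECODING_PRESETS:
        raise ValueError(f"unknown decoding strategy: {strategy!r}")
    # base64 text can be embedded in JSON; raw bytes cannot.
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    # Copy the preset so callers can adjust it without changing the table.
    return {"inputs": [encoded], "parameters": dict(DECODING_PRESETS[strategy])}


if __name__ == "__main__":
    # Placeholder bytes stand in for the contents of an image file.
    payload = build_payload(b"placeholder image bytes", "nucleus_sampling")
    print(json.dumps(payload["parameters"]))
```

The returned dictionary can be passed as `json=payload` to `requests.post`, in place of the hand-written payload in the request example.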