This repository contains a model that generates poetry in the style of Mayakovsky from an image. The model is a fine-tuned combination of two pre-trained models: google/vit-base-patch16-224 as the encoder and AnyaSchen/rugpt3_mayak as the decoder.

To use this model, run the following:

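from PIL import Image
import requests
import torch
from transformers import AutoTokenizer, VisionEncoderDecoderModel, ViTImageProcessor

# Use a GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")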

def generate_poetry(fine_tuned_model, image, tokenizer):
    # Preprocess the image (relies on the module-level feature_extractor and device)
    pixel_values = feature_extractor(images=image, return_tensors="pt").pixel_values
    pixel_values = pixel_values.to(device)

    # Generate the poetry with the fine-tuned VisionEncoderDecoder model
    generated_tokens = fine_tuned_model.generate(
        pixel_values,
        max_length=300,
        num_beams=3,
        top_p=0.8,
        temperature=2.0,
        do_sample=True,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

    # Decode the generated tokens
    generated_poetry = tokenizer.decode(generated_tokens[0], skip_special_tokens=True)
    return generated_poetry

path = 'AnyaSchen/vit-rugpt3-medium-mayak'
fine_tuned_model = VisionEncoderDecoderModel.from_pretrained(path).to(device)
feature_extractor = ViTImageProcessor.from_pretrained(path)
tokenizer = AutoTokenizer.from_pretrained(path)

url = 'https://anandaindia.org/wp-content/uploads/2018/12/happy-man.jpg'
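# Convert to RGB so grayscale or RGBA images also match the 3-channel input the ViT encoder expects
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")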

generated_poetry = generate_poetry(fine_tuned_model, image, tokenizer)
print(generated_poetry)
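
The encoder-decoder pairing described at the top can be illustrated with a small offline sketch. It is not the training code for this model: it uses tiny, randomly initialised stand-ins with hypothetical sizes, and it assumes the decoder follows the GPT-2 architecture. It only shows how a ViT encoder and a causal language model are joined into one VisionEncoderDecoderModel with cross-attention.

```python
import torch
from transformers import (
    GPT2Config,
    ViTConfig,
    VisionEncoderDecoderConfig,
    VisionEncoderDecoderModel,
)

# Hypothetical toy sizes so the sketch runs without downloading any weights.
encoder_config = ViTConfig(
    image_size=32,
    patch_size=16,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
# Assumption: the decoder is GPT-2-like; the sizes here are illustrative only.
decoder_config = GPT2Config(
    vocab_size=100,
    n_positions=32,
    n_embd=32,
    n_layer=1,
    n_head=2,
)

# Joining the two configs marks the decoder as a decoder and enables
# cross-attention over the encoder's image patch embeddings.
config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(
    encoder_config, decoder_config
)
model = VisionEncoderDecoderModel(config=config)
model.eval()

# One forward pass: a random "image" plus a short decoder prefix.
pixel_values = torch.randn(1, 3, 32, 32)
decoder_input_ids = torch.tensor([[1, 2, 3]])
with torch.no_grad():
    outputs = model(pixel_values=pixel_values, decoder_input_ids=decoder_input_ids)

# The decoder returns one vocabulary-sized logit vector per prefix position.
print(outputs.logits.shape)
```

With real checkpoints, the same pairing would presumably be created with VisionEncoderDecoderModel.from_encoder_decoder_pretrained, passing the encoder and decoder names listed above, and then fine-tuned on image and poem pairs.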