---
license: mit
---

# Model description and inference

This **Image to Text** model was fine-tuned on top of a pre-trained model using a very small dataset.

- Epochs: 50
- Loss: 0.03....
- Training time: about 45 minutes

# Test

**Because the model was fine-tuned on a very small dataset, I recommend testing it with images from the dataset shown here.**

The *dataset* consists of images paired with their descriptions. For example:

```
from datasets import load_dataset

dataset = load_dataset("ybelkada/football-dataset", split="train")
```

### Using the model

```
from transformers import AutoProcessor, BlipForConditionalGeneration

processor = AutoProcessor.from_pretrained("ai-nightcoder/Image2text")
model = BlipForConditionalGeneration.from_pretrained("ai-nightcoder/Image2text")
```

# Get an image

```
example = dataset[0]
image = example["image"]
image  # displays the image when run in a notebook
```

#### Generation

```
import torch

# Use the GPU when one is available, otherwise the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
inputs = processor(images=image, return_tensors="pt").to(device)
pixel_values = inputs.pixel_values
generated_ids = model.generate(pixel_values=pixel_values, max_length=50)
generated_caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_caption)
```

**I recommend using the model in the order shown here.**
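
To caption several dataset images at once, the generation step can be wrapped in a small helper. This is a minimal sketch, not part of the original model card: the function name `caption_images` and its parameters are hypothetical, and it assumes that `processor` and `model` were loaded as in the usage section and that the model has already been moved to `device`.

```python
from typing import Any, List, Sequence


def caption_images(
    images: Sequence[Any],
    processor: Any,
    model: Any,
    device: str = "cpu",
    max_length: int = 50,
) -> List[str]:
    """Return one generated caption per image (hypothetical helper)."""
    if not images:
        return []
    # The processor accepts a list of images and stacks them into one batch.
    inputs = processor(images=list(images), return_tensors="pt").to(device)
    # Assumption: the model already lives on the same device as the inputs.
    generated_ids = model.generate(
        pixel_values=inputs.pixel_values, max_length=max_length
    )
    captions = processor.batch_decode(generated_ids, skip_special_tokens=True)
    return [caption.strip() for caption in captions]


# Example call, assuming `dataset`, `processor`, and `model` exist:
# captions = caption_images([dataset[i]["image"] for i in range(4)], processor, model)
```

Batching this way trades memory for speed, so on a CPU or a small GPU it may be safer to pass only a few images per call.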