metadata
language:
  - ko
  - en
license: cc-by-nc-sa-4.0
library_name: transformers

Llama-3-KoEn-8B-xtuner-llava-preview 🌋

Llama-3-KoEn-8B-xtuner-llava-preview 🌋 is a Korean-based multimodal model built on the LLaVA architecture. It was merged using the ChatVector method from two models:

  1. beomi/Llama-3-KoEn-8B-preview
  2. xtuner/llava-llama-3-8b-transformers
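
The card names the merge method but not the recipe. As a rough illustration only, the ChatVector idea is usually described as weight arithmetic: take the difference between a tuned model and the base it was tuned from, then add that difference to another model derived from the same base. The sketch below shows this on toy dictionaries of floats. The function name `apply_chat_vector`, its parameters, the `skip` argument, and the toy weights are all assumptions made for illustration, not the author's actual procedure.

```python
from typing import Dict, Iterable, List

# Toy stand-in for a state dict: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def apply_chat_vector(base: Weights, tuned: Weights, target: Weights,
                      skip: Iterable[str] = ()) -> Weights:
    """Return target + (tuned - base) for each parameter present in all three.

    Parameters listed in `skip`, or missing from `base` or `tuned`,
    are copied from `target` unchanged.
    """
    skipped = set(skip)
    merged: Weights = {}
    for name, target_w in target.items():
        if name in skipped or name not in base or name not in tuned:
            merged[name] = list(target_w)
            continue
        merged[name] = [t + (f - b)
                        for t, f, b in zip(target_w, tuned[name], base[name])]
    return merged


# Hypothetical toy weights, chosen so the arithmetic is exact.
base = {"layer.weight": [1.0, 2.0], "embed.weight": [0.0, 0.0]}
tuned = {"layer.weight": [1.5, 1.0], "embed.weight": [4.0, 4.0]}
target = {"layer.weight": [10.0, 20.0], "embed.weight": [7.0, 8.0]}

merged = apply_chat_vector(base, tuned, target, skip=["embed.weight"])
print(merged)
```

In a real merge the same arithmetic would run over tensors of matching shape. The `skip` argument stands in for layers, such as embeddings, that one might prefer to leave untouched when vocabularies differ.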

Model Details

Model Description

Direct Use

Cat walking on the frozen Han River, Seoul

import requests
from PIL import Image

import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "beomi/Llama-3-KoEn-8B-xtuner-llava-preview"

model = LlavaForConditionalGeneration.from_pretrained(
    model_id, 
    torch_dtype='auto', 
    device_map='auto',
)

processor = AutoProcessor.from_pretrained(model_id)

from transformers import AutoTokenizer

# Load the tokenizer from the same Hub repo as the model
tokenizer = AutoTokenizer.from_pretrained(model_id)
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

# Korean prompt: "Please describe this image."; the assistant turn is pre-filled with "In this image,"
prompt = ("<|start_header_id|>user<|end_header_id|>\n\n<image>\n이 이미지에 대해서 설명해주세요.<|eot_id|>"
          "<|start_header_id|>assistant<|end_header_id|>\n\n이 이미지에는")
image_file = "https://cdn-uploads.huggingface.co/production/uploads/5e56829137cb5b49818287ea/NWfoArWI4UPAxpEnolkwT.jpeg"

raw_image = Image.open(requests.get(image_file, stream=True).raw)
# Pass text and image by keyword, and match the model's device and dtype
inputs = processor(text=prompt, images=raw_image, return_tensors='pt').to(model.device, model.dtype)

output = model.generate(**inputs, max_new_tokens=400, do_sample=True, eos_token_id=terminators,)
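# Drop the first two tokens (<|begin_of_text|> and <|start_header_id|>); this is why the example output starts at "user"
print(processor.decode(output[0][2:], skip_special_tokens=False))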

# --- Example Output ---
user<|end_header_id|>

<image>
이 이미지에 대해서 설명해주세요.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

이 이미지에는 고양이 한 마리가 강물 위를 걸어가는 모습이 보여집니다. 고양이는 강물의 잔물결에 미끄럼을 타고 강 가로를 지나는 데 능숙하게 보입니다. 고양이의 발은 강물로 잘 들어가, 그것을 즐기며 걸어갑니다.

또한 이 이미지도 음성 녹음을 하거나 녹화된 자료로 제작되었으며, 주로 고양이의 모습을 강하게 보여줍니다. 소리 효과도 여러 가지로 추가하여 고양이의 스토리를 다양하게 전달합니다. 강물은 잔물결을 나타내며 강물 위를 걷는 고양이의 모습을 더욱 강렬하게 강조하기 위해 잔물결을 통해 더 디테일한 장면을 보여줍니다.<|eot_id|>

# --- English translation of the example output (for reference) ---
# User: Please describe this image.
# Assistant: In this image, a cat is seen walking across the river water. The cat appears adept at
# sliding over the river's ripples and passing across the river. The cat's paws go well into the
# river water, and it walks along enjoying it.
#
# Also, this image was produced from voice recordings or recorded footage, and it mainly shows the
# cat prominently. Various sound effects are also added to convey the cat's story in diverse ways.
# The river shows ripples, and the ripples give a more detailed scene in order to emphasize more
# intensely the cat walking on the river.
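
The prompt in the snippet is assembled by hand, and the decoded text echoes the prompt before the model's reply. As a small convenience sketch, the two helpers below build a prompt in the same layout and cut the assistant's reply out of a decoded string. The names `build_llava_llama3_prompt` and `extract_assistant_reply` are hypothetical and not part of transformers; the template is copied from the prompt string used above.

```python
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>\n\n"
END_OF_TURN = "<|eot_id|>"


def build_llava_llama3_prompt(user_text: str, assistant_prefix: str = "") -> str:
    """Build a single-turn prompt with one <image> placeholder.

    `assistant_prefix` optionally pre-fills the start of the reply,
    as the snippet above does with "이 이미지에는".
    """
    return (
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"<image>\n{user_text}{END_OF_TURN}"
        f"{ASSISTANT_HEADER}{assistant_prefix}"
    )


def extract_assistant_reply(decoded: str) -> str:
    """Return the text after the last assistant header, up to <|eot_id|>."""
    reply = decoded.rsplit(ASSISTANT_HEADER, 1)[-1]
    return reply.split(END_OF_TURN, 1)[0].strip()


# Rebuild the prompt used in the snippet above.
prompt = build_llava_llama3_prompt("이 이미지에 대해서 설명해주세요.", "이 이미지에는")

# Toy decoded string in the same layout as the example output.
decoded = (
    "user<|end_header_id|>\n\n<image>\nhi<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\nHello there.<|eot_id|>"
)
print(extract_assistant_reply(decoded))
```

Because the reply is located by the last assistant header, any pre-filled prefix stays part of the extracted text.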