---
language:
- ko
- en
license: cc-by-nc-sa-4.0
library_name: transformers
---

# Llama-3-KoEn-8B-xtuner-llava-preview 🌋

Llama-3-KoEn-8B-xtuner-llava-preview 🌋 is a Korean multimodal model based on the LLaVA architecture. It was merged with the [ChatVector](https://arxiv.org/abs/2310.04799) method from two models:

1) [beomi/Llama-3-KoEn-8B-preview](https://huggingface.co./beomi/Llama-3-KoEn-8B-preview)
2) [xtuner/llava-llama-3-8b-transformers](https://huggingface.co./xtuner/llava-llama-3-8b-transformers)

## Model Details

### Model Description

- **Developed by:** Junbum Lee (Beomi)
- **Model type:** HuggingFace Llava 🌋
- **Language(s) (NLP):** Korean, English
- **License:** cc-by-nc-sa-4.0 under Llama3 License
- **Merged from model:** [beomi/Llama-3-KoEn-8B-preview](https://huggingface.co./beomi/Llama-3-KoEn-8B-preview) & [xtuner/llava-llama-3-8b-transformers](https://huggingface.co./xtuner/llava-llama-3-8b-transformers)

### Direct Use

![Cat walking on frozen Han-River, Seoul](https://cdn-uploads.huggingface.co/production/uploads/5e56829137cb5b49818287ea/NWfoArWI4UPAxpEnolkwT.jpeg)

> Recommended revisions
>
> v1. `revision='a38aac3'`: Basic ChatVector, with the [25B+ trained KoEn checkpoint (rev. d4d25a2)](https://huggingface.co./beomi/Llama-3-KoEn-8B-preview/commit/d4d25a2).
>
> v1-1. `revision='0224971'`: Basic ChatVector, with the [40B+ trained KoEn checkpoint (rev. ad39b32)](https://huggingface.co./beomi/Llama-3-KoEn-8B-preview/commit/ad39b32cd4207f37f61f16e79d3f4020c5b744ef).
>
> v2. `revision='4f04d1e'`: Model diff based merging (ref. https://huggingface.co./blog/maywell/llm-feature-transfer), with the [25B+ trained KoEn checkpoint (rev. d4d25a2)](https://huggingface.co./beomi/Llama-3-KoEn-8B-preview/commit/d4d25a2).
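The Chat Vector idea referenced above can be illustrated with a toy example: take the weight difference between a fine-tuned model and its base, then add that difference to a continually pre-trained model. The sketch below uses plain Python lists in place of real tensors. The function name `chat_vector_merge`, the parameter names, and the `skip` option (for layers such as embeddings, which are sometimes left untouched) are illustrative assumptions, not the script used to build this model.

```python
def chat_vector_merge(base, tuned, target, skip=()):
    """Return target + (tuned - base) for every parameter shared by all three models.

    Each argument maps a parameter name to a flat list of floats (a stand-in
    for a real weight tensor). Names listed in `skip` are copied from `target`
    unchanged.
    """
    merged = {}
    for name, target_w in target.items():
        if name in skip or name not in base or name not in tuned:
            # Nothing to add for this parameter: keep the target weights.
            merged[name] = list(target_w)
            continue
        # Chat vector for this parameter: fine-tuned weights minus base weights.
        merged[name] = [
            t + (f - b) for t, f, b in zip(target_w, tuned[name], base[name])
        ]
    return merged


# Toy weights (hypothetical values, chosen only to make the arithmetic visible).
base = {"layer.weight": [1.0, 2.0], "embed.weight": [0.5, 0.5]}
tuned = {"layer.weight": [1.5, 1.0], "embed.weight": [0.75, 0.5]}
target = {"layer.weight": [2.0, 2.0], "embed.weight": [1.0, 1.0]}

merged = chat_vector_merge(base, tuned, target, skip=("embed.weight",))
print(merged)
```

With real checkpoints the same arithmetic would be applied tensor by tensor over the language-model weights, which is why the three models need matching parameter names and shapes.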
```python
import requests
from PIL import Image

import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "beomi/Llama-3-KoEn-8B-xtuner-llava-preview"

model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype='auto',
    device_map='auto',
    revision='a38aac3',  # 'a38aac3' for basic ChatVector, '4f04d1e' for Model diff based merging (ref. https://huggingface.co./blog/maywell/llm-feature-transfer)
)
processor = AutoProcessor.from_pretrained(model_id)

# Stop generation at either the tokenizer's EOS token or the Llama 3 end-of-turn token.
tokenizer = processor.tokenizer
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

# Korean prompt: "Please describe this image." The assistant turn is pre-filled
# with "이 이미지에는" ("In this image, ..."). The <image> placeholder marks where
# the image features are inserted.
prompt = ("<|start_header_id|>user<|end_header_id|>\n\n<image>\n이 이미지에 대해서 설명해주세요.<|eot_id|>"
          "<|start_header_id|>assistant<|end_header_id|>\n\n이 이미지에는")
image_file = "https://cdn-uploads.huggingface.co/production/uploads/5e56829137cb5b49818287ea/NWfoArWI4UPAxpEnolkwT.jpeg"

raw_image = Image.open(requests.get(image_file, stream=True).raw)
# Keyword arguments avoid depending on the processor's positional argument order.
inputs = processor(text=prompt, images=raw_image, return_tensors='pt').to(model.device, torch.float16)

output = model.generate(**inputs, max_new_tokens=400, do_sample=True, eos_token_id=terminators,)
# Skip the first two tokens so the printed text starts at "user<|end_header_id|>".
print(processor.decode(output[0][2:], skip_special_tokens=False))

# --- Example Output [v1, Chat Vector] ---
# user<|end_header_id|>
#
# 이 이미지에 대해서 설명해주세요.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
#
# 이 이미지에는 고양이 한 마리가 강물 위를 걸어가는 모습이 보여집니다. 고양이는 강물의 잔물결에 미끄럼을 타고 강 가로를 지나는 데 능숙하게 보입니다. 고양이의 발은 강물로 잘 들어가, 그것을 즐기며 걸어갑니다.
#
# 또한 이 이미지도 음성 녹음을 하거나 녹화된 자료로 제작되었으며, 주로 고양이의 모습을 강하게 보여줍니다. 소리 효과도 여러 가지로 추가하여 고양이의 스토리를 다양하게 전달합니다. 강물은 잔물결을 나타내며 강물 위를 걷는 고양이의 모습을 더욱 강렬하게 강조하기 위해 잔물결을 통해 더 디테일한 장면을 보여줍니다.<|eot_id|>
#
# (English translation) This image shows a cat walking on the river water. The cat
# appears skilled at sliding on the river's ripples and crossing the river. The cat's
# paws go well into the river water, and it walks along enjoying it.
# In addition, this image was produced as an audio recording or recorded material, and
# it mainly shows the cat strongly. Various sound effects are also added to convey the
# cat's story in diverse ways. The river shows ripples, and a more detailed scene is
# shown through the ripples to emphasize the cat walking on the river more intensely.

# --- Example Output [v1-1, Chat Vector] ---
# user<|end_header_id|>
#
# 이 이미지에 대해서 설명해주세요.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
#
# 이 이미지에서는 한 고양이가 서해안에 위치한 바다를 걷고 있는 모습을 볼 수 있습니다. 고양이는 해변에서부터 바다로 걸어들어가는 중이며, 주변에는 잔잔한 파도가 밀려오는 모습을 보여주고 있습니다. 이 고양이는 태어날 때부터 고양이와 강아지와는 다르게 바다를 경험하고, 적응해가고 있습니다. 고양이는 바다를 좋아하고, 이 환경에서 행복을 느끼는 것 같습니다. 이 고양이는 인간이 아닌 자연의 일부로써 이 환경에서 살아가고 있습니다.<|eot_id|>
#
# (English translation) In this image, you can see a cat walking in the sea on the west
# coast. The cat is walking from the beach into the sea, and gentle waves are rolling in
# around it. Unlike cats and dogs, this cat has been experiencing and adapting to the sea
# since birth. The cat seems to like the sea and to feel happy in this environment. This
# cat lives in this environment as a part of nature, not as a human.

# --- Example Output [v2, Model diff based merging] ---
# user<|end_header_id|>
#
# 이 이미지에 대해서 설명해주세요.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
#
# 이 이미지에는 한국어 자막과 함께 고양이가 물에 발을 디디고 걷는 모습이 담겨 있습니다. 고양이는 오른쪽 발을 물에 담그고 걷는 중이며, 한국어 자막은 "고양이는 물을 좋아합니다"라는 문장을 포함하고 있습니다. 이 자막은 고양이가 물을 좋아하는 것을 강조하고 있습니다.<|eot_id|>
#
# (English translation) This image shows a cat stepping into the water and walking, along
# with Korean subtitles. The cat is walking with its right paw dipped in the water, and
# the Korean subtitle contains the sentence "Cats like water." This subtitle emphasizes
# that the cat likes water.
```
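Because the snippet above decodes the whole sequence with `skip_special_tokens=False`, the printed text still contains the prompt and the Llama 3 header tokens. A small post-processing helper along the following lines could isolate the assistant's reply. `extract_assistant_reply` is a hypothetical name, and the sample transcript is made up for illustration rather than taken from the model.

```python
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>"
EOT = "<|eot_id|>"


def extract_assistant_reply(decoded: str) -> str:
    """Return the text of the last assistant turn in a decoded Llama 3 transcript."""
    # Everything after the last assistant header belongs to the reply.
    _, found, reply = decoded.rpartition(ASSISTANT_HEADER)
    if not found:
        # No assistant header present: return the input with whitespace trimmed.
        return decoded.strip()
    # Drop the end-of-turn marker and anything generated after it.
    reply = reply.split(EOT, 1)[0]
    return reply.strip()


# Made-up transcript in the same layout as the example outputs.
sample = (
    "user<|end_header_id|>\n\nDescribe this image.<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
    "A cat is walking on a frozen river.<|eot_id|>"
)
print(extract_assistant_reply(sample))
```

Since the prompt pre-fills the assistant turn, the extracted reply includes that pre-filled opening followed by the generated continuation.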