---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
datasets:
- maywell/ko_Ultrafeedback_binarized
base_model:
- yanolja/EEVE-Korean-Instruct-10.8B-v1.0
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f22e4076fedc4fd11e978f/MoTedec_ZL8GM2MmGyAPs.png)

# T3Q-LLM-sft1.0-dpo1.0

This model is a version of T3Q-LLM/T3Q-LLM-solar10.8-sft-v1.0 that has been fine-tuned with DPO on the maywell/ko_Ultrafeedback_binarized dataset.

## Model Developers

Chihoon Lee (chihoonlee10), T3Q

## Prompt Template

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: {prompt}
Assistant:
```

## How to Use It

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0")
tokenizer = AutoTokenizer.from_pretrained("T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0")

# Wrap the user prompt in the chat template the model was trained with.
prompt_template = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nHuman: {prompt}\nAssistant:\n"

# "What is the capital of Korea? Please choose from the options below.
#  (A) Gyeongseong (B) Busan (C) Pyongyang (D) Seoul (E) Jeonju"
text = '한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.\n\n(A) 경성\n(B) 부산\n(C) 평양\n(D) 서울\n(E) 전주'

model_inputs = tokenizer(prompt_template.format(prompt=text), return_tensors='pt')
outputs = model.generate(**model_inputs, max_new_tokens=256)
output_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(output_text)
```

### Example Output

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: 한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.

(A) 경성
(B) 부산
(C) 평양
(D) 서울
(E) 전주
Assistant:
(D) 서울이 한국의 수도입니다. 서울은 나라의 북동부에 위치해 있으며, 정치, 경제, 문화의 중심지입니다. 약 1,000만 명이 넘는 인구를 가진 세계에서 가장 큰 도시 중 하나입니다. 서울은 높은 빌딩, 현대적인 인프라, 활기 문화 장면으로 유명합니다. 또한, 많은 역사적 명소와 박물관이 있어 방문객들에게 풍부한 문화 체험을 제공합니다.
```

(English gloss: the assistant picks "(D) Seoul is the capital of Korea" and follows with a short description of the city as the country's political, economic, and cultural center.)

## Evaluation

KoBEST benchmark results for this model:

| Task           |Version| Metric |Value |   |Stderr|
|----------------|------:|--------|-----:|---|-----:|
|kobest_boolq    |      0|acc     |0.9387|±  |0.0064|
|                |       |macro_f1|0.9387|±  |0.0064|
|kobest_copa     |      0|acc     |0.7590|±  |0.0135|
|                |       |macro_f1|0.7585|±  |0.0135|
|kobest_hellaswag|      0|acc     |0.5080|±  |0.0224|
|                |       |acc_norm|0.5580|±  |0.0222|
|                |       |macro_f1|0.5049|±  |0.0224|
|kobest_sentineg |      0|acc     |0.8489|±  |0.0180|
|                |       |macro_f1|0.8483|±  |0.0180|

For comparison, nlpai-lab/KULLM3 evaluated under the following harness configuration:

`hf-causal-experimental (pretrained=nlpai-lab/KULLM3,use_accelerate=true,trust_remote_code=true), limit: None, provide_description: False, num_fewshot: 0, batch_size: 8`

| Task           |Version| Metric |Value |   |Stderr|
|----------------|------:|--------|-----:|---|-----:|
|kobest_boolq    |      0|acc     |0.8896|±  |0.0084|
|                |       |macro_f1|0.8888|±  |0.0084|
|kobest_copa     |      0|acc     |0.6930|±  |0.0146|
|                |       |macro_f1|0.6925|±  |0.0147|
|kobest_hellaswag|      0|acc     |0.4640|±  |0.0223|
|                |       |acc_norm|0.5240|±  |0.0224|
|                |       |macro_f1|0.4612|±  |0.0223|
|kobest_sentineg |      0|acc     |0.6297|±  |0.0243|
|                |       |macro_f1|0.6255|±  |0.0244|
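The configuration string above matches EleutherAI's lm-evaluation-harness (the 0.3.x-era `hf-causal-experimental` model type). Below is a minimal sketch of how a comparable run could be launched, assuming that harness version is installed; the exact command used for this card's own numbers is not given, so the model id and arguments here are an assumption, not the authors' recorded setup.

```python
# Hypothetical reproduction sketch using EleutherAI's lm-evaluation-harness
# (0.3.x API). The arguments mirror the configuration string above; the
# pretrained model id for this card's own table is an assumption.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal-experimental",
    model_args="pretrained=T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0,use_accelerate=True,trust_remote_code=True",
    tasks=["kobest_boolq", "kobest_copa", "kobest_hellaswag", "kobest_sentineg"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])  # per-task acc / macro_f1 with stderr
```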
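## Streaming Generation

For interactive use, the `generate` call from the usage example can print tokens as they are produced via transformers' built-in `TextStreamer`. This is a minimal sketch under that assumption; it is not a setup recommended by the model authors, and the example prompt is illustrative.

```python
# Minimal streaming sketch: tokens are printed to stdout as they are generated.
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model = AutoModelForCausalLM.from_pretrained("T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0")
tokenizer = AutoTokenizer.from_pretrained("T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0")
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt_template = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nHuman: {prompt}\nAssistant:\n"

# "Tell me about Seoul." (illustrative prompt)
inputs = tokenizer(prompt_template.format(prompt="서울에 대해 알려주세요."), return_tensors="pt")
model.generate(**inputs, max_new_tokens=256, streamer=streamer)
```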