T3Q-LLM-sft1.0-dpo1.0
This model is a version of T3Q-LLM/T3Q-LLM-solar10.8-sft-v1.0 that has been fine-tuned with DPO.
Model Developers: Chihoon Lee (chihoonlee10), T3Q
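The DPO training recipe itself is not published on this card. As a rough illustration only, a preference-tuning step like this is commonly set up with the trl library's DPOTrainer; the dataset name, hyperparameters, and exact trl keyword arguments below are assumptions for the sketch, not the authors' actual configuration.

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Start from the SFT checkpoint that this model is derived from.
base = "T3Q-LLM/T3Q-LLM-solar10.8-sft-v1.0"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Hypothetical preference dataset with "prompt", "chosen", "rejected" columns;
# the data actually used for this model is not stated on the card.
train_dataset = load_dataset("your-org/your-preference-data", split="train")

training_args = DPOConfig(
    output_dir="t3q-llm-dpo",
    beta=0.1,                       # strength of the DPO preference loss (illustrative value)
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,                    # a frozen reference copy is created automatically
    args=training_args,
    train_dataset=train_dataset,
    processing_class=tokenizer,     # older trl releases take tokenizer= instead
)
trainer.train()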
Prompt Template
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: {prompt}
Assistant:
How to Use it
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0")
tokenizer = AutoTokenizer.from_pretrained("T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0")

# Prompt format shown above.
prompt_template = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nHuman: {prompt}\nAssistant:\n"

# "What is the capital of Korea? Please choose from the options below.
#  (A) Gyeongseong (B) Busan (C) Pyongyang (D) Seoul (E) Jeonju"
text = '한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.\n\n(A) 경성\n(B) 부산\n(C) 평양\n(D) 서울\n(E) 전주'

model_inputs = tokenizer(prompt_template.format(prompt=text), return_tensors='pt')
outputs = model.generate(**model_inputs, max_new_tokens=256)
output_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(output_text)
Example Output
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: 한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.
(A) 경성
(B) 부산
(C) 평양
(D) 서울
(E) 전주
Assistant:
(D) 서울이 한국의 수도입니다. 서울은 나라의 북동부에 위치해 있으며, 정치, 경제, 문화의 중심지입니다. 약 1,000만 명이 넘는 인구를 가진 세계에서 가장 큰 도시 중 하나입니다. 서울은 높은 빌딩, 현대적인 인프라, 활기 문화 장면으로 유명합니다. 또한, 많은 역사적 명소와 박물관이 있어 방문객들에게 풍부한 문화 체험을 제공합니다.

(English translation of the assistant's answer: "(D) Seoul is the capital of Korea. Seoul is located in the northeastern part of the country and is the center of politics, economy, and culture. With a population of over 10 million, it is one of the largest cities in the world. Seoul is known for its tall buildings, modern infrastructure, and lively cultural scene. It also has many historical landmarks and museums, offering visitors a rich cultural experience.")
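The usage example above loads the model on the CPU in full precision. For a 10.8B-parameter model, loading in half precision on a GPU is usually more practical; the sketch below uses standard transformers options (torch_dtype, device_map) that are not part of the original example.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision roughly halves memory use
    device_map="auto",          # requires accelerate; places layers on available GPUs
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
    "Human: 한국의 수도는 어디인가요?\nAssistant:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])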
Evaluation results for T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0 on the KoBEST benchmark:

| Task             | Version | Metric   | Value  |   | Stderr |
|------------------|---------|----------|--------|---|--------|
| kobest_boolq     | 0       | acc      | 0.9387 | ± | 0.0064 |
|                  |         | macro_f1 | 0.9387 | ± | 0.0064 |
| kobest_copa      | 0       | acc      | 0.7590 | ± | 0.0135 |
|                  |         | macro_f1 | 0.7585 | ± | 0.0135 |
| kobest_hellaswag | 0       | acc      | 0.5080 | ± | 0.0224 |
|                  |         | acc_norm | 0.5580 | ± | 0.0222 |
|                  |         | macro_f1 | 0.5049 | ± | 0.0224 |
| kobest_sentineg  | 0       | acc      | 0.8489 | ± | 0.0180 |
|                  |         | macro_f1 | 0.8483 | ± | 0.0180 |
For comparison, results for the baseline nlpai-lab/KULLM3 under these harness settings: hf-causal-experimental (pretrained=nlpai-lab/KULLM3, use_accelerate=true, trust_remote_code=true), limit: None, provide_description: False, num_fewshot: 0, batch_size: 8
| Task             | Version | Metric   | Value  |   | Stderr |
|------------------|---------|----------|--------|---|--------|
| kobest_boolq     | 0       | acc      | 0.8896 | ± | 0.0084 |
|                  |         | macro_f1 | 0.8888 | ± | 0.0084 |
| kobest_copa      | 0       | acc      | 0.6930 | ± | 0.0146 |
|                  |         | macro_f1 | 0.6925 | ± | 0.0147 |
| kobest_hellaswag | 0       | acc      | 0.4640 | ± | 0.0223 |
|                  |         | acc_norm | 0.5240 | ± | 0.0224 |
|                  |         | macro_f1 | 0.4612 | ± | 0.0223 |
| kobest_sentineg  | 0       | acc      | 0.6297 | ± | 0.0243 |
|                  |         | macro_f1 | 0.6255 | ± | 0.0244 |
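The settings string above indicates the scores were produced with EleutherAI's lm-evaluation-harness. A minimal sketch of reproducing such a run through the harness's Python API is shown below; it assumes the 0.3.x harness (which provides the hf-causal-experimental model type), and the authors' exact invocation is not given on the card.

from lm_eval import evaluator

# Assumed invocation mirroring the settings listed above; switch pretrained=
# to "nlpai-lab/KULLM3" to reproduce the baseline table instead.
results = evaluator.simple_evaluate(
    model="hf-causal-experimental",
    model_args="pretrained=T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0,use_accelerate=true,trust_remote_code=true",
    tasks=["kobest_boolq", "kobest_copa", "kobest_hellaswag", "kobest_sentineg"],
    num_fewshot=0,
    batch_size=8,
)
print(evaluator.make_table(results))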