Gemma-2 RAG Generator Model README


์†Œ๊ฐœ

์ด ๋ชจ๋ธ์€ RAG(Retrieval-Augmented Generation) ์‹œ์Šคํ…œ์˜ Generator ๋ถ€๋ถ„์„ ์ตœ์ ํ™”ํ•œ ํ•œ๊ตญ์–ด ๊ธฐ๋ฐ˜ 2B ๋ชจ๋ธ, Gemma-2์ž…๋‹ˆ๋‹ค.
Gemma-2๋Š” ์™ธ๋ถ€ ๊ฒ€์ƒ‰์—์„œ ๋ฐ˜ํ™˜๋œ ์ •๋ณด๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ์ž์—ฐ์Šค๋Ÿฌ์šด ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•˜๋ฉฐ, ํ•„์š”ํ•œ ๊ฒฝ์šฐ ๊ฐ์ฃผ๋ฅผ ๋ถ™์—ฌ ์ถœ์ฒ˜๋ฅผ ๋ช…์‹œํ•ฉ๋‹ˆ๋‹ค.
ํŠนํžˆ, ๋Œ€ํ™” ๊ธฐ๋ก, ์ปจํ…์ŠคํŠธ ์ •๋ณด(๋ฌธ์„œ ์ œ๋ชฉ, ๋žญํฌ, ๋ณธ๋ฌธ ์ฒญํฌ), ๊ทธ๋ฆฌ๊ณ  ์‚ฌ์šฉ์ž ์ž…๋ ฅ์„ ์กฐํ•ฉํ•˜์—ฌ ๋†’์€ ํ’ˆ์งˆ์˜ ์‘๋‹ต์„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.


์ฃผ์š” ํŠน์ง•

  1. Optimized as a RAG generator: generates answers grounded in retrieval results and uses that information selectively to improve reliability.
  2. Footnote handling: marks information drawn from the context with footnote-style indices so that sources are clearly identifiable (a small parsing sketch follows this list).
  3. Korean optimization: a model specialized for Korean data, offering the best performance for Korean-speaking users.

์‚ฌ์šฉ๋ฒ•

1. ํ”„๋กฌํ”„ํŠธ ํ…œํ”Œ๋ฆฟ

Gemma-2 ๋ชจ๋ธ์€ ๋‹ค์Œ๊ณผ ๊ฐ™์€ ํ”„๋กฌํ”„ํŠธ ๊ตฌ์กฐ๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

### Instruction:
{system prompt}

### Conversation:
User: {user message}
Assistant: {assistant response}

### Context:
# Index [rank]: {document title}
{document chunk}

# Index [rank]: {document title}
{document chunk}

...

### Input:
{user input}

### Response:
{model response}

2. ๊ตฌ์„ฑ ์š”์†Œ ์„ค๋ช…

- Instruction

  • Gemma-2๊ฐ€ ๋”ฐ๋ฅด๋Š” ์‹œ์Šคํ…œ ์ง€์นจ์ž…๋‹ˆ๋‹ค. ๋ชจ๋ธ์— ๋‚ด์žฅ๋œ ๊ธฐ๋ณธ ์‹œ์Šคํ…œ ํ”„๋กฌํ”„ํŠธ๋ฅผ ์‚ฌ์šฉํ•˜๋ฉฐ, ํ•„์š”์— ๋”ฐ๋ผ ์ปค์Šคํ„ฐ๋งˆ์ด์ง• ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.
  • ์˜ˆ:
    ๋‹น์‹ ์€ ์™ธ๋ถ€๊ฒ€์ƒ‰์„ ์ด์šฉํ•˜์—ฌ ์‚ฌ์šฉ์ž์—๊ฒŒ ๋„์›€์„ ์ฃผ๋Š” ์ธ๊ณต์ง€๋Šฅ ์กฐ์ˆ˜์ž…๋‹ˆ๋‹ค.
    

- Conversation

  • ์‚ฌ์šฉ์ž์™€ ์–ด์‹œ์Šคํ„ดํŠธ ๊ฐ„์˜ ๋Œ€ํ™” ๊ธฐ๋ก์ž…๋‹ˆ๋‹ค. ์•„๋ž˜ ํ˜•์‹์„ ๋”ฐ๋ฆ…๋‹ˆ๋‹ค:

    • ์‚ฌ์šฉ์ž: **User:**๋กœ ์‹œ์ž‘
    • ์–ด์‹œ์Šคํ„ดํŠธ: **Assistant:**๋กœ ์‹œ์ž‘
  • ์˜ˆ:

    User: ์„œ์šธ์˜ ๋‚ ์”จ๋Š” ์–ด๋•Œ์š”?
    Assistant: ์˜ค๋Š˜ ์„œ์šธ์€ ๋ง‘๊ณ  ํ™”์ฐฝํ•ฉ๋‹ˆ๋‹ค.
    

- Context

  • ์™ธ๋ถ€ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ์˜ ๋žญํฌ, ๋ฌธ์„œ ์ œ๋ชฉ, ๋ณธ๋ฌธ ์ฒญํฌ๋กœ ๊ตฌ์„ฑ๋ฉ๋‹ˆ๋‹ค.

  • ์ˆœ์„œ:

    1. ๋žญํฌ: # Index [๋žญํฌ]
    2. ๋ฌธ์„œ ์ œ๋ชฉ: {๋ฌธ์„œ ์ œ๋ชฉ}
    3. ๋ณธ๋ฌธ ์ฒญํฌ: {๋ฌธ์„œ ์ฒญํฌ}
  • ์˜ˆ:

    # Index [1]: ์„œ์šธ์˜ ๋‚ ์”จ
    ์„œ์šธ์€ ์˜ค๋Š˜ ๋ง‘๊ณ  ํ™”์ฐฝํ•˜๋ฉฐ ๋‚ฎ ์ตœ๊ณ  ๊ธฐ์˜จ์€ 25๋„์ž…๋‹ˆ๋‹ค.
    
    # Index [2]: ๊ธฐ์ƒ์ฒญ ์ •๋ณด
    ์ด๋ฒˆ ์ฃผ ์„œ์šธ์€ ์ „๋ฐ˜์ ์œผ๋กœ ๋ง‘์€ ๋‚ ์”จ๊ฐ€ ์ด์–ด์งˆ ์˜ˆ์ •์ž…๋‹ˆ๋‹ค.
    

- Input

  • ์‚ฌ์šฉ์ž๊ฐ€ ์ถ”๊ฐ€๋กœ ์ž…๋ ฅํ•œ ์งˆ๋ฌธ์ด๋‚˜ ์š”์ฒญ์„ ํฌํ•จํ•ฉ๋‹ˆ๋‹ค.

- Response

  • ๋ชจ๋ธ์ด ์ƒ์„ฑํ•œ ๋‹ต๋ณ€์ด ๋“ค์–ด๊ฐ€๋Š” ๋ถ€๋ถ„์ž…๋‹ˆ๋‹ค.

3. ์˜ˆ์‹œ

์ž…๋ ฅ ์˜ˆ์‹œ

### Instruction:
๋‹น์‹ ์€ ์™ธ๋ถ€๊ฒ€์ƒ‰์„ ์ด์šฉํ•˜์—ฌ ์‚ฌ์šฉ์ž์—๊ฒŒ ๋„์›€์„ ์ฃผ๋Š” ์ธ๊ณต์ง€๋Šฅ ์กฐ์ˆ˜์ž…๋‹ˆ๋‹ค.

- Context๋Š” ์™ธ๋ถ€๊ฒ€์ƒ‰์„ ํ†ตํ•ด ๋ฐ˜ํ™˜๋œ ์‚ฌ์šฉ์ž ์š”์ฒญ๊ณผ ๊ด€๋ จ๋œ ์ •๋ณด๋“ค์ž…๋‹ˆ๋‹ค. 
- Context๋ฅผ ํ™œ์šฉํ•  ๋•Œ ๋ฌธ์žฅ ๋์— ์‚ฌ์šฉํ•œ ๋ฌธ์„œ ์กฐ๊ฐ์˜ [Index]๋ฅผ ๋ถ™์ด๊ณ  ์ž์—ฐ์Šค๋Ÿฌ์šด ๋‹ต๋ณ€์„ ์ž‘์„ฑํ•˜์„ธ์š”. (e.g. [1])
- Context์˜ ์ •๋ณด๊ฐ€ ์‚ฌ์šฉ์ž ์š”์ฒญ๊ณผ ๊ด€๋ จ์ด ์—†๊ฑฐ๋‚˜ ๋„์›€์ด ์•ˆ๋ ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค. ๊ด€๋ จ์žˆ๋Š” ์ •๋ณด๋งŒ ํ™œ์šฉํ•˜๊ณ , ์—†๋Š” ์ •๋ณด๋ฅผ ์ ˆ๋Œ€ ์ง€์–ด๋‚ด์ง€ ๋งˆ์„ธ์š”.
- ๋˜๋„๋ก์ด๋ฉด ์ผ๋ฐ˜ ์ง€์‹์œผ๋กœ ๋‹ต๋ณ€ํ•˜์ง€๋ง๊ณ , ์ตœ๋Œ€ํ•œ Context๋ฅผ ํ†ตํ•ด์„œ ๋‹ต๋ณ€์„ ํ•˜๋ ค๊ณ  ํ•˜์„ธ์š”. Context์— ์—†์„ ๊ฒฝ์šฐ์—๋Š” ์ด ์ ์„ ์–ธ๊ธ‰ํ•˜๋ฉฐ ์‚ฌ์ฃ„ํ•˜๊ณ  ๋‹ค๋ฅธ ์ฃผ์ œ๋‚˜ ์งˆ๋ฌธ์„ ์ถ”์ฒœํ•ด์ฃผ์„ธ์š”. 
- ์‚ฌ์šฉ์ž ์š”์ฒญ์— ์•Œ๋งž๋Š” ์ž์—ฐ์Šค๋Ÿฌ์šด ๋Œ€ํ™”๋ฅผ ํ•˜์„ธ์š”.
- ํ•ญ์ƒ ์กด๋Œ“๋ง๋กœ ๋‹ต๋ณ€ํ•˜์„ธ์š”.

### Conversation:
User: ์„œ์šธ์˜ ๋‚ ์”จ๋Š” ์–ด๋•Œ์š”?
Assistant: ์˜ค๋Š˜ ์„œ์šธ์€ ๋ง‘๊ณ  ํ™”์ฐฝํ•ฉ๋‹ˆ๋‹ค.

### Context:
# Index [1]: ์„œ์šธ์˜ ๋‚ ์”จ
์„œ์šธ์€ ์˜ค๋Š˜ ๋ง‘๊ณ  ํ™”์ฐฝํ•˜๋ฉฐ ๋‚ฎ ์ตœ๊ณ  ๊ธฐ์˜จ์€ 25๋„์ž…๋‹ˆ๋‹ค.

# Index [2]: ๊ธฐ์ƒ์ฒญ ์ •๋ณด
์ด๋ฒˆ ์ฃผ ์„œ์šธ์€ ์ „๋ฐ˜์ ์œผ๋กœ ๋ง‘์€ ๋‚ ์”จ๊ฐ€ ์ด์–ด์งˆ ์˜ˆ์ •์ž…๋‹ˆ๋‹ค.

### Input:
์ด๋ฒˆ ์ฃผ๋ง์—๋Š” ๋น„๊ฐ€ ์˜ฌ ๊ฐ€๋Šฅ์„ฑ์ด ์žˆ๋‚˜์š”?

### Response:
๊ธฐ์ƒ์ฒญ ์ •๋ณด์— ๋”ฐ๋ฅด๋ฉด, ์ด๋ฒˆ ์ฃผ ์„œ์šธ์€ ์ „๋ฐ˜์ ์œผ๋กœ ๋ง‘์€ ๋‚ ์”จ๊ฐ€ ์ด์–ด์งˆ ์˜ˆ์ •์ž…๋‹ˆ๋‹ค. ์ฃผ๋ง์—๋„ ๋น„๊ฐ€ ์˜ฌ ๊ฐ€๋Šฅ์„ฑ์€ ๋‚ฎ์Šต๋‹ˆ๋‹ค. [2]

4. ์ปจํ…์ŠคํŠธ ์ถ”๊ฐ€ ๊ฐ€์ด๋“œ

  1. ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ๋ฅผ ์ˆ˜์ง‘ํ•˜์—ฌ Context์— ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค.
  2. ๋žญํฌ๋Š” ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ์˜ ์šฐ์„ ์ˆœ์œ„๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค.
  3. ๋ฌธ์„œ ์ œ๋ชฉ๊ณผ ๋ณธ๋ฌธ ์ฒญํฌ๋Š” ๊ฐ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ์˜ ์ •๋ณด๋ฅผ ์••์ถ•ํ•˜์—ฌ ๋„ฃ์Šต๋‹ˆ๋‹ค.
  4. ์—ฌ๋Ÿฌ ๊ฐœ์˜ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ๋ฅผ ํฌํ•จํ•˜๋ ค๋ฉด ์œ„ ์˜ˆ์‹œ์ฒ˜๋Ÿผ ์—ฌ๋Ÿฌ # Index ์„น์…˜์„ ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค.

์ถ”๋ก  ๋ฐฉ๋ฒ•

  1. Hugging Face์—์„œ ๋ชจ๋ธ์„ ๋กœ๋“œํ•ฉ๋‹ˆ๋‹ค:

    from transformers import AutoTokenizer, AutoModelForCausalLM

    # Download the tokenizer and model weights from the Hugging Face Hub
    tokenizer = AutoTokenizer.from_pretrained("Austin9/gemma-2-2b-it-ko-generator-v2.5")
    model = AutoModelForCausalLM.from_pretrained("Austin9/gemma-2-2b-it-ko-generator-v2.5")
    
  2. ํ”„๋กฌํ”„ํŠธ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค:

    prompt = """\
    ### Instruction:
     ๋‹น์‹ ์€ ์™ธ๋ถ€๊ฒ€์ƒ‰์„ ์ด์šฉํ•˜์—ฌ ์‚ฌ์šฉ์ž์—๊ฒŒ ๋„์›€์„ ์ฃผ๋Š” ์ธ๊ณต์ง€๋Šฅ ์กฐ์ˆ˜์ž…๋‹ˆ๋‹ค.
     
     - Context๋Š” ์™ธ๋ถ€๊ฒ€์ƒ‰์„ ํ†ตํ•ด ๋ฐ˜ํ™˜๋œ ์‚ฌ์šฉ์ž ์š”์ฒญ๊ณผ ๊ด€๋ จ๋œ ์ •๋ณด๋“ค์ž…๋‹ˆ๋‹ค. 
     - Context๋ฅผ ํ™œ์šฉํ•  ๋•Œ ๋ฌธ์žฅ ๋์— ์‚ฌ์šฉํ•œ ๋ฌธ์„œ ์กฐ๊ฐ์˜ [Index]๋ฅผ ๋ถ™์ด๊ณ  ์ž์—ฐ์Šค๋Ÿฌ์šด ๋‹ต๋ณ€์„ ์ž‘์„ฑํ•˜์„ธ์š”. (e.g. [1])
     - Context์˜ ์ •๋ณด๊ฐ€ ์‚ฌ์šฉ์ž ์š”์ฒญ๊ณผ ๊ด€๋ จ์ด ์—†๊ฑฐ๋‚˜ ๋„์›€์ด ์•ˆ๋ ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค. ๊ด€๋ จ์žˆ๋Š” ์ •๋ณด๋งŒ ํ™œ์šฉํ•˜๊ณ , ์—†๋Š” ์ •๋ณด๋ฅผ ์ ˆ๋Œ€ ์ง€์–ด๋‚ด์ง€ ๋งˆ์„ธ์š”.
     - ๋˜๋„๋ก์ด๋ฉด ์ผ๋ฐ˜ ์ง€์‹์œผ๋กœ ๋‹ต๋ณ€ํ•˜์ง€๋ง๊ณ , ์ตœ๋Œ€ํ•œ Context๋ฅผ ํ†ตํ•ด์„œ ๋‹ต๋ณ€์„ ํ•˜๋ ค๊ณ  ํ•˜์„ธ์š”. Context์— ์—†์„ ๊ฒฝ์šฐ์—๋Š” ์ด ์ ์„ ์–ธ๊ธ‰ํ•˜๋ฉฐ ์‚ฌ์ฃ„ํ•˜๊ณ  ๋‹ค๋ฅธ ์ฃผ์ œ๋‚˜ ์งˆ๋ฌธ์„ ์ถ”์ฒœํ•ด์ฃผ์„ธ์š”. 
     - ์‚ฌ์šฉ์ž ์š”์ฒญ์— ์•Œ๋งž๋Š” ์ž์—ฐ์Šค๋Ÿฌ์šด ๋Œ€ํ™”๋ฅผ ํ•˜์„ธ์š”.
     - ํ•ญ์ƒ ์กด๋Œ“๋ง๋กœ ๋‹ต๋ณ€ํ•˜์„ธ์š”. 
    ### Conversation:
    User: ์„œ์šธ์˜ ๋‚ ์”จ๋Š” ์–ด๋•Œ์š”?
    Assistant: ์˜ค๋Š˜ ์„œ์šธ์€ ๋ง‘๊ณ  ํ™”์ฐฝํ•ฉ๋‹ˆ๋‹ค.
    
    ### Context:
    # Index [1]: ์„œ์šธ์˜ ๋‚ ์”จ
    ์„œ์šธ์€ ์˜ค๋Š˜ ๋ง‘๊ณ  ํ™”์ฐฝํ•˜๋ฉฐ ๋‚ฎ ์ตœ๊ณ  ๊ธฐ์˜จ์€ 25๋„์ž…๋‹ˆ๋‹ค.
    
    # Index [2]: ๊ธฐ์ƒ์ฒญ ์ •๋ณด
    ์ด๋ฒˆ ์ฃผ ์„œ์šธ์€ ์ „๋ฐ˜์ ์œผ๋กœ ๋ง‘์€ ๋‚ ์”จ๊ฐ€ ์ด์–ด์งˆ ์˜ˆ์ •์ž…๋‹ˆ๋‹ค.
    
    ### Input:
    ์ด๋ฒˆ ์ฃผ๋ง์—๋Š” ๋น„๊ฐ€ ์˜ฌ ๊ฐ€๋Šฅ์„ฑ์ด ์žˆ๋‚˜์š”?
    
    ### Response:
    """
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    
  3. ์ถ”๋ก ์„ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค:

    # max_length counts the prompt plus the newly generated tokens;
    # sampling with a low top_p keeps the output close to greedy decoding
    output = model.generate(input_ids, max_length=512, do_sample=True, top_p=0.1)
    response = tokenizer.decode(output[0], skip_special_tokens=True)
    print(response)
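
The decoded string includes the full prompt. A minimal post-processing step (an assumption about typical usage, not something this card prescribes) keeps only the text after the final ### Response: marker:

    answer = response.split("### Response:")[-1].strip()
    print(answer)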
    

๋ฐ์ดํ„ฐ์…‹

์ด ๋ชจ๋ธ์€ ๋Œ€ํ™” ๊ธฐ๋ก, ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ, ์‚ฌ์šฉ์ž ์ž…๋ ฅ์„ ํฌํ•จํ•œ ํŠนํ™”๋œ RAG ํ•™์Šต ๋ฐ์ดํ„ฐ์…‹์„ ์‚ฌ์šฉํ•˜์—ฌ ํ•™์Šต๋˜์—ˆ์Šต๋‹ˆ๋‹ค.

๋ชจ๋ธ์˜ ๊ฐ•์ 

  • ์ปจํ…์ŠคํŠธ๋ฅผ ํ™œ์šฉํ•œ ๋†’์€ ์‹ ๋ขฐ์„ฑ ์žˆ๋Š” ๋‹ต๋ณ€ ์ œ๊ณต.
  • ๊ฐ์ฃผ๋ฅผ ํ†ตํ•ด ์ถœ์ฒ˜๋ฅผ ๋ช…ํ™•ํžˆ ํ‘œ์‹œ.
  • ๋Œ€ํ™”ํ˜• ํƒœ์Šคํฌ์—์„œ ์œ ์—ฐํ•˜๊ณ  ์ž์—ฐ์Šค๋Ÿฌ์šด ์‘๋‹ต.

Gemma-2๋ฅผ ํ†ตํ•ด ํ•œ๊ตญ์–ด ๋Œ€ํ™” ์‹œ์Šคํ…œ์—์„œ์˜ ๊ฒ€์ƒ‰ ๊ธฐ๋ฐ˜ ์‘๋‹ต์„ ์ตœ์ ํ™”ํ•˜์„ธ์š”! ๐Ÿš€
