gollm-instruct-all-in-one-v1

This model is a fine-tuned version of EleutherAI/polyglot-ko-12.8b on a custom mixed dataset.

Model description

The two prompt templates below are literal Korean text. In English, the opening sentence reads: "Below, a question describing a task is provided together with context that gives additional information. Write an answer that appropriately completes the request." The section headers 맥락, 질문, and 답변 mean "Context", "Question", and "Answer".

  • No-context template
아래는 작업을 설명하는 질문어와 추가 컨텍스트를 제공하는 맥락이 함께 제공됩니다. 요청을 적절히 완료하는 답변을 작성하세요.

### 질문:
{instruction}

### 답변:
  • With context template
아래는 작업을 설명하는 질문어와 추가 컨텍스트를 제공하는 맥락이 함께 제공됩니다. 요청을 적절히 완료하는 답변을 작성하세요.

### 맥락:
{input}

### 질문:
{instruction}

### 답변:
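
Filling the templates can be sketched in Python as below. This is a minimal illustration, not code shipped with the model: `build_prompt`, `PREAMBLE`, and the two template constants are hypothetical names, the Korean strings are this card's two templates written as UTF-8 text, and the exact whitespace after the final answer header is an assumption.

```python
from typing import Optional

# Opening sentence shared by both templates on this card.
PREAMBLE = (
    "아래는 작업을 설명하는 질문어와 추가 컨텍스트를 제공하는 맥락이 함께 제공됩니다. "
    "요청을 적절히 완료하는 답변을 작성하세요."
)

# ASSUMPTION: the prompt ends right after the answer header, with no trailing newline.
NO_CONTEXT_TEMPLATE = PREAMBLE + "\n\n### 질문:\n{instruction}\n\n### 답변:"
WITH_CONTEXT_TEMPLATE = (
    PREAMBLE + "\n\n### 맥락:\n{input}\n\n### 질문:\n{instruction}\n\n### 답변:"
)


def build_prompt(instruction: str, context: Optional[str] = None) -> str:
    """Hypothetical helper: pick the template based on whether context is given."""
    if context:
        return WITH_CONTEXT_TEMPLATE.format(input=context, instruction=instruction)
    return NO_CONTEXT_TEMPLATE.format(instruction=instruction)


if __name__ == "__main__":
    print(build_prompt("한국의 수도는 어디인가요?"))
```

Passing a non-empty `context` selects the with-context template; omitting it (or passing an empty string) falls back to the no-context one.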

Intended uses & limitations

More information needed

Training and evaluation data

  • self-introduction (20 samples)
  • Combined KoAlpaca and KULLM - no-context samples only (145.8k samples)
    • KoAlpaca v1.0
    • KoAlpaca v1.1
    • KULLM (Dolly and Vicuna only)
  • Naver news summarization (22.2k samples)
  • KLUE MRC (17.5k samples)
  • KLUE STS (5.6k samples)

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
  • saved_checkpoint_at_epoch: 4 (condition: loss < 0.3)
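
How these values relate can be sketched as follows. The card does not state the number of devices or any warmup, so the sketch assumes a single device (which makes 2 × 8 match the reported total of 16) and zero warmup steps; `total_train_batch_size` and `linear_lr` are hypothetical helper names, and the schedule shown is a plain linear decay to zero rather than a confirmed reproduction of the training run.

```python
# Values copied from the hyperparameter list on this card.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 8

# ASSUMPTION: one device; the card does not say how many were used.
NUM_DEVICES = 1


def total_train_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Samples consumed per optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step: int, total_steps: int,
              base_lr: float = LEARNING_RATE, warmup_steps: int = 0) -> float:
    """Linear warmup (ASSUMPTION: none by default), then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(total_train_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES))
    print(linear_lr(step=50, total_steps=100))
```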

Training results

Training Loss   Epoch   Step
1.5688          1.0     11947
1.0424          2.0     23895
0.5542          3.0     35843
0.2548          4.0     47791
0.1479          5.0     59738
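
As a rough consistency check, the step counts can be compared with the dataset sizes listed under "Training and evaluation data" and with the checkpoint condition. The sketch below assumes that each step is one optimizer update over 16 samples and that all listed samples were used for training; neither is stated on the card, and the sample counts are the rounded figures given there. The helper names are hypothetical.

```python
# (training loss, epoch, step) rows copied from the results table.
RESULTS = [
    (1.5688, 1.0, 11947),
    (1.0424, 2.0, 23895),
    (0.5542, 3.0, 35843),
    (0.2548, 4.0, 47791),
    (0.1479, 5.0, 59738),
]

# Rounded sample counts from "Training and evaluation data".
DATASET_SIZES = {
    "self-introduction": 20,
    "KoAlpaca + KULLM (no-context)": 145_800,
    "Naver news summarization": 22_200,
    "KLUE MRC": 17_500,
    "KLUE STS": 5_600,
}

TOTAL_TRAIN_BATCH_SIZE = 16  # from the hyperparameter list
LOSS_THRESHOLD = 0.3         # checkpoint condition: loss < 0.3


def first_epoch_below(results, threshold):
    """Return the first epoch whose training loss is under the threshold, else None."""
    for loss, epoch, _step in results:
        if loss < threshold:
            return epoch
    return None


def samples_per_epoch(results, batch_size):
    """ASSUMPTION: each step is one optimizer update over `batch_size` samples."""
    _loss, epoch, step = results[-1]
    return step / epoch * batch_size


if __name__ == "__main__":
    listed = sum(DATASET_SIZES.values())
    estimated = samples_per_epoch(RESULTS, TOTAL_TRAIN_BATCH_SIZE)
    print("listed samples:", listed)
    print("estimated samples per epoch:", round(estimated))
    print("first epoch under threshold:", first_epoch_below(RESULTS, LOSS_THRESHOLD))
```

Under these assumptions the two sample totals agree to within a fraction of a percent, and the first epoch with loss below 0.3 is epoch 4, which matches the saved checkpoint.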

Framework versions

  • Transformers 4.32.0.dev0
  • Pytorch 2.0.0+cu117
  • Datasets 2.11.0
  • Tokenizers 0.13.3
