---
library_name: peft
tags:
  - trl
  - dpo
  - alignment-handbook
  - generated_from_trainer
base_model: NbAiLab/nb-gpt-j-6B-v2
model-index:
  - name: aftonposten-6b-align-scan
    results: []
---

# aftonposten-6b-align-scan

This model is a fine-tuned version of [NbAiLab/nb-gpt-j-6B-v2](https://huggingface.co/NbAiLab/nb-gpt-j-6B-v2) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 2463.1057
- Rewards/chosen: -0.0198
- Rewards/rejected: -0.0244
- Rewards/accuracies: 0.5656
- Rewards/margins: 0.0046
- Logps/rejected: -39.9582
- Logps/chosen: -36.0192
- Logits/rejected: -1.8605
- Logits/chosen: -1.8648
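
Because this repository contains a PEFT adapter rather than full model weights, it is loaded on top of the base model. A minimal loading sketch, assuming the adapter is published as `hugodk-sch/aftonposten-6b-align-scan` (repo id inferred from this card, not confirmed):

```python
# Minimal sketch: load the PEFT adapter on top of the base model.
# Assumption: the adapter repo id is "hugodk-sch/aftonposten-6b-align-scan";
# adjust if the adapter is hosted elsewhere.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "NbAiLab/nb-gpt-j-6B-v2",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "hugodk-sch/aftonposten-6b-align-scan")
tokenizer = AutoTokenizer.from_pretrained("NbAiLab/nb-gpt-j-6B-v2")

prompt = "Oslo er"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```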

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 4
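
For orientation, below is a sketch of how the settings above map onto `transformers.TrainingArguments`, as used by alignment-handbook-style DPO training scripts. The dataset, DPO beta, and mixed-precision settings are not recorded on this card, so this does not reproduce the exact run:

```python
# Sketch: the hyperparameters above expressed as transformers.TrainingArguments.
# Anything not listed on the card is an assumption or a library default.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="aftonposten-6b-align-scan",
    learning_rate=5e-06,
    per_device_train_batch_size=4,   # train_batch_size
    per_device_eval_batch_size=8,    # eval_batch_size
    gradient_accumulation_steps=2,   # total_train_batch_size = 4 * 2 = 8, per the card
    seed=42,
    optim="adamw_torch",             # Adam with betas=(0.9, 0.999), epsilon=1e-08 (defaults)
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                # lr_scheduler_warmup_ratio
    num_train_epochs=4,
)
```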

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 2484.2775     | 0.26  | 100  | 2498.1597       | 0.0002         | -0.0000          | 0.5565             | 0.0002          | -37.5204       | -34.0187     | -2.2275         | -2.2323       |
| 2464.7859     | 0.52  | 200  | 2496.0791       | 0.0000         | -0.0004          | 0.5544             | 0.0004          | -37.5574       | -34.0338     | -2.2242         | -2.2291       |
| 2430.5529     | 0.78  | 300  | 2495.6680       | -0.0003        | -0.0007          | 0.5390             | 0.0005          | -37.5906       | -34.0629     | -2.2213         | -2.2262       |
| 2370.302      | 1.04  | 400  | 2492.4653       | -0.0006        | -0.0014          | 0.5623             | 0.0008          | -37.6613       | -34.0994     | -2.1946         | -2.1994       |
| 2298.9412     | 1.3   | 500  | 2488.4062       | -0.0025        | -0.0037          | 0.5772             | 0.0013          | -37.8883       | -34.2804     | -2.1295         | -2.1343       |
| 2298.7582     | 1.56  | 600  | 2477.6589       | -0.0053        | -0.0077          | 0.6121             | 0.0024          | -38.2854       | -34.5600     | -2.0589         | -2.0636       |
| 2254.2998     | 1.82  | 700  | 2477.6624       | -0.0096        | -0.0122          | 0.5540             | 0.0026          | -38.7330       | -34.9921     | -2.0040         | -2.0086       |
| 2173.4393     | 2.08  | 800  | 2470.5903       | -0.0118        | -0.0152          | 0.5569             | 0.0034          | -39.0386       | -35.2171     | -1.9665         | -1.9710       |
| 2065.36       | 2.34  | 900  | 2467.7329       | -0.0156        | -0.0195          | 0.5598             | 0.0039          | -39.4668       | -35.5978     | -1.9122         | -1.9166       |
| 2196.3246     | 2.6   | 1000 | 2464.3682       | -0.0186        | -0.0229          | 0.5511             | 0.0044          | -39.8091       | -35.8901     | -1.8782         | -1.8826       |
| 2237.6512     | 2.86  | 1100 | 2464.2991       | -0.0196        | -0.0241          | 0.5453             | 0.0044          | -39.9232       | -35.9976     | -1.8646         | -1.8690       |
| 2032.5133     | 3.12  | 1200 | 2464.4653       | -0.0199        | -0.0243          | 0.5598             | 0.0044          | -39.9447       | -36.0201     | -1.8618         | -1.8662       |
| 1967.852      | 3.38  | 1300 | 2461.2036       | -0.0197        | -0.0245          | 0.5540             | 0.0048          | -39.9632       | -36.0051     | -1.8608         | -1.8652       |
| 2084.0914     | 3.64  | 1400 | 2462.6121       | -0.0199        | -0.0245          | 0.5598             | 0.0046          | -39.9675       | -36.0243     | -1.8613         | -1.8656       |
| 2032.7156     | 3.9   | 1500 | 2463.1057       | -0.0198        | -0.0244          | 0.5656             | 0.0046          | -39.9582       | -36.0192     | -1.8605         | -1.8648       |
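
For reading the reward columns: in trl's standard DPO formulation, the logged reward is the β-scaled log-probability ratio of the policy against the frozen reference model, and the margin is the chosen-minus-rejected difference (the β used for this run is not recorded on this card):

$$
r_\theta(x, y) = \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)},
\qquad
\text{rewards/margins} = r_\theta(x, y_{\text{chosen}}) - r_\theta(x, y_{\text{rejected}})
$$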

### Framework versions

- PEFT 0.8.2
- Transformers 4.37.2
- Pytorch 2.1.2+cu121
- Datasets 2.17.0
- Tokenizers 0.15.1
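
To recreate this environment, a pinned install along these lines should work (assumption: the `+cu121` PyTorch build is taken from the PyTorch wheel index; any compatible build will do):

```shell
pip install peft==0.8.2 transformers==4.37.2 datasets==2.17.0 tokenizers==0.15.1
pip install torch==2.1.2 --index-url https://download.pytorch.org/whl/cu121
```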