---
library_name: peft
tags:
- trl
- dpo
- alignment-handbook
- generated_from_trainer
base_model: NbAiLab/nb-gpt-j-6B-v2
model-index:
- name: aftonposten-6b-align-scan
  results: []
---

# aftonposten-6b-align-scan

This model is a fine-tuned version of [NbAiLab/nb-gpt-j-6B-v2](https://huggingface.co/NbAiLab/nb-gpt-j-6B-v2) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 2463.1057
- Rewards/chosen: -0.0198
- Rewards/rejected: -0.0244
- Rewards/accuracies: 0.5656
- Rewards/margins: 0.0046
- Logps/rejected: -39.9582
- Logps/chosen: -36.0192
- Logits/rejected: -1.8605
- Logits/chosen: -1.8648

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 4

### Training results

| Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
|:-------------:|:-----:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:------------------:|:--------------:|:---------------:|:----------------:|
| 2484.2775     | 0.26  | 100  | -2.2323       | -2.2275         | -34.0187     | -37.5204       | 2498.1597       | 0.5565             | 0.0002         | 0.0002          | -0.0000          |
| 2464.7859     | 0.52  | 200  | -2.2291       | -2.2242         | -34.0338     | -37.5574       | 2496.0791       | 0.5544             | 0.0000         | 0.0004          | -0.0004          |
| 2430.5529     | 0.78  | 300  | -2.2262       | -2.2213         | -34.0629     | -37.5906       | 2495.6680       | 0.5390             | -0.0003        | 0.0005          | -0.0007          |
| 2370.302      | 1.04  | 400  | -2.1994       | -2.1946         | -34.0994     | -37.6613       | 2492.4653       | 0.5623             | -0.0006        | 0.0008          | -0.0014          |
| 2298.9412     | 1.3   | 500  | -2.1343       | -2.1295         | -34.2804     | -37.8883       | 2488.4062       | 0.5772             | -0.0025        | 0.0013          | -0.0037          |
| 2298.7582     | 1.56  | 600  | -2.0636       | -2.0589         | -34.5600     | -38.2854       | 2477.6589       | 0.6121             | -0.0053        | 0.0024          | -0.0077          |
| 2254.2998     | 1.82  | 700  | -2.0086       | -2.0040         | -34.9921     | -38.7330       | 2477.6624       | 0.5540             | -0.0096        | 0.0026          | -0.0122          |
| 2173.4393     | 2.08  | 800  | -1.9710       | -1.9665         | -35.2171     | -39.0386       | 2470.5903       | 0.5569             | -0.0118        | 0.0034          | -0.0152          |
| 2065.36       | 2.34  | 900  | -1.9166       | -1.9122         | -35.5978     | -39.4668       | 2467.7329       | 0.5598             | -0.0156        | 0.0039          | -0.0195          |
| 2196.3246     | 2.6   | 1000 | -1.8826       | -1.8782         | -35.8901     | -39.8091       | 2464.3682       | 0.5511             | -0.0186        | 0.0044          | -0.0229          |
| 2237.6512     | 2.86  | 1100 | -1.8690       | -1.8646         | -35.9976     | -39.9232       | 2464.2991       | 0.5453             | -0.0196        | 0.0044          | -0.0241          |
| 2032.5133     | 3.12  | 1200 | -1.8662       | -1.8618         | -36.0201     | -39.9447       | 2464.4653       | 0.5598             | -0.0199        | 0.0044          | -0.0243          |
| 1967.852      | 3.38  | 1300 | -1.8652       | -1.8608         | -36.0051     | -39.9632       | 2461.2036       | 0.5540             | -0.0197        | 0.0048          | -0.0245          |
| 2084.0914     | 3.64  | 1400 | -1.8656       | -1.8613         | -36.0243     | -39.9675       | 2462.6121       | 0.5598             | -0.0199        | 0.0046          | -0.0245          |
| 2032.7156     | 3.9   | 1500 | -1.8648       | -1.8605         | -36.0192     | -39.9582       | 2463.1057       | 0.5656             | -0.0198        | 0.0046          | -0.0244          |

### Framework versions

- PEFT 0.8.2
- Transformers 4.37.2
- Pytorch 2.1.2+cu121
- Datasets 2.17.0
- Tokenizers 0.15.1
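
The reward columns in the training results can be sanity-checked against each other. The sketch below assumes that Rewards/margins is logged as the mean of (chosen reward minus rejected reward), as is conventional for DPO training, so the reported margin should equal Rewards/chosen minus Rewards/rejected up to rounding. The sample rows are copied from the evaluation table; the helper names and the tolerance are illustrative choices, not part of this model's training code.

```python
# Consistency check on the logged DPO reward metrics.
# Assumption: rewards/margins == rewards/chosen - rewards/rejected,
# up to the 4-decimal rounding used in the results table.

# (step, rewards/chosen, rewards/rejected, rewards/margins), copied from the table.
ROWS = [
    (100, 0.0002, -0.0000, 0.0002),
    (300, -0.0003, -0.0007, 0.0005),
    (1000, -0.0186, -0.0229, 0.0044),
    (1500, -0.0198, -0.0244, 0.0046),
]

# Each of the three values is rounded to 4 decimals, so the worst-case
# discrepancy is 3 * 0.00005 = 0.00015; 2e-4 leaves a little headroom.
TOLERANCE = 2e-4


def margin_gap(chosen: float, rejected: float, margin: float) -> float:
    """Absolute difference between the implied and the reported margin."""
    return abs((chosen - rejected) - margin)


def check_rows(rows, tol: float = TOLERANCE) -> dict:
    """Map each step to True if its reported margin matches within tol."""
    return {step: margin_gap(c, r, m) <= tol for step, c, r, m in rows}


if __name__ == "__main__":
    for step, ok in check_rows(ROWS).items():
        print(f"step {step}: {'consistent' if ok else 'inconsistent'}")
```

A row that fails this check would suggest either a transcription error in the table or a different margin definition than the one assumed here.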