# pythia-410m-deduped
This model is a fine-tuned version of [EleutherAI/pythia-410m-deduped](https://huggingface.co/EleutherAI/pythia-410m-deduped) on the [princeton-nlp/llama3-ultrafeedback](https://huggingface.co/datasets/princeton-nlp/llama3-ultrafeedback) dataset. It achieves the following results on the evaluation set (see the note on the reward metrics after this list):
- Loss: 1.6928
- Original Losses: 1.7344
- Weight: 1.0
- Abs Diff: 0.3008
- Rewards/chosen: -5.4375
- Rewards/rejected: -5.4688
- Rewards/accuracies: 0.4758
- Rewards/margins: 0.0228
- Logps/rejected: -2.1875
- Logps/chosen: -2.1719
- Logits/rejected: 5.7188
- Logits/chosen: 5.7188
- All Logps 1: -811.2697
- All Logps 1 Values: -811.2697
- All Logps 2: 447.4254
- All Logps 2 Values: 447.4254
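The reward columns above are consistent with a SimPO-style, length-normalized objective: the checkpoint name (`SimPOW-1`) and the princeton-nlp preference dataset both point at that recipe, and Rewards/chosen ≈ 2.5 × Logps/chosen, which matches SimPO's default β = 2.5. Below is a minimal sketch of that loss under those assumptions; β and γ are the SimPO paper defaults, not values read from this run's config, and the sketch deliberately ignores the Weight/Abs Diff columns, which suggest a weighted variant of the loss:

```python
import torch
import torch.nn.functional as F

def simpo_style_loss(avg_logp_chosen: torch.Tensor,
                     avg_logp_rejected: torch.Tensor,
                     beta: float = 2.5,
                     gamma: float = 1.0):
    """Length-normalized preference loss in the SimPO style.

    beta/gamma are assumptions (SimPO paper defaults), not values taken
    from this training run.
    """
    # Rewards are scaled average (per-token) log-probs, matching how
    # Rewards/chosen ~= beta * Logps/chosen in the metrics above.
    rewards_chosen = beta * avg_logp_chosen
    rewards_rejected = beta * avg_logp_rejected
    margins = rewards_chosen - rewards_rejected   # cf. Rewards/margins
    loss = -F.logsigmoid(margins - gamma)         # target-margin objective
    return loss.mean(), margins.mean()

# Illustrative only, using the final eval Logps from this card; this plain
# SimPO form does not reproduce the reported (apparently weighted) loss.
loss, margin = simpo_style_loss(torch.tensor([-2.1719]),
                                torch.tensor([-2.1875]))
```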
## Model description
More information needed
## Intended uses & limitations
More information needed
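Pending fuller documentation, here is a minimal usage sketch with the standard `transformers` text-generation API; the hub id is the repository this card belongs to (`RAY2L/pythia-410m-deduped-SimPOW-1`), and the prompt is just an example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RAY2L/pythia-410m-deduped-SimPOW-1"  # this checkpoint's hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Briefly explain preference optimization:",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```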
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a rough `TrainingArguments` sketch follows the list):
- learning_rate: 1e-06
- train_batch_size: 2
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
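As referenced above, the hyperparameters map onto `transformers.TrainingArguments` roughly as sketched below. This is a hypothetical reconstruction; the actual training script (and any SimPO-specific configuration) is not part of this card, and the `output_dir` name is a placeholder:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="pythia-410m-deduped-simpow",  # placeholder name
    learning_rate=1e-6,
    per_device_train_batch_size=2,   # x 8 devices x 8 grad accum = 128 total
    per_device_eval_batch_size=4,    # x 8 devices = 32 total
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    # Adam betas=(0.9, 0.999) and eps=1e-8 are the defaults, matching
    # the optimizer entry in the list above.
)
```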
### Training results
Training Loss | Epoch | Step | Validation Loss | Original Losses | Weight | Abs Diff | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | All Logps 1 | All Logps 1 Values | All Logps 2 | All Logps 2 Values |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1.9439 | 0.0427 | 20 | 1.7861 | 1.8125 | 1.0 | 0.3574 | -4.9688 | -5.0 | 0.4556 | 0.0187 | -1.9922 | -1.9844 | 5.1875 | 5.2188 | -694.3344 | -694.3344 | 447.4254 | 447.4254 |
1.8637 | 0.0855 | 40 | 1.7850 | 1.8125 | 1.0 | 0.3574 | -4.9688 | -4.9688 | 0.4637 | 0.0112 | -1.9922 | -1.9844 | 5.1875 | 5.25 | -694.3014 | -694.3014 | 447.4254 | 447.4254 |
1.8856 | 0.1282 | 60 | 1.7741 | 1.8125 | 1.0 | 0.3496 | -4.9375 | -4.9375 | 0.4435 | -0.0004 | -1.9766 | -1.9766 | 5.2188 | 5.25 | -695.6515 | -695.6515 | 447.4254 | 447.4254 |
1.8193 | 0.1710 | 80 | 1.7628 | 1.8047 | 1.0 | 0.3477 | -4.9375 | -4.9375 | 0.4637 | 0.0016 | -1.9844 | -1.9766 | 5.3125 | 5.3438 | -699.6716 | -699.6716 | 447.4254 | 447.4254 |
1.8542 | 0.2137 | 100 | 1.7501 | 1.7891 | 1.0 | 0.3340 | -4.9375 | -4.9688 | 0.4758 | 0.0138 | -1.9844 | -1.9766 | 5.4062 | 5.4375 | -707.3261 | -707.3261 | 447.4254 | 447.4254 |
1.7907 | 0.2565 | 120 | 1.7458 | 1.7891 | 1.0 | 0.3301 | -5.0 | -4.9688 | 0.4315 | -0.0052 | -1.9922 | -1.9922 | 5.4688 | 5.5 | -714.8251 | -714.8251 | 447.4254 | 447.4254 |
1.8332 | 0.2992 | 140 | 1.7375 | 1.7969 | 1.0 | 0.3281 | -5.0312 | -5.0 | 0.4637 | -0.0200 | -2.0 | -2.0156 | 5.5312 | 5.5625 | -723.8403 | -723.8403 | 447.4254 | 447.4254 |
1.7599 | 0.3420 | 160 | 1.7328 | 1.7969 | 1.0 | 0.3301 | -5.0938 | -5.0625 | 0.4355 | -0.0156 | -2.0312 | -2.0312 | 5.5625 | 5.5938 | -734.5149 | -734.5149 | 447.4254 | 447.4254 |
1.8462 | 0.3847 | 180 | 1.7246 | 1.7734 | 1.0 | 0.3184 | -5.125 | -5.125 | 0.4516 | -0.0015 | -2.0469 | -2.0469 | 5.5625 | 5.5938 | -745.0103 | -745.0103 | 447.4254 | 447.4254 |
1.8253 | 0.4275 | 200 | 1.7154 | 1.7656 | 1.0 | 0.3145 | -5.1562 | -5.1875 | 0.4476 | 0.0043 | -2.0625 | -2.0625 | 5.5625 | 5.5938 | -755.3181 | -755.3181 | 447.4254 | 447.4254 |
1.8056 | 0.4702 | 220 | 1.7119 | 1.7734 | 1.0 | 0.3203 | -5.2188 | -5.2188 | 0.4476 | 0.0032 | -2.0938 | -2.0938 | 5.5938 | 5.625 | -762.7902 | -762.7902 | 447.4254 | 447.4254 |
1.7958 | 0.5130 | 240 | 1.7096 | 1.7734 | 1.0 | 0.3164 | -5.25 | -5.25 | 0.4556 | -0.0002 | -2.1094 | -2.1094 | 5.5938 | 5.625 | -770.9695 | -770.9695 | 447.4254 | 447.4254 |
1.7141 | 0.5557 | 260 | 1.7073 | 1.7578 | 1.0 | 0.3086 | -5.2812 | -5.2812 | 0.4355 | 0.0052 | -2.1094 | -2.1094 | 5.625 | 5.625 | -775.2407 | -775.2407 | 447.4254 | 447.4254 |
1.7021 | 0.5985 | 280 | 1.7085 | 1.7656 | 1.0 | 0.3125 | -5.2812 | -5.2812 | 0.4597 | -0.0014 | -2.1094 | -2.1094 | 5.625 | 5.6562 | -778.4560 | -778.4560 | 447.4254 | 447.4254 |
1.7788 | 0.6412 | 300 | 1.7020 | 1.7578 | 1.0 | 0.3066 | -5.3125 | -5.3125 | 0.4677 | 0.0104 | -2.125 | -2.125 | 5.6562 | 5.6875 | -784.0049 | -784.0049 | 447.4254 | 447.4254 |
1.679 | 0.6839 | 320 | 1.7053 | 1.7578 | 1.0 | 0.3105 | -5.3438 | -5.3438 | 0.4476 | 0.0002 | -2.1406 | -2.1406 | 5.6562 | 5.6875 | -791.0703 | -791.0703 | 447.4254 | 447.4254 |
1.751 | 0.7267 | 340 | 1.7006 | 1.7578 | 1.0 | 0.3105 | -5.375 | -5.4062 | 0.4919 | 0.0085 | -2.1562 | -2.1562 | 5.6562 | 5.6875 | -797.0882 | -797.0882 | 447.4254 | 447.4254 |
1.7191 | 0.7694 | 360 | 1.6990 | 1.7656 | 1.0 | 0.3086 | -5.4375 | -5.4062 | 0.4476 | -0.0044 | -2.1719 | -2.1719 | 5.6875 | 5.6875 | -803.0909 | -803.0909 | 447.4254 | 447.4254 |
1.7226 | 0.8122 | 380 | 1.6993 | 1.7578 | 1.0 | 0.3086 | -5.4375 | -5.4375 | 0.4758 | 0.0093 | -2.1719 | -2.1719 | 5.6875 | 5.7188 | -806.9357 | -806.9357 | 447.4254 | 447.4254 |
1.7198 | 0.8549 | 400 | 1.6968 | 1.7578 | 1.0 | 0.3066 | -5.4688 | -5.4688 | 0.4556 | 0.0020 | -2.1875 | -2.1875 | 5.6875 | 5.7188 | -810.5368 | -810.5368 | 447.4254 | 447.4254 |
1.7057 | 0.8977 | 420 | 1.6963 | 1.75 | 1.0 | 0.3047 | -5.4688 | -5.4688 | 0.4718 | 0.0151 | -2.1875 | -2.1875 | 5.6875 | 5.7188 | -811.7772 | -811.7772 | 447.4254 | 447.4254 |
1.75 | 0.9404 | 440 | 1.6973 | 1.7578 | 1.0 | 0.3086 | -5.4688 | -5.4688 | 0.4677 | 0.0077 | -2.1875 | -2.1875 | 5.6875 | 5.7188 | -811.8970 | -811.8970 | 447.4254 | 447.4254 |
1.6912 | 0.9832 | 460 | 1.6928 | 1.7344 | 1.0 | 0.3008 | -5.4375 | -5.4688 | 0.4758 | 0.0228 | -2.1875 | -2.1719 | 5.7188 | 5.7188 | -811.2697 | -811.2697 | 447.4254 | 447.4254 |
### Framework versions
- Transformers 4.42.3
- Pytorch 2.2.2+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1