# OpenELM-1_1B-DPO-full-random-pair
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 164.4180
- Rewards/chosen: -560.0
- Rewards/rejected: -482.0
- Rewards/accuracies: 0.4277
- Rewards/margins: -76.0
- Logps/rejected: -48640.0
- Logps/chosen: -56320.0
- Logits/rejected: 3.1562
- Logits/chosen: 2.7031
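The reward columns above follow standard DPO bookkeeping: each reward is a β-scaled log-probability ratio between the policy and a frozen reference model, and the margin is chosen minus rejected. A minimal sketch of those quantities (β=0.1 and all log-prob inputs are illustrative assumptions, not values from this run):

```python
import math

def dpo_stats(policy_chosen_logp, policy_rejected_logp,
              ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss plus the reward/margin quantities logged in this card.

    beta=0.1 is an assumed value; the card does not record the beta used.
    """
    rewards_chosen = beta * (policy_chosen_logp - ref_chosen_logp)
    rewards_rejected = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = rewards_chosen - rewards_rejected
    # loss = -log(sigmoid(margin)), written as a numerically stable softplus.
    # At zero margin this equals log(2) ~= 0.693; for large negative margins
    # it grows roughly linearly in |margin|.
    if margin < 0:
        loss = -margin + math.log1p(math.exp(margin))
    else:
        loss = math.log1p(math.exp(-margin))
    return loss, rewards_chosen, rewards_rejected, margin
```

Note that the training loss pinned near 0.6914 in the table below sits essentially at log 2 ≈ 0.6931, the DPO loss at zero margin, while the increasingly negative validation margin is what drives the validation loss upward.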
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
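The per-device batch size, device count, and gradient-accumulation steps multiply out to the listed total of 64, and the scheduler warms up linearly over the first 10% of steps before cosine-decaying to zero. A sketch of that arithmetic and schedule (step counts are illustrative; this mirrors, but is not, the Transformers `get_cosine_schedule_with_warmup` implementation):

```python
import math

# Effective batch size from the hyperparameters above: 8 * 4 * 2 = 64
per_device_batch = 8
num_devices = 4
grad_accum = 2
total_train_batch_size = per_device_batch * num_devices * grad_accum

def cosine_lr(step, total_steps, base_lr=5e-5, warmup_ratio=0.1):
    """Linear warmup over the first warmup_ratio of steps, then cosine decay to 0."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```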
### Training results
Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
---|---|---|---|---|---|---|---|---|---|---|---|
0.6914 | 0.1047 | 100 | 0.6966 | -0.2227 | -0.2197 | 0.4570 | -0.0034 | -310.0 | -340.0 | -9.875 | -10.25 |
0.6914 | 0.2094 | 200 | 4.2809 | -11.1875 | -10.375 | 0.4316 | -0.8555 | -1328.0 | -1440.0 | -8.6875 | -9.1875 |
0.6914 | 0.3141 | 300 | 95.7161 | -324.0 | -280.0 | 0.4258 | -44.0 | -28416.0 | -32768.0 | -2.5 | -2.5156 |
0.6914 | 0.4188 | 400 | 99.5534 | -338.0 | -292.0 | 0.4277 | -46.0 | -29440.0 | -34048.0 | -2.625 | -2.6406 |
0.6914 | 0.5236 | 500 | 103.5082 | -352.0 | -304.0 | 0.4277 | -48.0 | -30592.0 | -35328.0 | -1.9688 | -2.0469 |
0.6914 | 0.6283 | 600 | 107.6879 | -366.0 | -316.0 | 0.4316 | -50.0 | -31872.0 | -36864.0 | -1.1328 | -1.2656 |
0.6914 | 0.7330 | 700 | 111.8930 | -380.0 | -328.0 | 0.4297 | -52.0 | -33024.0 | -38400.0 | -0.5117 | -0.7031 |
0.6914 | 0.8377 | 800 | 116.2988 | -394.0 | -340.0 | 0.4355 | -54.0 | -34304.0 | -39680.0 | 1.3906 | 1.0781 |
0.6914 | 0.9424 | 900 | 120.7803 | -410.0 | -354.0 | 0.4316 | -55.75 | -35584.0 | -41216.0 | 1.9062 | 1.5391 |
0.6914 | 1.0471 | 1000 | 125.1435 | -424.0 | -366.0 | 0.4355 | -57.75 | -36864.0 | -42752.0 | 4.4688 | 3.9062 |
0.6914 | 1.1518 | 1100 | 129.3826 | -440.0 | -380.0 | 0.4316 | -59.75 | -38144.0 | -44288.0 | 4.0312 | 3.5156 |
0.6914 | 1.2565 | 1200 | 133.6557 | -454.0 | -392.0 | 0.4297 | -62.0 | -39424.0 | -45568.0 | 3.8438 | 3.2188 |
0.6914 | 1.3613 | 1300 | 137.5098 | -466.0 | -404.0 | 0.4355 | -63.5 | -40704.0 | -46848.0 | 2.1094 | 1.7891 |
0.6914 | 1.4660 | 1400 | 141.6271 | -482.0 | -416.0 | 0.4355 | -65.5 | -41728.0 | -48384.0 | 3.2344 | 2.7656 |
0.6914 | 1.5707 | 1500 | 145.0692 | -492.0 | -426.0 | 0.4336 | -67.0 | -42752.0 | -49664.0 | 3.4844 | 2.9844 |
0.6914 | 1.6754 | 1600 | 148.4839 | -504.0 | -436.0 | 0.4297 | -68.5 | -43776.0 | -50688.0 | 3.3281 | 2.8594 |
0.6914 | 1.7801 | 1700 | 151.1965 | -512.0 | -444.0 | 0.4316 | -69.5 | -44544.0 | -51712.0 | 3.7031 | 3.2188 |
0.6914 | 1.8848 | 1800 | 154.0215 | -524.0 | -452.0 | 0.4336 | -71.0 | -45568.0 | -52736.0 | 4.2188 | 3.6875 |
0.6914 | 1.9895 | 1900 | 156.4897 | -532.0 | -460.0 | 0.4316 | -72.5 | -46080.0 | -53504.0 | 3.3125 | 2.875 |
0.6914 | 2.0942 | 2000 | 158.3665 | -540.0 | -466.0 | 0.4336 | -73.0 | -46848.0 | -54016.0 | 3.1875 | 2.75 |
0.6914 | 2.1990 | 2100 | 160.3225 | -544.0 | -470.0 | 0.4297 | -74.0 | -47360.0 | -54784.0 | 3.3438 | 2.8906 |
0.6914 | 2.3037 | 2200 | 161.6044 | -548.0 | -474.0 | 0.4316 | -74.5 | -47616.0 | -55296.0 | 2.7344 | 2.3594 |
0.6914 | 2.4084 | 2300 | 162.5378 | -552.0 | -478.0 | 0.4316 | -75.0 | -48128.0 | -55552.0 | 2.8281 | 2.4062 |
0.6914 | 2.5131 | 2400 | 163.3184 | -556.0 | -480.0 | 0.4336 | -75.5 | -48128.0 | -55808.0 | 3.0 | 2.5469 |
0.6914 | 2.6178 | 2500 | 163.9196 | -556.0 | -482.0 | 0.4316 | -75.5 | -48384.0 | -56064.0 | 3.1875 | 2.75 |
0.6914 | 2.7225 | 2600 | 164.2697 | -556.0 | -482.0 | 0.4297 | -76.0 | -48640.0 | -56064.0 | 3.1719 | 2.7344 |
0.6914 | 2.8272 | 2700 | 164.3540 | -560.0 | -482.0 | 0.4297 | -76.0 | -48640.0 | -56064.0 | 3.1562 | 2.7188 |
0.6914 | 2.9319 | 2800 | 164.4180 | -560.0 | -482.0 | 0.4277 | -76.0 | -48640.0 | -56320.0 | 3.1562 | 2.7031 |
### Framework versions
- Transformers 4.44.2
- Pytorch 2.3.0
- Datasets 2.21.0
- Tokenizers 0.19.1