qwen2.5-0.5b-expo-EXDPO-WEIGHT-BETA0.2
This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise dataset. It achieves the following results on the evaluation set:
- Loss: 0.3417
- Logps: -93.8085
- Logits: -1.1885
- Objective: 0.3403
- Dpo Loss: 0.7083
- Regularize: 0.2695
- Ranking Simple: 0.5186
- Ranking Idealized: 0.5399
- Ranking Idealized Expo: 0.5243
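
The card does not document how these metrics relate to each other. As a reading aid only, the reported numbers are consistent with Objective ≈ 0.1 × Dpo Loss + Regularize; the 0.1 weight is inferred from the values and is an assumption, not a documented formula. The sketch below checks that relation and also shows that a standard DPO loss equals ln 2 ≈ 0.6931 at zero preference margin, which matches the Dpo Loss reported in the earliest evaluation rows. The helper names and the `beta=0.2` default (taken from the "BETA0.2" suffix of the model name) are hypothetical.

```python
import math

# Final evaluation values copied from the list above.
final_eval = {"objective": 0.3403, "dpo_loss": 0.7083, "regularize": 0.2695}

# ASSUMPTION: this weight is inferred from the reported numbers;
# the model card does not state how Objective is computed.
ASSUMED_DPO_WEIGHT = 0.1


def reconstructed_objective(dpo_loss, regularize, dpo_weight=ASSUMED_DPO_WEIGHT):
    """Weighted sum that happens to reproduce the reported Objective."""
    return dpo_weight * dpo_loss + regularize


def dpo_loss_from_margin(margin, beta=0.2):
    """Standard DPO loss -log(sigmoid(beta * margin)) for a single pair.

    `margin` is the chosen-minus-rejected difference of policy/reference
    log-ratios. ASSUMPTION: beta=0.2 is read off the model name.
    """
    return math.log1p(math.exp(-beta * margin))


if __name__ == "__main__":
    approx = reconstructed_objective(final_eval["dpo_loss"], final_eval["regularize"])
    print(f"reconstructed objective: {approx:.4f}")
    print(f"reported objective:      {final_eval['objective']:.4f}")
    print(f"DPO loss at zero margin: {dpo_loss_from_margin(0.0):.4f}")
```

If the inferred weight is wrong for this training code, only the first check is affected; the zero-margin value of the DPO loss is a property of the standard formula.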
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 3
- gradient_accumulation_steps: 8
- total_train_batch_size: 96
- total_eval_batch_size: 12
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
- mixed_precision_training: Native AMP
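
The totals in this list follow from the per-device values: 4 × 3 devices × 8 accumulation steps = 96 for training, and 4 × 3 = 12 for evaluation. The sketch below reproduces that arithmetic and outlines a linear-warmup plus cosine-decay learning-rate curve. It is an illustration, not the trainer's actual code: `total_steps = 5290` is an assumption estimated from the results table (roughly 529 optimizer steps per epoch over 10 epochs), and `lr_at` is a hypothetical helper.

```python
import math

# Values copied from the hyperparameter list above.
per_device_train_batch_size = 4
per_device_eval_batch_size = 4
num_devices = 3
gradient_accumulation_steps = 8
peak_lr = 5e-07
warmup_ratio = 0.1

# Effective batch sizes across devices (and accumulation, for training).
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_eval_batch_size = per_device_eval_batch_size * num_devices

# ASSUMPTION: the card does not state the total number of optimizer steps;
# this is an estimate read off the results table.
total_steps = 5290
warmup_steps = math.ceil(warmup_ratio * total_steps)


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    print("total_eval_batch_size:", total_eval_batch_size)
    for step in (0, warmup_steps, total_steps // 2, total_steps):
        print(step, lr_at(step))
```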
Training results
Training Loss | Epoch | Step | Validation Loss | Logps | Logits | Objective | Dpo Loss | Regularize | Ranking Simple | Ranking Idealized | Ranking Idealized Expo |
---|---|---|---|---|---|---|---|---|---|---|---|
0.0798 | 0.0945 | 50 | 0.0807 | -98.5040 | -1.3072 | 0.0808 | 0.6932 | 0.0115 | 0.5238 | 0.5399 | 0.5243 |
0.081 | 0.1889 | 100 | 0.0819 | -98.4932 | -1.3084 | 0.0819 | 0.6934 | 0.0125 | 0.5238 | 0.5399 | 0.5243 |
0.0839 | 0.2834 | 150 | 0.0823 | -98.5417 | -1.3079 | 0.0823 | 0.6931 | 0.0130 | 0.5233 | 0.5399 | 0.5243 |
0.0891 | 0.3779 | 200 | 0.0840 | -98.6517 | -1.3063 | 0.0839 | 0.6927 | 0.0147 | 0.5238 | 0.5399 | 0.5243 |
0.1019 | 0.4724 | 250 | 0.0865 | -98.6753 | -1.3058 | 0.0864 | 0.6929 | 0.0171 | 0.5233 | 0.5399 | 0.5243 |
0.1094 | 0.5668 | 300 | 0.0928 | -98.3448 | -1.3087 | 0.0930 | 0.6926 | 0.0238 | 0.5238 | 0.5399 | 0.5243 |
0.1267 | 0.6613 | 350 | 0.0995 | -98.4803 | -1.3097 | 0.1004 | 0.6942 | 0.0310 | 0.5243 | 0.5399 | 0.5243 |
0.1414 | 0.7558 | 400 | 0.1020 | -98.6999 | -1.3138 | 0.1027 | 0.6920 | 0.0335 | 0.5248 | 0.5399 | 0.5243 |
0.156 | 0.8503 | 450 | 0.1102 | -98.6961 | -1.3001 | 0.1107 | 0.6917 | 0.0415 | 0.5228 | 0.5399 | 0.5243 |
0.1843 | 0.9447 | 500 | 0.1425 | -98.3685 | -1.2985 | 0.1416 | 0.6934 | 0.0723 | 0.5217 | 0.5399 | 0.5243 |
0.1954 | 1.0392 | 550 | 0.1388 | -98.2336 | -1.3154 | 0.1383 | 0.6946 | 0.0689 | 0.5228 | 0.5399 | 0.5243 |
0.2073 | 1.1337 | 600 | 0.1374 | -98.7215 | -1.3071 | 0.1369 | 0.6936 | 0.0676 | 0.5228 | 0.5399 | 0.5243 |
0.2165 | 1.2282 | 650 | 0.1478 | -97.9261 | -1.2926 | 0.1487 | 0.6916 | 0.0796 | 0.5233 | 0.5399 | 0.5243 |
0.2333 | 1.3226 | 700 | 0.1470 | -97.1071 | -1.2930 | 0.1450 | 0.6915 | 0.0758 | 0.5243 | 0.5399 | 0.5243 |
0.229 | 1.4171 | 750 | 0.1718 | -97.0923 | -1.2689 | 0.1725 | 0.6929 | 0.1032 | 0.5238 | 0.5399 | 0.5243 |
0.2565 | 1.5116 | 800 | 0.1817 | -97.2621 | -1.2540 | 0.1830 | 0.6944 | 0.1136 | 0.5243 | 0.5399 | 0.5243 |
0.2479 | 1.6060 | 850 | 0.1864 | -96.3423 | -1.2708 | 0.1853 | 0.6946 | 0.1159 | 0.5243 | 0.5399 | 0.5243 |
0.2586 | 1.7005 | 900 | 0.1839 | -97.2157 | -1.2623 | 0.1825 | 0.6944 | 0.1131 | 0.5223 | 0.5399 | 0.5243 |
0.2347 | 1.7950 | 950 | 0.1995 | -94.8402 | -1.2678 | 0.1989 | 0.6945 | 0.1295 | 0.5238 | 0.5399 | 0.5243 |
0.2414 | 1.8895 | 1000 | 0.1895 | -95.8793 | -1.2579 | 0.1901 | 0.6924 | 0.1209 | 0.5254 | 0.5399 | 0.5243 |
0.2433 | 1.9839 | 1050 | 0.2097 | -95.7970 | -1.2552 | 0.2068 | 0.6923 | 0.1376 | 0.5259 | 0.5399 | 0.5243 |
0.2393 | 2.0784 | 1100 | 0.2156 | -96.9313 | -1.2422 | 0.2149 | 0.6962 | 0.1452 | 0.5264 | 0.5399 | 0.5243 |
0.2476 | 2.1729 | 1150 | 0.2195 | -95.8618 | -1.2485 | 0.2191 | 0.6958 | 0.1495 | 0.5238 | 0.5399 | 0.5243 |
0.2443 | 2.2674 | 1200 | 0.2318 | -97.1362 | -1.2241 | 0.2317 | 0.6998 | 0.1617 | 0.5259 | 0.5399 | 0.5243 |
0.2337 | 2.3618 | 1250 | 0.2494 | -96.2629 | -1.2313 | 0.2515 | 0.6950 | 0.1820 | 0.5269 | 0.5399 | 0.5243 |
0.2264 | 2.4563 | 1300 | 0.2473 | -94.4504 | -1.2535 | 0.2456 | 0.6981 | 0.1758 | 0.5223 | 0.5399 | 0.5243 |
0.2398 | 2.5508 | 1350 | 0.2467 | -96.2065 | -1.2349 | 0.2462 | 0.7027 | 0.1760 | 0.5197 | 0.5399 | 0.5243 |
0.2346 | 2.6453 | 1400 | 0.2565 | -94.6591 | -1.2562 | 0.2567 | 0.7002 | 0.1867 | 0.5212 | 0.5399 | 0.5243 |
0.242 | 2.7397 | 1450 | 0.2640 | -94.6555 | -1.2141 | 0.2641 | 0.7015 | 0.1939 | 0.5243 | 0.5399 | 0.5243 |
0.2372 | 2.8342 | 1500 | 0.2747 | -94.9289 | -1.2472 | 0.2726 | 0.7027 | 0.2024 | 0.5202 | 0.5399 | 0.5243 |
0.2133 | 2.9287 | 1550 | 0.2529 | -95.1991 | -1.2345 | 0.2512 | 0.7006 | 0.1811 | 0.5243 | 0.5399 | 0.5243 |
0.2292 | 3.0231 | 1600 | 0.2840 | -93.6334 | -1.2437 | 0.2861 | 0.7038 | 0.2157 | 0.5197 | 0.5399 | 0.5243 |
0.2227 | 3.1176 | 1650 | 0.2854 | -93.4763 | -1.2332 | 0.2851 | 0.7025 | 0.2149 | 0.5217 | 0.5399 | 0.5243 |
0.2123 | 3.2121 | 1700 | 0.2752 | -95.6906 | -1.2311 | 0.2756 | 0.7008 | 0.2055 | 0.5233 | 0.5399 | 0.5243 |
0.218 | 3.3066 | 1750 | 0.2800 | -95.9042 | -1.2167 | 0.2783 | 0.7037 | 0.2079 | 0.5238 | 0.5399 | 0.5243 |
0.2086 | 3.4010 | 1800 | 0.2945 | -95.6983 | -1.2183 | 0.2932 | 0.7027 | 0.2230 | 0.5233 | 0.5399 | 0.5243 |
0.216 | 3.4955 | 1850 | 0.2895 | -93.0784 | -1.2235 | 0.2873 | 0.7028 | 0.2171 | 0.5212 | 0.5399 | 0.5243 |
0.2182 | 3.5900 | 1900 | 0.2973 | -95.2384 | -1.2138 | 0.2977 | 0.7019 | 0.2275 | 0.5207 | 0.5399 | 0.5243 |
0.2097 | 3.6845 | 1950 | 0.3023 | -93.4940 | -1.2111 | 0.3000 | 0.7046 | 0.2295 | 0.5217 | 0.5399 | 0.5243 |
0.2076 | 3.7789 | 2000 | 0.3084 | -93.0939 | -1.2337 | 0.3067 | 0.7034 | 0.2364 | 0.5243 | 0.5399 | 0.5243 |
0.2099 | 3.8734 | 2050 | 0.2962 | -93.1727 | -1.2280 | 0.2954 | 0.7044 | 0.2249 | 0.5212 | 0.5399 | 0.5243 |
0.2001 | 3.9679 | 2100 | 0.3139 | -93.9210 | -1.2079 | 0.3123 | 0.7063 | 0.2417 | 0.5186 | 0.5399 | 0.5243 |
0.2082 | 4.0624 | 2150 | 0.3119 | -93.6768 | -1.2148 | 0.3124 | 0.7037 | 0.2420 | 0.5217 | 0.5399 | 0.5243 |
0.1914 | 4.1568 | 2200 | 0.3139 | -94.5737 | -1.2179 | 0.3138 | 0.7032 | 0.2434 | 0.5197 | 0.5399 | 0.5243 |
0.2026 | 4.2513 | 2250 | 0.3179 | -93.2220 | -1.2044 | 0.3177 | 0.7035 | 0.2473 | 0.5202 | 0.5399 | 0.5243 |
0.1908 | 4.3458 | 2300 | 0.3067 | -94.3151 | -1.2117 | 0.3085 | 0.7022 | 0.2383 | 0.5233 | 0.5399 | 0.5243 |
0.1931 | 4.4402 | 2350 | 0.3241 | -93.4124 | -1.2066 | 0.3236 | 0.7058 | 0.2530 | 0.5223 | 0.5399 | 0.5243 |
0.195 | 4.5347 | 2400 | 0.3111 | -94.2419 | -1.2062 | 0.3113 | 0.7035 | 0.2410 | 0.5217 | 0.5399 | 0.5243 |
0.1947 | 4.6292 | 2450 | 0.3312 | -93.6715 | -1.1956 | 0.3317 | 0.7067 | 0.2610 | 0.5228 | 0.5399 | 0.5243 |
0.1837 | 4.7237 | 2500 | 0.3289 | -93.6179 | -1.2041 | 0.3304 | 0.7077 | 0.2596 | 0.5223 | 0.5399 | 0.5243 |
0.1751 | 4.8181 | 2550 | 0.3254 | -93.4709 | -1.1993 | 0.3247 | 0.7060 | 0.2541 | 0.5212 | 0.5399 | 0.5243 |
0.1717 | 4.9126 | 2600 | 0.3287 | -94.2886 | -1.2078 | 0.3292 | 0.7050 | 0.2587 | 0.5207 | 0.5399 | 0.5243 |
0.1761 | 5.0071 | 2650 | 0.3257 | -93.6210 | -1.2055 | 0.3239 | 0.7061 | 0.2533 | 0.5217 | 0.5399 | 0.5243 |
0.1692 | 5.1016 | 2700 | 0.3396 | -93.0109 | -1.2063 | 0.3378 | 0.7072 | 0.2670 | 0.5223 | 0.5399 | 0.5243 |
0.1676 | 5.1960 | 2750 | 0.3402 | -93.9591 | -1.1978 | 0.3384 | 0.7084 | 0.2675 | 0.5202 | 0.5399 | 0.5243 |
0.1743 | 5.2905 | 2800 | 0.3371 | -93.9100 | -1.1972 | 0.3351 | 0.7076 | 0.2643 | 0.5217 | 0.5399 | 0.5243 |
0.1715 | 5.3850 | 2850 | 0.3408 | -93.6808 | -1.1939 | 0.3405 | 0.7084 | 0.2696 | 0.5212 | 0.5399 | 0.5243 |
0.1643 | 5.4795 | 2900 | 0.3434 | -93.0381 | -1.1941 | 0.3434 | 0.7095 | 0.2724 | 0.5192 | 0.5399 | 0.5243 |
0.1569 | 5.5739 | 2950 | 0.3403 | -94.4489 | -1.1993 | 0.3406 | 0.7083 | 0.2698 | 0.5192 | 0.5399 | 0.5243 |
0.16 | 5.6684 | 3000 | 0.3337 | -94.1339 | -1.1952 | 0.3332 | 0.7068 | 0.2625 | 0.5233 | 0.5399 | 0.5243 |
0.1556 | 5.7629 | 3050 | 0.3379 | -93.7011 | -1.1943 | 0.3366 | 0.7075 | 0.2658 | 0.5197 | 0.5399 | 0.5243 |
0.1544 | 5.8573 | 3100 | 0.3407 | -93.8059 | -1.1896 | 0.3385 | 0.7082 | 0.2677 | 0.5212 | 0.5399 | 0.5243 |
0.1539 | 5.9518 | 3150 | 0.3377 | -93.3647 | -1.2013 | 0.3358 | 0.7079 | 0.2650 | 0.5207 | 0.5399 | 0.5243 |
0.1448 | 6.0463 | 3200 | 0.3418 | -93.0674 | -1.1912 | 0.3402 | 0.7086 | 0.2693 | 0.5181 | 0.5399 | 0.5243 |
0.1479 | 6.1408 | 3250 | 0.3437 | -93.1651 | -1.1883 | 0.3423 | 0.7079 | 0.2715 | 0.5217 | 0.5399 | 0.5243 |
0.1408 | 6.2352 | 3300 | 0.3427 | -93.4029 | -1.1821 | 0.3405 | 0.7074 | 0.2698 | 0.5197 | 0.5399 | 0.5243 |
0.1475 | 6.3297 | 3350 | 0.3401 | -93.6032 | -1.1856 | 0.3383 | 0.7078 | 0.2675 | 0.5192 | 0.5399 | 0.5243 |
0.1339 | 6.4242 | 3400 | 0.3415 | -93.5229 | -1.1891 | 0.3402 | 0.7082 | 0.2693 | 0.5212 | 0.5399 | 0.5243 |
0.1394 | 6.5187 | 3450 | 0.3398 | -94.0518 | -1.1959 | 0.3379 | 0.7083 | 0.2671 | 0.5186 | 0.5399 | 0.5243 |
0.1324 | 6.6131 | 3500 | 0.3401 | -93.9466 | -1.1836 | 0.3389 | 0.7075 | 0.2682 | 0.5192 | 0.5399 | 0.5243 |
0.1385 | 6.7076 | 3550 | 0.3449 | -93.6245 | -1.1866 | 0.3437 | 0.7080 | 0.2729 | 0.5202 | 0.5399 | 0.5243 |
0.1289 | 6.8021 | 3600 | 0.3433 | -93.8482 | -1.1858 | 0.3412 | 0.7088 | 0.2703 | 0.5192 | 0.5399 | 0.5243 |
0.1272 | 6.8966 | 3650 | 0.3431 | -93.9371 | -1.1979 | 0.3417 | 0.7080 | 0.2709 | 0.5202 | 0.5399 | 0.5243 |
0.125 | 6.9910 | 3700 | 0.3436 | -93.9666 | -1.1952 | 0.3425 | 0.7079 | 0.2717 | 0.5202 | 0.5399 | 0.5243 |
0.1227 | 7.0855 | 3750 | 0.3404 | -93.8781 | -1.2022 | 0.3382 | 0.7086 | 0.2674 | 0.5197 | 0.5399 | 0.5243 |
0.1142 | 7.1800 | 3800 | 0.3426 | -93.8234 | -1.1874 | 0.3420 | 0.7083 | 0.2712 | 0.5207 | 0.5399 | 0.5243 |
0.1142 | 7.2744 | 3850 | 0.3454 | -93.6895 | -1.1775 | 0.3442 | 0.7090 | 0.2733 | 0.5202 | 0.5399 | 0.5243 |
0.1128 | 7.3689 | 3900 | 0.3417 | -94.0521 | -1.1838 | 0.3406 | 0.7083 | 0.2698 | 0.5197 | 0.5399 | 0.5243 |
0.1158 | 7.4634 | 3950 | 0.3434 | -93.9208 | -1.1875 | 0.3423 | 0.7086 | 0.2714 | 0.5197 | 0.5399 | 0.5243 |
0.113 | 7.5579 | 4000 | 0.3428 | -93.6866 | -1.1850 | 0.3411 | 0.7087 | 0.2702 | 0.5197 | 0.5399 | 0.5243 |
0.1113 | 7.6523 | 4050 | 0.3434 | -93.6171 | -1.1837 | 0.3425 | 0.7087 | 0.2716 | 0.5202 | 0.5399 | 0.5243 |
0.1082 | 7.7468 | 4100 | 0.3411 | -94.0013 | -1.1852 | 0.3403 | 0.7081 | 0.2695 | 0.5192 | 0.5399 | 0.5243 |
0.1051 | 7.8413 | 4150 | 0.3425 | -93.8552 | -1.1848 | 0.3417 | 0.7083 | 0.2709 | 0.5197 | 0.5399 | 0.5243 |
0.1047 | 7.9358 | 4200 | 0.3422 | -93.6696 | -1.1872 | 0.3411 | 0.7085 | 0.2702 | 0.5197 | 0.5399 | 0.5243 |
0.0985 | 8.0302 | 4250 | 0.3416 | -93.6924 | -1.1844 | 0.3403 | 0.7083 | 0.2695 | 0.5197 | 0.5399 | 0.5243 |
0.0964 | 8.1247 | 4300 | 0.3422 | -93.5025 | -1.1871 | 0.3409 | 0.7082 | 0.2701 | 0.5202 | 0.5399 | 0.5243 |
0.0997 | 8.2192 | 4350 | 0.3423 | -93.8074 | -1.1866 | 0.3408 | 0.7081 | 0.2700 | 0.5186 | 0.5399 | 0.5243 |
0.0963 | 8.3137 | 4400 | 0.3434 | -93.6885 | -1.1861 | 0.3419 | 0.7084 | 0.2711 | 0.5202 | 0.5399 | 0.5243 |
0.0966 | 8.4081 | 4450 | 0.3434 | -93.7312 | -1.1875 | 0.3419 | 0.7084 | 0.2711 | 0.5186 | 0.5399 | 0.5243 |
0.0956 | 8.5026 | 4500 | 0.3431 | -93.8431 | -1.1866 | 0.3416 | 0.7081 | 0.2708 | 0.5186 | 0.5399 | 0.5243 |
0.0928 | 8.5971 | 4550 | 0.3428 | -93.8243 | -1.1859 | 0.3414 | 0.7084 | 0.2706 | 0.5186 | 0.5399 | 0.5243 |
0.0924 | 8.6915 | 4600 | 0.3418 | -93.7706 | -1.1871 | 0.3406 | 0.7082 | 0.2698 | 0.5186 | 0.5399 | 0.5243 |
0.0908 | 8.7860 | 4650 | 0.3415 | -93.7405 | -1.1872 | 0.3403 | 0.7079 | 0.2695 | 0.5202 | 0.5399 | 0.5243 |
0.0922 | 8.8805 | 4700 | 0.3419 | -93.7126 | -1.1888 | 0.3405 | 0.7078 | 0.2698 | 0.5202 | 0.5399 | 0.5243 |
0.0895 | 8.9750 | 4750 | 0.3417 | -93.7926 | -1.1886 | 0.3402 | 0.7080 | 0.2694 | 0.5202 | 0.5399 | 0.5243 |
0.0877 | 9.0694 | 4800 | 0.3425 | -93.7523 | -1.1891 | 0.3415 | 0.7083 | 0.2706 | 0.5197 | 0.5399 | 0.5243 |
0.0862 | 9.1639 | 4850 | 0.3423 | -93.8492 | -1.1894 | 0.3406 | 0.7082 | 0.2698 | 0.5207 | 0.5399 | 0.5243 |
0.0856 | 9.2584 | 4900 | 0.3417 | -93.8453 | -1.1883 | 0.3404 | 0.7081 | 0.2696 | 0.5197 | 0.5399 | 0.5243 |
0.0883 | 9.3529 | 4950 | 0.3414 | -93.8773 | -1.1886 | 0.3401 | 0.7080 | 0.2693 | 0.5202 | 0.5399 | 0.5243 |
0.0866 | 9.4473 | 5000 | 0.3414 | -93.8593 | -1.1880 | 0.3402 | 0.7081 | 0.2694 | 0.5197 | 0.5399 | 0.5243 |
0.0843 | 9.5418 | 5050 | 0.3417 | -93.8241 | -1.1880 | 0.3405 | 0.7081 | 0.2697 | 0.5207 | 0.5399 | 0.5243 |
0.0862 | 9.6363 | 5100 | 0.3419 | -93.8268 | -1.1884 | 0.3404 | 0.7081 | 0.2696 | 0.5197 | 0.5399 | 0.5243 |
0.0851 | 9.7308 | 5150 | 0.3418 | -93.8247 | -1.1881 | 0.3405 | 0.7082 | 0.2697 | 0.5192 | 0.5399 | 0.5243 |
0.0852 | 9.8252 | 5200 | 0.3415 | -93.8257 | -1.1886 | 0.3402 | 0.7081 | 0.2694 | 0.5197 | 0.5399 | 0.5243 |
0.0873 | 9.9197 | 5250 | 0.3418 | -93.8220 | -1.1885 | 0.3404 | 0.7083 | 0.2696 | 0.5197 | 0.5399 | 0.5243 |
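
For readers who want to inspect the log programmatically, the following is a minimal sketch of parsing the pipe-delimited rows above into dictionaries, using only the standard library. The three sample rows are copied verbatim from the table; `parse_row` is a hypothetical helper, and the full table would be handled the same way.

```python
HEADER = (
    "Training Loss | Epoch | Step | Validation Loss | Logps | Logits | "
    "Objective | Dpo Loss | Regularize | Ranking Simple | "
    "Ranking Idealized | Ranking Idealized Expo |"
)

# Three rows copied from the table above (first, middle, last).
ROWS = [
    "0.0798 | 0.0945 | 50 | 0.0807 | -98.5040 | -1.3072 | 0.0808 | 0.6932 | 0.0115 | 0.5238 | 0.5399 | 0.5243 |",
    "0.1761 | 5.0071 | 2650 | 0.3257 | -93.6210 | -1.2055 | 0.3239 | 0.7061 | 0.2533 | 0.5217 | 0.5399 | 0.5243 |",
    "0.0873 | 9.9197 | 5250 | 0.3418 | -93.8220 | -1.1885 | 0.3404 | 0.7083 | 0.2696 | 0.5197 | 0.5399 | 0.5243 |",
]


def parse_row(header, row):
    """Map column names to float values for one pipe-delimited table row."""
    names = [cell.strip() for cell in header.strip().strip("|").split("|")]
    values = [float(cell) for cell in row.strip().strip("|").split("|")]
    if len(names) != len(values):
        raise ValueError("row does not match header width")
    return dict(zip(names, values))


records = [parse_row(HEADER, row) for row in ROWS]

# Among the sampled rows, find the checkpoint with the lowest validation loss.
best = min(records, key=lambda r: r["Validation Loss"])

if __name__ == "__main__":
    print("lowest validation loss in sample at step", int(best["Step"]))
```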
Framework versions
- Transformers 4.42.0
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1