tags:
  - generated_from_trainer
model-index:
  - name: deberta-v3-large-ddlm
    results: []

deberta-v3-large-ddlm

This model is a fine-tuned version of microsoft/deberta-v3-large. The training dataset is not specified. At the final logged evaluation (step 256000, epoch 2.83) it achieves the following result on the evaluation set:

  • Loss: 0.5241
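The card does not say which objective produced this loss. As a rough reading aid only, and assuming (this is not confirmed by the card) that the value is a mean token-level cross-entropy in nats, it can be converted to a perplexity-style number as sketched below.

```python
import math

# Evaluation loss reported in the card.
eval_loss = 0.5241

# Assumption: the loss is a mean cross-entropy measured in nats.
# Under that assumption, perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity (if loss is cross-entropy in nats): {perplexity:.3f}")
```

If the loss comes from a different objective, this conversion does not apply.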

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 64
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
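As a sanity check on the list above, the following minimal sketch shows how the total train batch size follows from the per-step batch size and gradient accumulation, and what a linear schedule does to the learning rate. The device count, the warmup length, and the `total_steps` value are assumptions for illustration; the card does not state them.

```python
# Values taken from the hyperparameter list.
train_batch_size = 2
gradient_accumulation_steps = 64
learning_rate = 5e-05

# Assumption: a single device; the card does not say how many were used.
num_devices = 1

# Examples consumed per optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr=learning_rate, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero.

    warmup_steps=0 is an assumption; no warmup setting is listed in the card.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical schedule length, chosen only to make the example concrete.
total_steps = 270_000
for step in (0, 135_000, 270_000):
    print(step, linear_lr(step, total_steps))
```

With one device, 2 x 64 gives the listed total of 128. The decay function mirrors the usual linear schedule shape: full learning rate at the start, half at the midpoint, zero at the end.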

Training results

Training Loss Epoch Step Validation Loss
0.9823 0.01 1000 0.9163
0.8817 0.02 2000 0.9022
0.9647 0.03 3000 0.8879
0.8646 0.04 4000 0.8577
0.9159 0.06 5000 0.8677
0.8449 0.07 6000 0.8221
0.8681 0.08 7000 0.8332
0.8738 0.09 8000 0.8334
0.8638 0.1 9000 0.8236
0.9066 0.11 10000 0.8200
0.8686 0.12 11000 0.8092
0.7736 0.13 12000 0.8199
0.8054 0.14 13000 0.7972
0.8934 0.16 14000 0.7998
0.7884 0.17 15000 0.7895
0.8278 0.18 16000 0.7586
0.8482 0.19 17000 0.7562
0.8716 0.2 18000 0.7819
0.8881 0.21 19000 0.7878
0.8397 0.22 20000 0.7989
0.811 0.23 21000 0.7846
0.7762 0.24 22000 0.7753
0.7778 0.25 23000 0.7878
0.737 0.27 24000 0.7473
0.8451 0.28 25000 0.7460
0.823 0.29 26000 0.7300
0.7472 0.3 27000 0.7292
0.8048 0.31 28000 0.7697
0.7962 0.32 29000 0.7359
0.8048 0.33 30000 0.7409
0.8095 0.34 31000 0.7434
0.7451 0.35 32000 0.7534
0.6997 0.37 33000 0.7602
0.8116 0.38 34000 0.7566
0.7963 0.39 35000 0.7245
0.786 0.4 36000 0.7311
0.7991 0.41 37000 0.7230
0.723 0.42 38000 0.7209
0.789 0.43 39000 0.7418
0.7296 0.44 40000 0.7325
0.7363 0.45 41000 0.7134
0.758 0.47 42000 0.6948
0.711 0.48 43000 0.6992
0.7984 0.49 44000 0.7055
0.8402 0.5 45000 0.7108
0.8553 0.51 46000 0.7005
0.7538 0.52 47000 0.7208
0.7169 0.53 48000 0.7291
0.7345 0.54 49000 0.7195
0.758 0.55 50000 0.6694
0.7868 0.56 51000 0.6938
0.6966 0.58 52000 0.6867
0.7389 0.59 53000 0.6862
0.7529 0.6 54000 0.7175
0.7345 0.61 55000 0.6970
0.766 0.62 56000 0.7017
0.7043 0.63 57000 0.6916
0.6474 0.64 58000 0.7129
0.7456 0.65 59000 0.6802
0.7512 0.66 60000 0.6951
0.6816 0.68 61000 0.7072
0.7206 0.69 62000 0.6967
0.6439 0.7 63000 0.6798
0.7309 0.71 64000 0.7163
0.6925 0.72 65000 0.6794
0.6833 0.73 66000 0.6637
0.6643 0.74 67000 0.6855
0.6433 0.75 68000 0.7035
0.7595 0.76 69000 0.7008
0.7214 0.78 70000 0.6618
0.7111 0.79 71000 0.6850
0.7375 0.8 72000 0.6909
0.6779 0.81 73000 0.7042
0.6646 0.82 74000 0.6634
0.6616 0.83 75000 0.7020
0.6762 0.84 76000 0.6638
0.7509 0.85 77000 0.6541
0.6963 0.86 78000 0.6781
0.6949 0.87 79000 0.6576
0.6781 0.89 80000 0.6900
0.65 0.9 81000 0.6835
0.7205 0.91 82000 0.6712
0.6901 0.92 83000 0.6699
0.6972 0.93 84000 0.6456
0.7041 0.94 85000 0.6497
0.6864 0.95 86000 0.6432
0.7308 0.96 87000 0.6497
0.6886 0.97 88000 0.6674
0.6947 0.99 89000 0.6638
0.6567 1.0 90000 0.6242
0.7185 1.01 91000 0.6704
0.7435 1.02 92000 0.6681
0.7108 1.03 93000 0.6619
0.6942 1.04 94000 0.6306
0.6998 1.05 95000 0.6409
0.6481 1.06 96000 0.6476
0.727 1.07 97000 0.6354
0.647 1.09 98000 0.6222
0.6622 1.1 99000 0.6119
0.6346 1.11 100000 0.6471
0.6203 1.12 101000 0.6655
0.6765 1.13 102000 0.6473
0.6703 1.14 103000 0.6308
0.6793 1.15 104000 0.6531
0.683 1.16 105000 0.6693
0.6654 1.17 106000 0.6241
0.6626 1.18 107000 0.6215
0.6976 1.2 108000 0.6479
0.7494 1.21 109000 0.6345
0.691 1.22 110000 0.6322
0.6568 1.23 111000 0.6265
0.705 1.24 112000 0.6281
0.6307 1.25 113000 0.6202
0.6828 1.26 114000 0.6158
0.6403 1.27 115000 0.6495
0.6615 1.28 116000 0.6298
0.6237 1.3 117000 0.6234
0.6707 1.31 118000 0.6267
0.6823 1.32 119000 0.6299
0.6333 1.33 120000 0.6169
0.685 1.34 121000 0.6371
0.6941 1.35 122000 0.6245
0.6358 1.36 123000 0.6291
0.6754 1.37 124000 0.6400
0.6286 1.38 125000 0.6148
0.7036 1.4 126000 0.6033
0.645 1.41 127000 0.6295
0.6578 1.42 128000 0.6348
0.651 1.43 129000 0.6222
0.5558 1.44 130000 0.6231
0.6601 1.45 131000 0.6302
0.6304 1.46 132000 0.6127
0.6177 1.47 133000 0.6047
0.5933 1.48 134000 0.6169
0.6307 1.49 135000 0.6012
0.6018 1.51 136000 0.5900
0.6724 1.52 137000 0.6086
0.6367 1.53 138000 0.6414
0.6515 1.54 139000 0.6267
0.5902 1.55 140000 0.5913
0.6523 1.56 141000 0.5992
0.6005 1.57 142000 0.6128
0.6179 1.58 143000 0.6089
0.6154 1.59 144000 0.6353
0.6298 1.61 145000 0.5997
0.5623 1.62 146000 0.5974
0.5787 1.63 147000 0.6165
0.6099 1.64 148000 0.6246
0.658 1.65 149000 0.6116
0.6567 1.66 150000 0.5938
0.6227 1.67 151000 0.5948
0.5858 1.68 152000 0.5822
0.6227 1.69 153000 0.5802
0.6699 1.71 154000 0.6067
0.5989 1.72 155000 0.6073
0.6184 1.73 156000 0.6124
0.6404 1.74 157000 0.6169
0.639 1.75 158000 0.5997
0.6433 1.76 159000 0.5989
0.5574 1.77 160000 0.5796
0.5983 1.78 161000 0.6036
0.6532 1.79 162000 0.5888
0.6679 1.8 163000 0.6038
0.62 1.82 164000 0.5984
0.5541 1.83 165000 0.6003
0.6192 1.84 166000 0.5786
0.6613 1.85 167000 0.6064
0.5923 1.86 168000 0.6018
0.5894 1.87 169000 0.5912
0.6462 1.88 170000 0.5902
0.5811 1.89 171000 0.6030
0.6358 1.9 172000 0.5915
0.614 1.92 173000 0.5886
0.5969 1.93 174000 0.6084
0.6146 1.94 175000 0.6003
0.6051 1.95 176000 0.5835
0.6268 1.96 177000 0.5999
0.6436 1.97 178000 0.5965
0.6167 1.98 179000 0.5789
0.5647 1.99 180000 0.5669
0.6038 2.0 181000 0.6009
0.6082 2.02 182000 0.5799
0.6483 2.03 183000 0.5716
0.5503 2.04 184000 0.5806
0.6231 2.05 185000 0.5699
0.5892 2.06 186000 0.5979
0.5933 2.07 187000 0.5709
0.594 2.08 188000 0.5719
0.5838 2.09 189000 0.5879
0.6039 2.1 190000 0.5984
0.5911 2.11 191000 0.5953
0.563 2.13 192000 0.5772
0.5671 2.14 193000 0.5771
0.6051 2.15 194000 0.5972
0.5852 2.16 195000 0.5917
0.5757 2.17 196000 0.5819
0.6557 2.18 197000 0.5655
0.6055 2.19 198000 0.5820
0.6067 2.2 199000 0.5801
0.6422 2.21 200000 0.5590
0.624 2.23 201000 0.5573
0.6222 2.24 202000 0.5661
0.5597 2.25 203000 0.5786
0.5746 2.26 204000 0.5622
0.6269 2.27 205000 0.5804
0.6241 2.28 206000 0.5696
0.6519 2.29 207000 0.5367
0.6161 2.3 208000 0.5666
0.5415 2.31 209000 0.5633
0.633 2.33 210000 0.5499
0.5566 2.34 211000 0.5822
0.6158 2.35 212000 0.5826
0.5574 2.36 213000 0.5429
0.5748 2.37 214000 0.5736
0.5818 2.38 215000 0.5599
0.6226 2.39 216000 0.5407
0.5733 2.4 217000 0.5759
0.6268 2.41 218000 0.5725
0.5885 2.42 219000 0.5771
0.5708 2.44 220000 0.5654
0.5783 2.45 221000 0.5756
0.61 2.46 222000 0.5647
0.5848 2.47 223000 0.5532
0.5869 2.48 224000 0.5519
0.5717 2.49 225000 0.5621
0.5675 2.5 226000 0.5446
0.6321 2.51 227000 0.5812
0.568 2.52 228000 0.5673
0.5577 2.54 229000 0.5590
0.5888 2.55 230000 0.5628
0.6389 2.56 231000 0.5828
0.5782 2.57 232000 0.5543
0.5871 2.58 233000 0.5575
0.5593 2.59 234000 0.5625
0.6167 2.6 235000 0.5450
0.5828 2.61 236000 0.5627
0.5411 2.62 237000 0.5498
0.6168 2.64 238000 0.5891
0.6508 2.65 239000 0.5811
0.6322 2.66 240000 0.5649
0.6131 2.67 241000 0.5473
0.5419 2.68 242000 0.5583
0.5685 2.69 243000 0.5635
0.5267 2.7 244000 0.5481
0.5357 2.71 245000 0.5474
0.585 2.72 246000 0.5281
0.5894 2.73 247000 0.5457
0.5665 2.75 248000 0.5579
0.5409 2.76 249000 0.5412
0.6051 2.77 250000 0.5447
0.5866 2.78 251000 0.5535
0.5348 2.79 252000 0.5377
0.5606 2.8 253000 0.5524
0.5142 2.81 254000 0.5441
0.543 2.82 255000 0.5499
0.5763 2.83 256000 0.5241
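The rows above are whitespace-separated in the order training loss, epoch, step, validation loss. A minimal sketch for parsing them and picking the row with the lowest validation loss is shown here; it uses only a handful of rows copied from the table, so the result describes that sample and not necessarily the full log. The function and variable names are illustrative.

```python
# A few rows copied verbatim from the table, not the full log.
SAMPLE_ROWS = """
0.9823 0.01 1000 0.9163
0.6567 1.0 90000 0.6242
0.6038 2.0 181000 0.6009
0.585 2.72 246000 0.5281
0.5763 2.83 256000 0.5241
"""


def parse_rows(text):
    """Parse 'train_loss epoch step val_loss' lines into dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split()
        rows.append(
            {
                "training_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "validation_loss": float(val_loss),
            }
        )
    return rows


rows = parse_rows(SAMPLE_ROWS)

# Row with the lowest validation loss among the sampled rows.
best = min(rows, key=lambda row: row["validation_loss"])
print(best["step"], best["validation_loss"])
```

Pasting the complete table into `SAMPLE_ROWS` would extend the same check to every logged evaluation.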

Framework versions

  • Transformers 4.20.0.dev0
  • Pytorch 1.10.0+cu102
  • Datasets 1.15.1
  • Tokenizers 0.11.0
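To compare a local environment against the versions listed above, a small check along these lines may help. It assumes the usual PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) and only reports differences; it does not install anything.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.20.0.dev0",
    "torch": "1.10.0+cu102",
    "datasets": "1.15.1",
    "tokenizers": "0.11.0",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (wanted, installed_or_None, matches)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed locally
        report[package] = (wanted, installed, installed == wanted)
    return report


report = compare_versions()
for package, (wanted, installed, matches) in report.items():
    status = "ok" if matches else "differs"
    print(f"{package}: card={wanted} local={installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the local setup differs from the one used for training.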