git-base-bdd100k

This model is a fine-tuned version of microsoft/git-base. The training dataset is not recorded in this card. After the final epoch (epoch 100, step 300), it achieves the following results on the evaluation set:

  • Loss: 0.4161
  • Wer Score: 2.3730
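
For readers unfamiliar with the metric, the following is a minimal sketch of word error rate (WER), assuming the card's Wer Score follows the standard definition: word-level edit distance summed over all samples, divided by the total number of reference words. The function name and the example captions are hypothetical and are not taken from the evaluation set. Under this definition a score above 1, such as the 2.3730 reported here, means the number of word edits exceeds the number of reference words, which can happen when generated captions are much longer than the references.

```python
def word_error_rate(references, predictions):
    """Corpus-level WER: total word edits / total reference words.

    Hypothetical helper for illustration; raises ZeroDivisionError if the
    references contain no words at all.
    """
    total_edits = 0
    total_ref_words = 0
    for ref, hyp in zip(references, predictions):
        ref_words, hyp_words = ref.split(), hyp.split()
        # Levenshtein distance over words, keeping one row at a time.
        prev = list(range(len(hyp_words) + 1))
        for i, r in enumerate(ref_words, start=1):
            curr = [i]
            for j, h in enumerate(hyp_words, start=1):
                cost = 0 if r == h else 1
                curr.append(min(prev[j] + 1,          # delete a reference word
                                curr[j - 1] + 1,      # insert a predicted word
                                prev[j - 1] + cost))  # substitute or match
            prev = curr
        total_edits += prev[-1]
        total_ref_words += len(ref_words)
    return total_edits / total_ref_words


# Made-up example: five extra words against a two-word reference.
score = word_error_rate(["red car"], ["a red car parked on the street"])
print(score)
```

Because insertions count as errors, WER has no upper bound, so it is best read alongside the validation loss rather than on its own.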

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 25
  • eval_batch_size: 25
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 50
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
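
As a rough guide to reproducing this setup, the sketch below writes the listed values as keyword arguments using the field names of `transformers.TrainingArguments`, then derives the effective batch size and the linear learning-rate decay. Several things are assumptions rather than facts from this card: a single training device, zero warmup steps (none are listed), and the placeholder `output_dir`. The total of 300 optimizer steps is the final Step value in the training results.

```python
# Values copied from the hyperparameter list above.
# "output_dir" is a hypothetical placeholder, not taken from the card.
training_kwargs = {
    "output_dir": "git-base-bdd100k",
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 25,
    "per_device_eval_batch_size": 25,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100,
    "fp16": True,  # "Native AMP" mixed precision
}
# With transformers installed, these could be passed as
# transformers.TrainingArguments(**training_kwargs).

NUM_DEVICES = 1  # assumption: the card does not state the device count
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_DEVICES
)

TOTAL_STEPS = 300  # final Step value in the training results
WARMUP_STEPS = 0   # assumption: no warmup is listed


def linear_lr(step, base_lr=5e-05, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size, linear_lr(0), linear_lr(150), linear_lr(300))
```

Under these assumptions the effective batch size matches the listed total_train_batch_size of 50, and the learning rate falls linearly from 5e-05 to zero over the run.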

Training results

Training Loss | Epoch | Step | Validation Loss | Wer Score
10.5754 | 1.0 | 3 | 9.3460 | 6.9161
9.2089 | 2.0 | 6 | 8.8317 | 7.6376
8.694 | 3.0 | 9 | 8.2760 | 6.6661
8.1805 | 4.0 | 12 | 7.8390 | 6.1320
7.7782 | 5.0 | 15 | 7.4904 | 8.3372
7.4422 | 6.0 | 18 | 7.1838 | 8.7640
7.1465 | 7.0 | 21 | 6.9029 | 8.6650
6.8738 | 8.0 | 24 | 6.6334 | 8.4883
6.6096 | 9.0 | 27 | 6.3706 | 8.4497
6.3498 | 10.0 | 30 | 6.1129 | 6.2287
6.0934 | 11.0 | 33 | 5.8568 | 3.5906
5.8399 | 12.0 | 36 | 5.6030 | 2.8328
5.5876 | 13.0 | 39 | 5.3507 | 2.8227
5.3365 | 14.0 | 42 | 5.1022 | 2.7740
5.0873 | 15.0 | 45 | 4.8551 | 2.7008
4.8379 | 16.0 | 48 | 4.6113 | 2.5162
4.592 | 17.0 | 51 | 4.3680 | 2.5235
4.3471 | 18.0 | 54 | 4.1294 | 2.6834
4.1038 | 19.0 | 57 | 3.8936 | 2.6583
3.8678 | 20.0 | 60 | 3.6622 | 2.6454
3.6309 | 21.0 | 63 | 3.4319 | 2.5201
3.3979 | 22.0 | 66 | 3.2060 | 2.6728
3.1669 | 23.0 | 69 | 2.9855 | 2.6611
2.9417 | 24.0 | 72 | 2.7673 | 2.6023
2.7208 | 25.0 | 75 | 2.5581 | 2.5034
2.5046 | 26.0 | 78 | 2.3538 | 2.6320
2.2961 | 27.0 | 81 | 2.1537 | 2.5414
2.0931 | 28.0 | 84 | 1.9672 | 2.5682
1.8979 | 29.0 | 87 | 1.7863 | 2.5207
1.7123 | 30.0 | 90 | 1.6167 | 2.5872
1.5355 | 31.0 | 93 | 1.4592 | 2.5900
1.371 | 32.0 | 96 | 1.3143 | 2.5649
1.2175 | 33.0 | 99 | 1.1803 | 2.6079
1.0761 | 34.0 | 102 | 1.0583 | 2.6555
0.9483 | 35.0 | 105 | 0.9517 | 2.6264
0.8327 | 36.0 | 108 | 0.8592 | 2.6751
0.7304 | 37.0 | 111 | 0.7763 | 2.6465
0.6411 | 38.0 | 114 | 0.7093 | 2.8428
0.5596 | 39.0 | 117 | 0.6482 | 2.6230
0.4931 | 40.0 | 120 | 0.5953 | 2.7919
0.4341 | 41.0 | 123 | 0.5551 | 2.9049
0.3849 | 42.0 | 126 | 0.5176 | 2.4620
0.3397 | 43.0 | 129 | 0.4873 | 2.9083
0.3027 | 44.0 | 132 | 0.4661 | 2.7567
0.267 | 45.0 | 135 | 0.4511 | 2.7058
0.2371 | 46.0 | 138 | 0.4321 | 2.8031
0.2105 | 47.0 | 141 | 0.4201 | 2.4413
0.1883 | 48.0 | 144 | 0.4155 | 2.8026
0.1693 | 49.0 | 147 | 0.4023 | 2.5039
0.1543 | 50.0 | 150 | 0.3999 | 2.6532
0.1372 | 51.0 | 153 | 0.3925 | 2.4983
0.1249 | 52.0 | 156 | 0.3915 | 2.4866
0.1161 | 53.0 | 159 | 0.3911 | 2.3602
0.1054 | 54.0 | 162 | 0.3923 | 2.3054
0.0937 | 55.0 | 165 | 0.3859 | 2.3758
0.0849 | 56.0 | 168 | 0.3896 | 2.3126
0.0772 | 57.0 | 171 | 0.3902 | 2.3708
0.0703 | 58.0 | 174 | 0.3858 | 2.2416
0.0636 | 59.0 | 177 | 0.3896 | 2.1974
0.0576 | 60.0 | 180 | 0.3856 | 2.2411
0.053 | 61.0 | 183 | 0.3913 | 2.3647
0.0485 | 62.0 | 186 | 0.3932 | 2.2634
0.0447 | 63.0 | 189 | 0.3928 | 2.3730
0.0398 | 64.0 | 192 | 0.3920 | 2.4418
0.0368 | 65.0 | 195 | 0.3940 | 2.3742
0.0337 | 66.0 | 198 | 0.3922 | 2.2942
0.032 | 67.0 | 201 | 0.3969 | 2.2634
0.0293 | 68.0 | 204 | 0.3965 | 2.3026
0.0276 | 69.0 | 207 | 0.3995 | 2.3859
0.0258 | 70.0 | 210 | 0.4017 | 2.2668
0.0245 | 71.0 | 213 | 0.4058 | 2.2332
0.024 | 72.0 | 216 | 0.4019 | 2.4049
0.0225 | 73.0 | 219 | 0.4058 | 2.4055
0.0215 | 74.0 | 222 | 0.4048 | 2.3322
0.0201 | 75.0 | 225 | 0.4058 | 2.3070
0.0197 | 76.0 | 228 | 0.4071 | 2.3702
0.0186 | 77.0 | 231 | 0.4061 | 2.3753
0.0181 | 78.0 | 234 | 0.4076 | 2.3893
0.0175 | 79.0 | 237 | 0.4091 | 2.3853
0.017 | 80.0 | 240 | 0.4103 | 2.3798
0.0167 | 81.0 | 243 | 0.4128 | 2.3647
0.016 | 82.0 | 246 | 0.4129 | 2.3496
0.0159 | 83.0 | 249 | 0.4116 | 2.3255
0.0152 | 84.0 | 252 | 0.4127 | 2.3490
0.0152 | 85.0 | 255 | 0.4129 | 2.3820
0.0148 | 86.0 | 258 | 0.4134 | 2.3725
0.0143 | 87.0 | 261 | 0.4140 | 2.3389
0.0141 | 88.0 | 264 | 0.4149 | 2.3255
0.0139 | 89.0 | 267 | 0.4151 | 2.3786
0.0137 | 90.0 | 270 | 0.4150 | 2.4116
0.0135 | 91.0 | 273 | 0.4153 | 2.4027
0.0132 | 92.0 | 276 | 0.4158 | 2.4016
0.0134 | 93.0 | 279 | 0.4158 | 2.3803
0.0131 | 94.0 | 282 | 0.4156 | 2.3758
0.0131 | 95.0 | 285 | 0.4155 | 2.3647
0.013 | 96.0 | 288 | 0.4154 | 2.3669
0.0128 | 97.0 | 291 | 0.4157 | 2.3602
0.0127 | 98.0 | 294 | 0.4160 | 2.3669
0.0127 | 99.0 | 297 | 0.4161 | 2.3730
0.0127 | 100.0 | 300 | 0.4161 | 2.3730
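
In the results above, validation loss reaches its minimum of 0.3856 at epoch 60 and then drifts up to 0.4161 by epoch 100 while training loss keeps falling, a pattern consistent with overfitting in the later epochs. The sketch below shows one way to pick an epoch from such a log, using a handful of rows copied from the table. The `EvalRow` type and variable names are hypothetical, and the card does not say whether intermediate checkpoints were saved or which one was published.

```python
from collections import namedtuple

# Hypothetical record type; the rows are copied from the table above.
EvalRow = namedtuple("EvalRow", ["epoch", "train_loss", "val_loss", "wer"])

rows = [
    EvalRow(50, 0.1543, 0.3999, 2.6532),
    EvalRow(58, 0.0703, 0.3858, 2.2416),
    EvalRow(59, 0.0636, 0.3896, 2.1974),
    EvalRow(60, 0.0576, 0.3856, 2.2411),
    EvalRow(70, 0.0258, 0.4017, 2.2668),
    EvalRow(100, 0.0127, 0.4161, 2.3730),
]

# Select the epoch with the lowest validation loss and the lowest WER.
best_by_loss = min(rows, key=lambda row: row.val_loss)
best_by_wer = min(rows, key=lambda row: row.wer)

# How far the final epoch has drifted from the best validation loss.
final = rows[-1]
loss_gap = round(final.val_loss - best_by_loss.val_loss, 4)

print(best_by_loss.epoch, best_by_wer.epoch, loss_gap)
```

The two criteria disagree by one epoch here (60 by validation loss, 59 by WER), so the choice depends on which metric matters more for the intended use.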

Framework versions

  • Transformers 4.45.2
  • PyTorch 2.1.0+cu118
  • Datasets 3.0.1
  • Tokenizers 0.20.1
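
To compare a local environment against the versions listed above, a minimal sketch using only the standard library might look like the following. The pip distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) and the helper name are assumptions, and an exact match is not necessarily required to load the model.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; keys are assumed pip distribution names.
EXPECTED = {
    "transformers": "4.45.2",
    "torch": "2.1.0+cu118",
    "datasets": "3.0.1",
    "tokenizers": "0.20.1",
}


def check_versions(expected=EXPECTED):
    """Return {package: (wanted, installed_or_None, exact_match)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in check_versions().items():
    status = "ok" if matches else "differs"
    print(f"{package}: wanted {wanted}, installed {installed} ({status})")
```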

Model size: 177M params (Safetensors, tensor type F32)

Model tree for Trkkk/git-base-bdd100k: this model is fine-tuned from the base model microsoft/git-base.