segformer-finetuned-tt-225-2k

This model is a fine-tuned version of nvidia/mit-b0 on the Saumya-Mundra/text255 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1299
  • Mean IoU: 0.4851
  • Mean Accuracy: 0.9702
  • Overall Accuracy: 0.9702
  • Accuracy Text: nan
  • Accuracy No Text: 0.9702
  • IoU Text: 0.0
  • IoU No Text: 0.9702
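
The aggregate numbers above follow from the per-class values if the means skip nan entries, which is an assumption about how the evaluation metric was computed. The sketch below uses hypothetical pixel counts (10,000 pixels, all labelled No Text, 298 of them predicted as Text) and an assumed class order (0 = Text, 1 = No Text) to reproduce the same pattern. A nan accuracy together with a 0.0 IoU for Text is consistent with ground-truth masks that contain no pixels counted as Text, but the card does not confirm this.

```python
import math


def safe_div(num, den):
    """Return num / den, or nan when the denominator is zero."""
    return num / den if den else float("nan")


def nan_mean(values):
    """Mean of the values that are not nan (nan if none remain)."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else float("nan")


def segmentation_metrics(intersect, pred_area, label_area):
    """Compute IoU and accuracy from per-class pixel counts.

    Each argument is a list indexed by class id:
    intersect  = pixels where prediction and label agree on the class
    pred_area  = pixels predicted as the class
    label_area = pixels labelled as the class
    """
    union = [p + l - i for i, p, l in zip(intersect, pred_area, label_area)]
    iou = [safe_div(i, u) for i, u in zip(intersect, union)]
    acc = [safe_div(i, l) for i, l in zip(intersect, label_area)]
    return {
        "mean_iou": nan_mean(iou),
        "mean_accuracy": nan_mean(acc),
        "overall_accuracy": safe_div(sum(intersect), sum(label_area)),
        "per_category_iou": iou,
        "per_category_accuracy": acc,
    }


# Hypothetical counts, class 0 = Text, class 1 = No Text (assumed order).
metrics = segmentation_metrics(
    intersect=[0, 9702],
    pred_area=[298, 9702],
    label_area=[0, 10000],
)

for name, value in metrics.items():
    print(name, value)
```

With these counts the Text accuracy is 0/0 (nan), the Text IoU is 0/298 (0.0), and the mean IoU is the average of 0.0 and 0.9702, which is how a mean IoU near 0.485 can coexist with an overall accuracy near 0.97.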

Model description

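SegFormer semantic segmentation model (nvidia/mit-b0 backbone) fine-tuned to label pixels as Text or No Text. More information needed.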

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 1337
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: polynomial
  • training_steps: 2000
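
To make the polynomial schedule concrete, here is a minimal sketch of how the learning rate would decay from 6e-05 over the 2000 training steps. The card does not list a warmup length, end learning rate, or power, so `warmup_steps=0`, `lr_end=1e-07`, and `power=1.0` are assumptions (they match the defaults of the polynomial decay schedule in Transformers as far as I know), and `polynomial_lr` is a hypothetical helper, not part of any library.

```python
def polynomial_lr(step, base_lr=6e-05, total_steps=2000,
                  lr_end=1e-07, power=1.0, warmup_steps=0):
    """Learning rate at a given optimizer step under polynomial decay.

    lr_end, power and warmup_steps are assumed values; the model card
    only states the base learning rate, scheduler type and step count.
    """
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    if step >= total_steps:
        return lr_end
    # Fraction of the decay phase that is still ahead.
    remaining = 1 - (step - warmup_steps) / (total_steps - warmup_steps)
    return (base_lr - lr_end) * remaining ** power + lr_end


for step in (0, 500, 1000, 1500, 2000):
    print(step, polynomial_lr(step))
```

With `power=1.0` the decay is linear, so under these assumptions the rate at step 1000 is roughly half the starting value.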

Training results

Training Loss | Epoch | Step | Validation Loss | Mean IoU | Mean Accuracy | Overall Accuracy | Accuracy Text | Accuracy No Text | IoU Text | IoU No Text
No log | 1.0 | 29 | 0.2604 | 0.4706 | 0.9411 | 0.9411 | nan | 0.9411 | 0.0 | 0.9411
No log | 2.0 | 58 | 0.2094 | 0.4778 | 0.9557 | 0.9557 | nan | 0.9557 | 0.0 | 0.9557
No log | 3.0 | 87 | 0.1818 | 0.4823 | 0.9647 | 0.9647 | nan | 0.9647 | 0.0 | 0.9647
0.323 | 4.0 | 116 | 0.2098 | 0.4696 | 0.9392 | 0.9392 | nan | 0.9392 | 0.0 | 0.9392
0.323 | 5.0 | 145 | 0.1717 | 0.4849 | 0.9699 | 0.9699 | nan | 0.9699 | 0.0 | 0.9699
0.323 | 6.0 | 174 | 0.1484 | 0.4840 | 0.9681 | 0.9681 | nan | 0.9681 | 0.0 | 0.9681
0.2027 | 7.0 | 203 | 0.1616 | 0.4815 | 0.9631 | 0.9631 | nan | 0.9631 | 0.0 | 0.9631
0.2027 | 8.0 | 232 | 0.1503 | 0.4864 | 0.9728 | 0.9728 | nan | 0.9728 | 0.0 | 0.9728
0.2027 | 9.0 | 261 | 0.1628 | 0.4783 | 0.9566 | 0.9566 | nan | 0.9566 | 0.0 | 0.9566
0.2027 | 10.0 | 290 | 0.1424 | 0.4835 | 0.9670 | 0.9670 | nan | 0.9670 | 0.0 | 0.9670
0.1693 | 11.0 | 319 | 0.1236 | 0.4903 | 0.9806 | 0.9806 | nan | 0.9806 | 0.0 | 0.9806
0.1693 | 12.0 | 348 | 0.1388 | 0.4816 | 0.9632 | 0.9632 | nan | 0.9632 | 0.0 | 0.9632
0.1693 | 13.0 | 377 | 0.1554 | 0.4788 | 0.9577 | 0.9577 | nan | 0.9577 | 0.0 | 0.9577
0.1502 | 14.0 | 406 | 0.1537 | 0.4761 | 0.9521 | 0.9521 | nan | 0.9521 | 0.0 | 0.9521
0.1502 | 15.0 | 435 | 0.1478 | 0.4781 | 0.9562 | 0.9562 | nan | 0.9562 | 0.0 | 0.9562
0.1502 | 16.0 | 464 | 0.1367 | 0.4825 | 0.9651 | 0.9651 | nan | 0.9651 | 0.0 | 0.9651
0.1502 | 17.0 | 493 | 0.1330 | 0.4851 | 0.9703 | 0.9703 | nan | 0.9703 | 0.0 | 0.9703
0.127 | 18.0 | 522 | 0.1322 | 0.4848 | 0.9696 | 0.9696 | nan | 0.9696 | 0.0 | 0.9696
0.127 | 19.0 | 551 | 0.1341 | 0.4842 | 0.9684 | 0.9684 | nan | 0.9684 | 0.0 | 0.9684
0.127 | 20.0 | 580 | 0.1244 | 0.4882 | 0.9764 | 0.9764 | nan | 0.9764 | 0.0 | 0.9764
0.1173 | 21.0 | 609 | 0.1200 | 0.4896 | 0.9793 | 0.9793 | nan | 0.9793 | 0.0 | 0.9793
0.1173 | 22.0 | 638 | 0.1374 | 0.4826 | 0.9653 | 0.9653 | nan | 0.9653 | 0.0 | 0.9653
0.1173 | 23.0 | 667 | 0.1248 | 0.4884 | 0.9768 | 0.9768 | nan | 0.9768 | 0.0 | 0.9768
0.1173 | 24.0 | 696 | 0.1262 | 0.4857 | 0.9714 | 0.9714 | nan | 0.9714 | 0.0 | 0.9714
0.1193 | 25.0 | 725 | 0.1235 | 0.4860 | 0.9720 | 0.9720 | nan | 0.9720 | 0.0 | 0.9720
0.1193 | 26.0 | 754 | 0.1323 | 0.4838 | 0.9677 | 0.9677 | nan | 0.9677 | 0.0 | 0.9677
0.1193 | 27.0 | 783 | 0.1235 | 0.4853 | 0.9707 | 0.9707 | nan | 0.9707 | 0.0 | 0.9707
0.0912 | 28.0 | 812 | 0.1349 | 0.4816 | 0.9632 | 0.9632 | nan | 0.9632 | 0.0 | 0.9632
0.0912 | 29.0 | 841 | 0.1408 | 0.4816 | 0.9632 | 0.9632 | nan | 0.9632 | 0.0 | 0.9632
0.0912 | 30.0 | 870 | 0.1206 | 0.4877 | 0.9754 | 0.9754 | nan | 0.9754 | 0.0 | 0.9754
0.0912 | 31.0 | 899 | 0.1347 | 0.4823 | 0.9646 | 0.9646 | nan | 0.9646 | 0.0 | 0.9646
0.1005 | 32.0 | 928 | 0.1188 | 0.4879 | 0.9758 | 0.9758 | nan | 0.9758 | 0.0 | 0.9758
0.1005 | 33.0 | 957 | 0.1234 | 0.4891 | 0.9781 | 0.9781 | nan | 0.9781 | 0.0 | 0.9781
0.1005 | 34.0 | 986 | 0.1206 | 0.4924 | 0.9849 | 0.9849 | nan | 0.9849 | 0.0 | 0.9849
0.0933 | 35.0 | 1015 | 0.1234 | 0.4895 | 0.9790 | 0.9790 | nan | 0.9790 | 0.0 | 0.9790
0.0933 | 36.0 | 1044 | 0.1192 | 0.4877 | 0.9753 | 0.9753 | nan | 0.9753 | 0.0 | 0.9753
0.0933 | 37.0 | 1073 | 0.1388 | 0.4807 | 0.9615 | 0.9615 | nan | 0.9615 | 0.0 | 0.9615
0.0937 | 38.0 | 1102 | 0.1255 | 0.4860 | 0.9719 | 0.9719 | nan | 0.9719 | 0.0 | 0.9719
0.0937 | 39.0 | 1131 | 0.1199 | 0.4877 | 0.9755 | 0.9755 | nan | 0.9755 | 0.0 | 0.9755
0.0937 | 40.0 | 1160 | 0.1194 | 0.4899 | 0.9799 | 0.9799 | nan | 0.9799 | 0.0 | 0.9799
0.0937 | 41.0 | 1189 | 0.1274 | 0.4844 | 0.9687 | 0.9687 | nan | 0.9687 | 0.0 | 0.9687
0.0923 | 42.0 | 1218 | 0.1302 | 0.4852 | 0.9703 | 0.9703 | nan | 0.9703 | 0.0 | 0.9703
0.0923 | 43.0 | 1247 | 0.1297 | 0.4854 | 0.9709 | 0.9709 | nan | 0.9709 | 0.0 | 0.9709
0.0923 | 44.0 | 1276 | 0.1299 | 0.4850 | 0.9701 | 0.9701 | nan | 0.9701 | 0.0 | 0.9701
0.0812 | 45.0 | 1305 | 0.1259 | 0.4867 | 0.9733 | 0.9733 | nan | 0.9733 | 0.0 | 0.9733
0.0812 | 46.0 | 1334 | 0.1252 | 0.4883 | 0.9766 | 0.9766 | nan | 0.9766 | 0.0 | 0.9766
0.0812 | 47.0 | 1363 | 0.1223 | 0.4881 | 0.9762 | 0.9762 | nan | 0.9762 | 0.0 | 0.9762
0.0812 | 48.0 | 1392 | 0.1227 | 0.4879 | 0.9757 | 0.9757 | nan | 0.9757 | 0.0 | 0.9757
0.0901 | 49.0 | 1421 | 0.1224 | 0.4880 | 0.9759 | 0.9759 | nan | 0.9759 | 0.0 | 0.9759
0.0901 | 50.0 | 1450 | 0.1350 | 0.4818 | 0.9636 | 0.9636 | nan | 0.9636 | 0.0 | 0.9636
0.0901 | 51.0 | 1479 | 0.1285 | 0.4859 | 0.9719 | 0.9719 | nan | 0.9719 | 0.0 | 0.9719
0.083 | 52.0 | 1508 | 0.1286 | 0.4848 | 0.9695 | 0.9695 | nan | 0.9695 | 0.0 | 0.9695
0.083 | 53.0 | 1537 | 0.1285 | 0.4850 | 0.9700 | 0.9700 | nan | 0.9700 | 0.0 | 0.9700
0.083 | 54.0 | 1566 | 0.1252 | 0.4888 | 0.9775 | 0.9775 | nan | 0.9775 | 0.0 | 0.9775
0.083 | 55.0 | 1595 | 0.1259 | 0.4867 | 0.9734 | 0.9734 | nan | 0.9734 | 0.0 | 0.9734
0.0822 | 56.0 | 1624 | 0.1373 | 0.4821 | 0.9642 | 0.9642 | nan | 0.9642 | 0.0 | 0.9642
0.0822 | 57.0 | 1653 | 0.1248 | 0.4860 | 0.9719 | 0.9719 | nan | 0.9719 | 0.0 | 0.9719
0.0822 | 58.0 | 1682 | 0.1228 | 0.4881 | 0.9761 | 0.9761 | nan | 0.9761 | 0.0 | 0.9761
0.0777 | 59.0 | 1711 | 0.1315 | 0.4838 | 0.9677 | 0.9677 | nan | 0.9677 | 0.0 | 0.9677
0.0777 | 60.0 | 1740 | 0.1254 | 0.4883 | 0.9766 | 0.9766 | nan | 0.9766 | 0.0 | 0.9766
0.0777 | 61.0 | 1769 | 0.1341 | 0.4841 | 0.9682 | 0.9682 | nan | 0.9682 | 0.0 | 0.9682
0.0777 | 62.0 | 1798 | 0.1269 | 0.4863 | 0.9726 | 0.9726 | nan | 0.9726 | 0.0 | 0.9726
0.079 | 63.0 | 1827 | 0.1266 | 0.4878 | 0.9755 | 0.9755 | nan | 0.9755 | 0.0 | 0.9755
0.079 | 64.0 | 1856 | 0.1304 | 0.4853 | 0.9706 | 0.9706 | nan | 0.9706 | 0.0 | 0.9706
0.079 | 65.0 | 1885 | 0.1253 | 0.4873 | 0.9747 | 0.9747 | nan | 0.9747 | 0.0 | 0.9747
0.0781 | 66.0 | 1914 | 0.1283 | 0.4866 | 0.9731 | 0.9731 | nan | 0.9731 | 0.0 | 0.9731
0.0781 | 67.0 | 1943 | 0.1290 | 0.4892 | 0.9784 | 0.9784 | nan | 0.9784 | 0.0 | 0.9784
0.0781 | 68.0 | 1972 | 0.1363 | 0.4835 | 0.9669 | 0.9669 | nan | 0.9669 | 0.0 | 0.9669
0.0826 | 68.9655 | 2000 | 0.1299 | 0.4851 | 0.9702 | 0.9702 | nan | 0.9702 | 0.0 | 0.9702
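
The Epoch and Step columns are linked by simple arithmetic: evaluation runs every 29 steps, and stopping at step 2000 gives the fractional final epoch of 68.9655. The short sketch below reproduces that figure and, assuming a single device, no gradient accumulation, and a final partial batch that is kept (none of which the card states), bounds the number of training samples implied by 29 steps per epoch at batch size 8.

```python
total_steps = 2000        # training_steps from the hyperparameters
steps_per_epoch = 29      # step count at epoch 1.0 in the results table
train_batch_size = 8      # train_batch_size from the hyperparameters

# Fractional number of epochs completed when training stops.
epochs_completed = total_steps / steps_per_epoch

# Under the stated assumptions, 29 steps per epoch means the last batch
# holds between 1 and 8 samples, which bounds the training set size.
min_samples = (steps_per_epoch - 1) * train_batch_size + 1
max_samples = steps_per_epoch * train_batch_size

print(round(epochs_completed, 4))   # 68.9655
print(min_samples, max_samples)     # 225 232
```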

Framework versions

  • Transformers 4.49.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size: 3.72M params (Safetensors, tensor type F32)