metadata
language:
  - en
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - glue
metrics:
  - accuracy
  - f1
model-index:
  - name: mobilebert_sa_GLUE_Experiment_logit_kd_data_aug_mrpc
    results:
      - task:
          name: Text Classification
          type: text-classification
        dataset:
          name: GLUE MRPC
          type: glue
          args: mrpc
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.9877450980392157
          - name: F1
            type: f1
            value: 0.9911190053285969

mobilebert_sa_GLUE_Experiment_logit_kd_data_aug_mrpc

This model is a fine-tuned version of google/mobilebert-uncased on the GLUE MRPC dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1256
  • Accuracy: 0.9877
  • F1: 0.9911
  • Combined Score: 0.9894
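The card does not define Combined Score. The reported values are consistent with the arithmetic mean of accuracy and F1, which is how the Hugging Face GLUE example script reports a combined score when a task has more than one metric. A minimal sketch under that assumption (the function name `combined_score` is illustrative, not taken from this card):

```python
def combined_score(metrics):
    """Arithmetic mean of all metric values (assumed definition)."""
    values = list(metrics.values())
    return sum(values) / len(values)


# Full-precision values from the model-index metadata of this card.
reported = {"accuracy": 0.9877450980392157, "f1": 0.9911190053285969}

score = combined_score(reported)
print(round(score, 4))  # 0.9894
```

The same rule reproduces the per-epoch Combined Score column in the training results, for example (0.9608 + 0.9718) / 2 = 0.9663 for epoch 1.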

Model description

More information needed

Intended uses & limitations

More information needed
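The card does not document usage. As a minimal sketch, assuming the checkpoint is available on the Hugging Face Hub under the ID shown below (inferred from the uploader and model names, not confirmed by this card) and that it loads as a standard sequence-classification model, a sentence pair could be scored like this. The label names follow the GLUE MRPC convention; the checkpoint's own `id2label` mapping may use different names.

```python
# Assumed Hub ID: uploader name + model name; not confirmed by this card.
MODEL_ID = "gokuls/mobilebert_sa_GLUE_Experiment_logit_kd_data_aug_mrpc"

# GLUE MRPC label convention (the checkpoint's config may name them differently).
MRPC_LABELS = {0: "not_equivalent", 1: "equivalent"}


def score_pair(sentence1, sentence2, model_id=MODEL_ID):
    """Return class probabilities for a sentence pair, keyed by MRPC label name."""
    # Imported lazily so the module can be inspected without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # MRPC is a sentence-pair task, so both sentences go into one encoded input.
    inputs = tokenizer(sentence1, sentence2, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits

    probs = torch.softmax(logits, dim=-1)[0].tolist()
    return {MRPC_LABELS.get(i, str(i)): p for i, p in enumerate(probs)}
```

Calling `score_pair("The company posted a profit.", "The firm reported earnings.")` downloads the weights on first use, so it needs network access plus the `transformers` and `torch` packages.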

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 10
  • distributed_type: multi-GPU
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
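To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It takes 1959 steps per epoch from the training results and assumes zero warmup steps (no warmup setting is listed above) and a decay to zero at the last planned step, which is how the Transformers linear schedule behaves. The function name `linear_lr` is illustrative.

```python
BASE_LR = 5e-05          # learning_rate from the list above
STEPS_PER_EPOCH = 1959   # step count at epoch 1.0 in the training results
NUM_EPOCHS = 50          # num_epochs from the list above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 97950 planned steps
WARMUP_STEPS = 0         # assumption: the card lists no warmup setting


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1959, 48975, 90114):
    print(step, linear_lr(step))
```

Under these assumptions the rate has fallen to roughly 4e-06 by step 90114, the last step logged in the training results.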

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Combined Score |
|---|---|---|---|---|---|---|
| 0.2964 | 1.0 | 1959 | 0.2026 | 0.9608 | 0.9718 | 0.9663 |
| 0.2307 | 2.0 | 3918 | 0.1943 | 0.9706 | 0.9789 | 0.9748 |
| 0.2221 | 3.0 | 5877 | 0.1874 | 0.9804 | 0.9858 | 0.9831 |
| 0.2163 | 4.0 | 7836 | 0.1703 | 0.9853 | 0.9894 | 0.9873 |
| 0.2115 | 5.0 | 9795 | 0.1805 | 0.9853 | 0.9894 | 0.9873 |
| 0.2071 | 6.0 | 11754 | 0.1682 | 0.9804 | 0.9859 | 0.9831 |
| 0.2036 | 7.0 | 13713 | 0.1583 | 0.9877 | 0.9911 | 0.9894 |
| 0.2007 | 8.0 | 15672 | 0.1628 | 0.9926 | 0.9947 | 0.9936 |
| 0.1985 | 9.0 | 17631 | 0.1548 | 0.9853 | 0.9894 | 0.9873 |
| 0.1965 | 10.0 | 19590 | 0.1583 | 0.9877 | 0.9911 | 0.9894 |
| 0.195 | 11.0 | 21549 | 0.1527 | 0.9902 | 0.9928 | 0.9915 |
| 0.1938 | 12.0 | 23508 | 0.1512 | 0.9902 | 0.9929 | 0.9915 |
| 0.1926 | 13.0 | 25467 | 0.1426 | 0.9951 | 0.9964 | 0.9958 |
| 0.1917 | 14.0 | 27426 | 0.1436 | 0.9951 | 0.9964 | 0.9958 |
| 0.191 | 15.0 | 29385 | 0.1503 | 0.9926 | 0.9946 | 0.9936 |
| 0.1901 | 16.0 | 31344 | 0.1461 | 0.9951 | 0.9964 | 0.9958 |
| 0.1894 | 17.0 | 33303 | 0.1498 | 0.9975 | 0.9982 | 0.9979 |
| 0.1888 | 18.0 | 35262 | 0.1402 | 0.9902 | 0.9929 | 0.9915 |
| 0.1882 | 19.0 | 37221 | 0.1420 | 0.9926 | 0.9946 | 0.9936 |
| 0.1876 | 20.0 | 39180 | 0.1346 | 0.9902 | 0.9929 | 0.9915 |
| 0.1871 | 21.0 | 41139 | 0.1396 | 0.9951 | 0.9964 | 0.9958 |
| 0.1867 | 22.0 | 43098 | 0.1443 | 0.9951 | 0.9964 | 0.9958 |
| 0.1862 | 23.0 | 45057 | 0.1346 | 0.9926 | 0.9947 | 0.9936 |
| 0.1857 | 24.0 | 47016 | 0.1361 | 0.9951 | 0.9964 | 0.9958 |
| 0.1854 | 25.0 | 48975 | 0.1318 | 0.9926 | 0.9947 | 0.9936 |
| 0.185 | 26.0 | 50934 | 0.1310 | 0.9902 | 0.9929 | 0.9915 |
| 0.1846 | 27.0 | 52893 | 0.1302 | 0.9926 | 0.9947 | 0.9936 |
| 0.1842 | 28.0 | 54852 | 0.1329 | 0.9951 | 0.9964 | 0.9958 |
| 0.1839 | 29.0 | 56811 | 0.1300 | 0.9902 | 0.9929 | 0.9915 |
| 0.1836 | 30.0 | 58770 | 0.1328 | 0.9902 | 0.9929 | 0.9915 |
| 0.1832 | 31.0 | 60729 | 0.1327 | 0.9902 | 0.9929 | 0.9915 |
| 0.1829 | 32.0 | 62688 | 0.1308 | 0.9902 | 0.9929 | 0.9915 |
| 0.1826 | 33.0 | 64647 | 0.1287 | 0.9902 | 0.9929 | 0.9915 |
| 0.1824 | 34.0 | 66606 | 0.1309 | 0.9926 | 0.9947 | 0.9936 |
| 0.1821 | 35.0 | 68565 | 0.1309 | 0.9926 | 0.9947 | 0.9936 |
| 0.1818 | 36.0 | 70524 | 0.1271 | 0.9902 | 0.9929 | 0.9915 |
| 0.1816 | 37.0 | 72483 | 0.1278 | 0.9877 | 0.9911 | 0.9894 |
| 0.1813 | 38.0 | 74442 | 0.1280 | 0.9902 | 0.9929 | 0.9915 |
| 0.1811 | 39.0 | 76401 | 0.1289 | 0.9902 | 0.9929 | 0.9915 |
| 0.1809 | 40.0 | 78360 | 0.1290 | 0.9877 | 0.9911 | 0.9894 |
| 0.1807 | 41.0 | 80319 | 0.1256 | 0.9877 | 0.9911 | 0.9894 |
| 0.1805 | 42.0 | 82278 | 0.1268 | 0.9926 | 0.9947 | 0.9936 |
| 0.1803 | 43.0 | 84237 | 0.1274 | 0.9926 | 0.9947 | 0.9936 |
| 0.1801 | 44.0 | 86196 | 0.1277 | 0.9926 | 0.9947 | 0.9936 |
| 0.1799 | 45.0 | 88155 | 0.1264 | 0.9926 | 0.9947 | 0.9936 |
| 0.1797 | 46.0 | 90114 | 0.1274 | 0.9902 | 0.9929 | 0.9915 |
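The headline results at the top of this card (loss 0.1256, accuracy 0.9877) match the epoch-41 row, which has the lowest validation loss in the log, while the highest accuracy occurs at epoch 17. The card does not say how the final checkpoint was chosen, so the sketch below only illustrates the two selection rules on the logged values; the variable names are illustrative.

```python
# Validation loss and accuracy per epoch (epochs 1..46), copied from the log above.
VAL_LOSS = [
    0.2026, 0.1943, 0.1874, 0.1703, 0.1805, 0.1682, 0.1583, 0.1628, 0.1548, 0.1583,
    0.1527, 0.1512, 0.1426, 0.1436, 0.1503, 0.1461, 0.1498, 0.1402, 0.1420, 0.1346,
    0.1396, 0.1443, 0.1346, 0.1361, 0.1318, 0.1310, 0.1302, 0.1329, 0.1300, 0.1328,
    0.1327, 0.1308, 0.1287, 0.1309, 0.1309, 0.1271, 0.1278, 0.1280, 0.1289, 0.1290,
    0.1256, 0.1268, 0.1274, 0.1277, 0.1264, 0.1274,
]
ACCURACY = [
    0.9608, 0.9706, 0.9804, 0.9853, 0.9853, 0.9804, 0.9877, 0.9926, 0.9853, 0.9877,
    0.9902, 0.9902, 0.9951, 0.9951, 0.9926, 0.9951, 0.9975, 0.9902, 0.9926, 0.9902,
    0.9951, 0.9951, 0.9926, 0.9951, 0.9926, 0.9902, 0.9926, 0.9951, 0.9902, 0.9902,
    0.9902, 0.9902, 0.9902, 0.9926, 0.9926, 0.9902, 0.9877, 0.9902, 0.9902, 0.9877,
    0.9877, 0.9926, 0.9926, 0.9926, 0.9926, 0.9902,
]

epochs = range(1, len(VAL_LOSS) + 1)

# Rule 1: pick the epoch with the lowest validation loss.
best_loss_epoch = min(epochs, key=lambda e: VAL_LOSS[e - 1])
# Rule 2: pick the epoch with the highest accuracy (first one wins on ties).
best_acc_epoch = max(epochs, key=lambda e: ACCURACY[e - 1])

print(best_loss_epoch, VAL_LOSS[best_loss_epoch - 1])  # 41 0.1256
print(best_acc_epoch, ACCURACY[best_acc_epoch - 1])    # 17 0.9975
```

The two rules disagree here, so the accuracy reported at the top of the card is not the best accuracy observed during training.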

Framework versions

  • Transformers 4.26.0
  • PyTorch 1.14.0a0+410ce96
  • Datasets 2.9.0
  • Tokenizers 0.13.2