
bert-twitch-emotes

This model is a fine-tuned version of bert-base-cased on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.9711
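
The card does not include a usage example. The sketch below is a hedged guess at how the checkpoint might be loaded, assuming it was fine-tuned with a masked-language-modeling head (the card reports a language-modeling-style loss and the base model is bert-base-cased); swap the pipeline task if the actual head differs. The prompt string is an invented example, not from the card.

```python
# Hedged usage sketch: assumes the checkpoint carries a masked-LM head.
# The example sentence is invented and not taken from the model card.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="soravoid/bert-twitch-emotes")

for prediction in fill_mask("that clutch play was insane [MASK]"):
    # Each prediction holds the proposed token and its probability score.
    print(prediction["token_str"], round(prediction["score"], 4))
```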

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
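
As a rough reconstruction, the hyperparameters above map onto a transformers TrainingArguments configuration along these lines. This is a sketch only: the output directory is arbitrary, and the per-epoch evaluation setting is inferred from the results table below rather than stated on the card.

```python
# Sketch of a TrainingArguments setup matching the listed hyperparameters.
# output_dir is arbitrary; evaluation_strategy="epoch" is inferred from the
# per-epoch validation losses in the results table, not stated on the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-twitch-emotes",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",
)
```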

Training results

Training Loss | Epoch | Step  | Validation Loss
5.5495        | 1.0   | 165   | 4.6986
4.6634        | 2.0   | 330   | 3.8592
4.3587        | 3.0   | 495   | nan
4.0264        | 4.0   | 660   | 3.6984
3.5892        | 5.0   | 825   | nan
3.7685        | 6.0   | 990   | 3.8205
3.3314        | 7.0   | 1155  | 3.7551
3.3483        | 8.0   | 1320  | 3.4611
3.1391        | 9.0   | 1485  | 3.4480
2.9446        | 10.0  | 1650  | 3.4619
2.918         | 11.0  | 1815  | 3.5393
2.7561        | 12.0  | 1980  | 3.4080
2.9775        | 13.0  | 2145  | 3.5564
2.6634        | 14.0  | 2310  | nan
2.6358        | 15.0  | 2475  | nan
2.5163        | 16.0  | 2640  | nan
2.6054        | 17.0  | 2805  | nan
2.3735        | 18.0  | 2970  | nan
2.5114        | 19.0  | 3135  | 3.1434
2.3668        | 20.0  | 3300  | nan
2.3008        | 21.0  | 3465  | 3.4670
2.1246        | 22.0  | 3630  | 3.2320
2.1345        | 23.0  | 3795  | nan
2.1379        | 24.0  | 3960  | 3.4308
2.3027        | 25.0  | 4125  | nan
2.2912        | 26.0  | 4290  | 3.8554
2.0533        | 27.0  | 4455  | nan
2.0029        | 28.0  | 4620  | 3.6411
2.0801        | 29.0  | 4785  | 3.3145
1.9242        | 30.0  | 4950  | 3.8866
1.8617        | 31.0  | 5115  | nan
1.7359        | 32.0  | 5280  | 3.6191
1.7567        | 33.0  | 5445  | 3.4290
1.5816        | 34.0  | 5610  | nan
1.6193        | 35.0  | 5775  | 3.6239
1.7555        | 36.0  | 5940  | 3.8294
1.5949        | 37.0  | 6105  | nan
1.5287        | 38.0  | 6270  | nan
1.7165        | 39.0  | 6435  | 3.9250
1.4955        | 40.0  | 6600  | nan
1.5768        | 41.0  | 6765  | 3.5979
1.4145        | 42.0  | 6930  | 4.3696
1.3059        | 43.0  | 7095  | 3.8834
1.4716        | 44.0  | 7260  | 3.8716
1.4055        | 45.0  | 7425  | 3.5523
1.231         | 46.0  | 7590  | 3.5275
1.2732        | 47.0  | 7755  | 3.5954
1.4179        | 48.0  | 7920  | 3.9041
1.3136        | 49.0  | 8085  | 3.7871
1.4072        | 50.0  | 8250  | 3.7365
1.2083        | 51.0  | 8415  | 3.9303
1.2625        | 52.0  | 8580  | 3.6428
1.1998        | 53.0  | 8745  | 3.7615
1.224         | 54.0  | 8910  | nan
1.0882        | 55.0  | 9075  | 3.9539
1.1847        | 56.0  | 9240  | 4.2988
1.1108        | 57.0  | 9405  | nan
1.1376        | 58.0  | 9570  | nan
1.073         | 59.0  | 9735  | nan
0.9939        | 60.0  | 9900  | 4.3302
1.0409        | 61.0  | 10065 | 3.9883
1.1136        | 62.0  | 10230 | 3.9496
1.0054        | 63.0  | 10395 | nan
0.8614        | 64.0  | 10560 | 3.8033
0.9833        | 65.0  | 10725 | nan
1.0471        | 66.0  | 10890 | 3.7918
0.9947        | 67.0  | 11055 | nan
1.0328        | 68.0  | 11220 | 3.8372
1.0071        | 69.0  | 11385 | nan
0.9151        | 70.0  | 11550 | nan
1.114         | 71.0  | 11715 | 4.0105
0.8184        | 72.0  | 11880 | nan
1.1402        | 73.0  | 12045 | 4.1616
1.0118        | 74.0  | 12210 | 4.1463
1.0308        | 75.0  | 12375 | 4.1884
0.8101        | 76.0  | 12540 | nan
0.9185        | 77.0  | 12705 | 3.8719
0.9617        | 78.0  | 12870 | 3.5809
0.7833        | 79.0  | 13035 | 3.6456
0.8115        | 80.0  | 13200 | 4.0877
0.9021        | 81.0  | 13365 | 4.5326
0.7471        | 82.0  | 13530 | 3.6400
0.862         | 83.0  | 13695 | 4.4030
0.9765        | 84.0  | 13860 | nan
0.8329        | 85.0  | 14025 | 4.2526
0.7722        | 86.0  | 14190 | nan
0.7735        | 87.0  | 14355 | nan
0.736         | 88.0  | 14520 | nan
0.9181        | 89.0  | 14685 | nan
0.7734        | 90.0  | 14850 | 4.2560
0.7839        | 91.0  | 15015 | nan
0.7987        | 92.0  | 15180 | 4.1558
0.8994        | 93.0  | 15345 | 4.0585
0.8252        | 94.0  | 15510 | 3.5777
0.6863        | 95.0  | 15675 | 4.1786
0.8669        | 96.0  | 15840 | 3.8966
0.792         | 97.0  | 16005 | 3.9439
0.681         | 98.0  | 16170 | 3.5610
0.8098        | 99.0  | 16335 | nan
0.7167        | 100.0 | 16500 | nan

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.1+cu121
  • Datasets 2.15.0
  • Tokenizers 0.15.0