length_seed-42_1e-3

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1073
  • Accuracy: 0.4167

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 32000
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
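The total train batch size above is derived rather than set directly: it is the per-device batch size multiplied by the gradient accumulation steps. A minimal sketch of that relationship (plain Python; the dict is illustrative, not the actual training config object):

```python
# Hyperparameters as reported on this card, collected into a plain dict.
hparams = {
    "learning_rate": 1e-3,
    "train_batch_size": 32,        # per-device batch size
    "eval_batch_size": 64,
    "seed": 42,
    "gradient_accumulation_steps": 8,
    "lr_scheduler_type": "linear",
    "lr_scheduler_warmup_steps": 32000,
    "num_epochs": 20.0,
}

# Gradients are accumulated over 8 micro-batches of 32 before each
# optimizer step, so the effective (total) train batch size is 256.
total_train_batch_size = (
    hparams["train_batch_size"] * hparams["gradient_accumulation_steps"]
)
print(total_train_batch_size)  # 256
```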

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
|:-------------:|:-------:|:-----:|:---------------:|:--------:|
| 6.3921        | 0.9999  | 1983  | 4.6383          | 0.2613   |
| 4.4107        | 1.9999  | 3966  | 3.9370          | 0.3262   |
| 3.8167        | 2.9998  | 5949  | 3.6319          | 0.3567   |
| 3.5468        | 3.9997  | 7932  | 3.4810          | 0.3708   |
| 3.3994        | 4.9997  | 9915  | 3.3998          | 0.3796   |
| 3.3048        | 5.9996  | 11898 | 3.3454          | 0.3847   |
| 3.24          | 6.9996  | 13881 | 3.3108          | 0.3890   |
| 3.1934        | 8.0     | 15865 | 3.2871          | 0.3915   |
| 3.1609        | 8.9999  | 17848 | 3.2674          | 0.3935   |
| 3.132         | 9.9999  | 19831 | 3.2552          | 0.3947   |
| 3.1124        | 10.9998 | 21814 | 3.2466          | 0.3960   |
| 3.0954        | 11.9997 | 23797 | 3.2362          | 0.3975   |
| 3.0835        | 12.9997 | 25780 | 3.2282          | 0.3985   |
| 3.0746        | 13.9996 | 27763 | 3.2232          | 0.3990   |
| 3.0675        | 14.9996 | 29746 | 3.2189          | 0.3995   |
| 3.0623        | 16.0    | 31730 | 3.2167          | 0.3995   |
| 3.0446        | 16.9999 | 33713 | 3.1815          | 0.4041   |
| 2.9722        | 17.9999 | 35696 | 3.1497          | 0.4089   |
| 2.8843        | 18.9998 | 37679 | 3.1224          | 0.4133   |
| 2.7816        | 19.9987 | 39660 | 3.1073          | 0.4167   |
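Assuming the reported validation loss is a mean token-level cross-entropy in nats (the usual convention for language-model training), the final loss of 3.1073 corresponds to a perplexity of exp(3.1073) ≈ 22.4. A quick check:

```python
import math

# Final validation loss from the table above (assumed to be
# cross-entropy in nats); perplexity is its exponential.
final_val_loss = 3.1073
perplexity = math.exp(final_val_loss)
print(round(perplexity, 2))  # 22.36
```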

Framework versions

  • Transformers 4.45.1
  • Pytorch 2.4.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.20.0

Model details

  • Model size: 97.8M parameters
  • Tensor type: F32 (Safetensors)