swiftformer-xs-dmae-va-U-40B

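This model is a fine-tuned version of MBZUAI/swiftformer-xs on an unknown dataset. It achieves the following results on the evaluation set (these figures match the checkpoint at step 271, epoch 34.97, in the training results, not the final training step):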

  • Loss: 0.6013
  • Accuracy: 0.8257
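
A minimal sketch of how the checkpoint might be loaded for inference, assuming it is an image-classification model (SwiftFormer is a vision backbone, and the card reports accuracy) and that `transformers` with a PyTorch backend is installed. The helper name `classify` and the `top_k` default are illustrative and not part of the original card; the repository id is the one shown on the model page.

```python
# Repository id as shown on the model page.
MODEL_ID = "Augusto777/swiftformer-xs-dmae-va-U-40B"


def classify(image_path: str, top_k: int = 3) -> list:
    """Return up to top_k predicted labels with scores for one image file.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so that defining this helper does not itself require
    # transformers (and its PyTorch backend) to be installed.
    from transformers import pipeline

    # Assumption: the checkpoint exposes an image-classification head.
    classifier = pipeline("image-classification", model=MODEL_ID)
    return classifier(image_path, top_k=top_k)
```

Calling `classify("example.jpg")` would download the weights on first use and return a list of dictionaries with `label` and `score` keys. The label names depend on the undocumented training dataset.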

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 40
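
The sketch below shows how two of the listed values relate to each other: the total train batch size is the per-device batch size times the gradient accumulation steps, and a linear schedule with a warmup ratio ramps the learning rate up and then decays it to zero. It is plain Python rather than the training code itself. The total of 280 optimizer steps is taken from the last row of the training results, a single device is assumed, and the function names are illustrative.

```python
import math

LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.2
TOTAL_STEPS = 280  # last optimizer step listed in the training results


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update (single device assumed)."""
    return per_device * accum_steps * num_devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given optimizer step for linear warmup plus linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for s in (0, 28, 56, 168, 280):
        print(s, linear_lr(s))
```

Under these assumptions the warmup covers the first 56 of 280 steps, so the learning rate peaks at 5e-05 around epoch 7 and decays for the rest of the run.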

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 0.9   | 7    | 1.3937          | 0.2569   |
| 1.4259        | 1.94  | 15   | 1.3797          | 0.3211   |
| 1.4202        | 2.97  | 23   | 1.3603          | 0.3853   |
| 1.3885        | 4.0   | 31   | 1.3322          | 0.3945   |
| 1.368         | 4.9   | 38   | 1.3025          | 0.4312   |
| 1.3267        | 5.94  | 46   | 1.2571          | 0.4679   |
| 1.2772        | 6.97  | 54   | 1.2173          | 0.4771   |
| 1.2589        | 8.0   | 62   | 1.1662          | 0.5321   |
| 1.2271        | 8.9   | 69   | 1.0970          | 0.5596   |
| 1.1729        | 9.94  | 77   | 1.0780          | 0.5780   |
| 1.1035        | 10.97 | 85   | 1.0022          | 0.6239   |
| 1.0569        | 12.0  | 93   | 0.9465          | 0.6422   |
| 1.0468        | 12.9  | 100  | 0.9379          | 0.6606   |
| 0.9946        | 13.94 | 108  | 0.8879          | 0.6881   |
| 0.9565        | 14.97 | 116  | 0.8578          | 0.6972   |
| 0.9189        | 16.0  | 124  | 0.8488          | 0.7156   |
| 0.9063        | 16.9  | 131  | 0.8193          | 0.7339   |
| 0.9101        | 17.94 | 139  | 0.8115          | 0.7156   |
| 0.8486        | 18.97 | 147  | 0.7826          | 0.7156   |
| 0.8504        | 20.0  | 155  | 0.7473          | 0.7339   |
| 0.8244        | 20.9  | 162  | 0.7389          | 0.7706   |
| 0.8279        | 21.94 | 170  | 0.7209          | 0.7431   |
| 0.7721        | 22.97 | 178  | 0.6989          | 0.7706   |
| 0.7854        | 24.0  | 186  | 0.6974          | 0.7706   |
| 0.7546        | 24.9  | 193  | 0.6849          | 0.7615   |
| 0.7351        | 25.94 | 201  | 0.6921          | 0.7523   |
| 0.7351        | 26.97 | 209  | 0.6674          | 0.7982   |
| 0.7277        | 28.0  | 217  | 0.6534          | 0.8073   |
| 0.7328        | 28.9  | 224  | 0.6548          | 0.7890   |
| 0.7109        | 29.94 | 232  | 0.6332          | 0.7982   |
| 0.7           | 30.97 | 240  | 0.6227          | 0.7890   |
| 0.695         | 32.0  | 248  | 0.6157          | 0.8165   |
| 0.695         | 32.9  | 255  | 0.6177          | 0.8073   |
| 0.6916        | 33.94 | 263  | 0.6167          | 0.8073   |
| 0.7032        | 34.97 | 271  | 0.6013          | 0.8257   |
| 0.6797        | 36.0  | 279  | 0.6175          | 0.8073   |
| 0.6812        | 36.13 | 280  | 0.6167          | 0.7982   |
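
The log above can be read programmatically. The sketch below picks the evaluation row with the lowest validation loss and checks one possible explanation for the run ending at epoch 36.13 rather than 40. The rows are the last six entries of the log. The figure of 31 batches per epoch is an inference from the Epoch and Step columns, not something the card states, and the step-counting rule (whole optimizer steps per epoch, rounded down) reflects my understanding of how the Transformers Trainer counts steps under gradient accumulation, so treat both as assumptions.

```python
# (step, epoch, validation_loss, accuracy) copied from the last six log rows.
ROWS = [
    (248, 32.0, 0.6157, 0.8165),
    (255, 32.9, 0.6177, 0.8073),
    (263, 33.94, 0.6167, 0.8073),
    (271, 34.97, 0.6013, 0.8257),
    (279, 36.0, 0.6175, 0.8073),
    (280, 36.13, 0.6167, 0.7982),
]

GRAD_ACCUM_STEPS = 4    # from the hyperparameter list
BATCHES_PER_EPOCH = 31  # assumption inferred from the log, not stated in the card


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def epoch_at_step(step, batches=BATCHES_PER_EPOCH, accum=GRAD_ACCUM_STEPS):
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step * accum / batches, 2)


def planned_steps(num_epochs, batches=BATCHES_PER_EPOCH, accum=GRAD_ACCUM_STEPS):
    """Total optimizer steps if each epoch counts only whole accumulated steps."""
    return num_epochs * (batches // accum)


if __name__ == "__main__":
    print(best_checkpoint(ROWS))
    print(planned_steps(40), epoch_at_step(planned_steps(40)))
```

If the assumptions hold, 40 nominal epochs translate to 280 optimizer steps, which corresponds to about 36.13 passes over the data. The lowest validation loss occurs at step 271, which is the row the headline Loss and Accuracy figures match.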

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1

Model size: 3.04M params (Safetensors, tensor type F32)
Model repository: Augusto777/swiftformer-xs-dmae-va-U-40B