art-gpt2-base

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 6.0920
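
Since this is a causal language model trained with a cross-entropy objective, the evaluation loss converts directly to perplexity as exp(loss). A quick check (plain arithmetic, no assumptions beyond the reported loss):

```python
import math

# Perplexity of a causal LM is the exponential of its cross-entropy loss.
eval_loss = 6.0920
print(f"perplexity = {math.exp(eval_loss):.1f}")  # about 442.3
```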

Model description

More information needed

Intended uses & limitations

More information needed
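
Even without documented use cases, the checkpoint loads and generates like any GPT-2 model on the Hub. A minimal sketch, assuming the standard transformers causal-LM API and the repository id bencyc1129/art-gpt2-base (the prompt is a placeholder):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bencyc1129/art-gpt2-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```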

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
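
The training script itself is not published. As a hedged reconstruction, the listed values map onto transformers.TrainingArguments roughly as follows; output_dir and the 100-step evaluation cadence (inferred from the results table below) are assumptions, not values published with the card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="art-gpt2-base",      # assumed, not published with the card
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",     # assumed; the table logs every 100 steps
    eval_steps=100,
    logging_steps=100,
)
```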

Training results

Training Loss   Epoch    Step   Validation Loss
6.6246          3.23     100    6.0652
5.5673          6.45     200    5.5941
4.8544          9.68     300    5.2210
4.1301          12.90    400    4.9281
3.4252          16.13    500    4.8584
2.8133          19.35    600    4.8369
2.2897          22.58    700    4.8968
1.8635          25.81    800    5.0623
1.4989          29.03    900    5.1647
1.1677          32.26    1000   5.3719
0.9198          35.48    1100   5.4282
0.7353          38.71    1200   5.6292
0.6025          41.94    1300   5.6874
0.5122          45.16    1400   5.7219
0.4320          48.39    1500   5.8266
0.3801          51.61    1600   5.8598
0.3457          54.84    1700   5.9109
0.3131          58.06    1800   5.9386
0.2904          61.29    1900   5.9634
0.2650          64.52    2000   5.9652
0.2526          67.74    2100   5.9944
0.2363          70.97    2200   6.0083
0.2276          74.19    2300   6.0417
0.2155          77.42    2400   6.0281
0.2083          80.65    2500   6.0560
0.2056          83.87    2600   6.0612
0.2008          87.10    2700   6.0770
0.1958          90.32    2800   6.0843
0.1920          93.55    2900   6.0831
0.1889          96.77    3000   6.0930
0.1881          100.00   3100   6.0920

Validation loss bottoms out at 4.8369 around epoch 19 (step 600) and rises steadily afterwards while training loss keeps falling, a pattern consistent with overfitting; an earlier checkpoint would likely generalize better than the final one.

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
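
These pins can be checked against a local environment; a small sketch using each library's __version__ attribute:

```python
import datasets, tokenizers, torch, transformers

# Versions this model was trained and saved with (from the list above).
expected = {
    "transformers": "4.38.2",
    "torch": "2.2.1+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    mark = "OK" if installed[name] == want else "differs"
    print(f"{name}: installed {installed[name]} vs card {want} ({mark})")
```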