GenerativeImage2Text-naruto

This model is a fine-tuned version of microsoft/git-base; the training dataset is not recorded in this card. The final checkpoint (step 800, epoch 29.09) achieves the following results on the evaluation set:

  • Loss: 0.0544
  • WER score: 2.6810
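A word error rate above 1 can look like a mistake, so a short illustration may help. The card does not say which implementation produced the score; the sketch below assumes the standard definition (word-level edit distance divided by the number of reference words). The function name and the sample captions are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical captions: six inserted words against a two-word reference.
print(word_error_rate("a ninja", "a ninja with orange hair and a headband"))  # 3.0
```

Under this definition, a generated caption that is much longer than its reference yields a score above 1, which is one possible reading of the 2.6810 reported here.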

Model description

More information needed

Intended uses & limitations

More information needed
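The card does not document usage. As a minimal sketch, assuming the checkpoint loads the same way as the base microsoft/git-base model through the Transformers auto classes, captioning an image might look like the following. The repository id is taken from the page hosting this card; the function name, `image_path`, and `max_length=50` are illustrative assumptions, and if the repository ships no processor files, loading the processor from microsoft/git-base instead would be the likely fallback.

```python
def caption_image(
    image_path: str,
    repo_id: str = "ChiJuiChen/GenerativeImage2Text-naruto",
    max_length: int = 50,  # assumed cap on generated tokens, not from the card
) -> str:
    """Generate one caption for the image at image_path."""
    # Heavy dependencies are imported lazily so the module can be imported
    # without them installed.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    processor = AutoProcessor.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    generated_ids = model.generate(pixel_values=pixel_values, max_length=max_length)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


if __name__ == "__main__":
    # "example.png" is a placeholder path.
    print(caption_image("example.png"))
```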

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
  • mixed_precision_training: Native AMP
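The values above can be sketched as keyword arguments for `transformers.TrainingArguments`, together with the arithmetic behind the total batch size and the linear schedule. This is an approximation, not the author's script: a single device, zero warmup steps, and the `total_steps=800` used in the example are assumptions, and `fp16=True` is the assumed equivalent of "Native AMP".

```python
import math

# Keyword names follow transformers.TrainingArguments; values come from the list above.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 30,
    "fp16": True,  # assumed mapping for "Native AMP"
}

NUM_DEVICES = 1  # assumption: the card does not state the device count

# total_train_batch_size = per-device batch * accumulation steps * devices
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_DEVICES
)


def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size)  # 4
# Halfway through a hypothetical 800-step run the rate has halved.
print(math.isclose(linear_lr(400, 800, 5e-5), 2.5e-5))  # True
```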

Training results

| Training Loss | Epoch | Step | Validation Loss | WER Score |
|---|---|---|---|---|
| 7.2829 | 1.8182 | 50 | 4.4333 | 11.5086 |
| 2.2118 | 3.6364 | 100 | 0.3655 | 1.1034 |
| 0.1089 | 5.4545 | 150 | 0.0428 | 1.0259 |
| 0.0223 | 7.2727 | 200 | 0.0421 | 0.4655 |
| 0.0162 | 9.0909 | 250 | 0.0430 | 0.4224 |
| 0.0139 | 10.9091 | 300 | 0.0434 | 0.9569 |
| 0.0126 | 12.7273 | 350 | 0.0455 | 0.8534 |
| 0.0115 | 14.5455 | 400 | 0.0457 | 3.2845 |
| 0.0106 | 16.3636 | 450 | 0.0490 | 2.3190 |
| 0.0096 | 18.1818 | 500 | 0.0515 | 2.2241 |
| 0.0094 | 20.0 | 550 | 0.0520 | 1.9569 |
| 0.0087 | 21.8182 | 600 | 0.0540 | 4.2328 |
| 0.0084 | 23.6364 | 650 | 0.0539 | 1.8448 |
| 0.008 | 25.4545 | 700 | 0.0546 | 2.5431 |
| 0.0077 | 27.2727 | 750 | 0.0542 | 2.7672 |
| 0.0072 | 29.0909 | 800 | 0.0544 | 2.6810 |
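In these results the final checkpoint is not the best one by either evaluation metric: validation loss bottoms out early and then drifts upward while training loss keeps falling, a pattern that may indicate overfitting. The sketch below copies the evaluation columns from the results above and picks the best step for each metric; the `EvalRow` type and variable names are illustrative.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    wer: float


# (step, validation loss, WER score) copied from the training results.
rows = [
    EvalRow(50, 4.4333, 11.5086), EvalRow(100, 0.3655, 1.1034),
    EvalRow(150, 0.0428, 1.0259), EvalRow(200, 0.0421, 0.4655),
    EvalRow(250, 0.0430, 0.4224), EvalRow(300, 0.0434, 0.9569),
    EvalRow(350, 0.0455, 0.8534), EvalRow(400, 0.0457, 3.2845),
    EvalRow(450, 0.0490, 2.3190), EvalRow(500, 0.0515, 2.2241),
    EvalRow(550, 0.0520, 1.9569), EvalRow(600, 0.0540, 4.2328),
    EvalRow(650, 0.0539, 1.8448), EvalRow(700, 0.0546, 2.5431),
    EvalRow(750, 0.0542, 2.7672), EvalRow(800, 0.0544, 2.6810),
]

best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_wer = min(rows, key=lambda r: r.wer)

print(best_by_loss)  # EvalRow(step=200, val_loss=0.0421, wer=0.4655)
print(best_by_wer)   # EvalRow(step=250, val_loss=0.043, wer=0.4224)
```

If intermediate checkpoints were saved, the one near step 250 would be a natural candidate to evaluate against the final one, though the card does not say which checkpoints were kept.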

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size

  • 177M params
  • Tensor type: F32
  • Format: Safetensors

Model tree for ChiJuiChen/GenerativeImage2Text-naruto

  • Base model: microsoft/git-base
  • This model: fine-tuned from the base model