nova_v1.5

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on the publicis_c3b_ind dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0014
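
No usage section is filled in yet, so below is a minimal loading sketch using peft and transformers. The repo id "your-org/nova_v1.5" is a placeholder for wherever this adapter is published, and access to the gated base checkpoint is assumed; treat it as an illustration, not the author's documented workflow.

```python
# Minimal loading sketch; "your-org/nova_v1.5" is a placeholder adapter id.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "meta-llama/Llama-3.2-3B-Instruct"  # gated; requires accepted license

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.bfloat16,  # a 3B model fits comfortably on one GPU
    device_map="auto",           # requires the `accelerate` package
)

# Attach the fine-tuned PEFT adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, "your-org/nova_v1.5")
model.eval()

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```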

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 48
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 25
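
For readers who want to reproduce the setup, the sketch below shows one plausible mapping of these values onto transformers.TrainingArguments. It is an assumption: the output directory is a placeholder, and the original run may have wrapped this in a different trainer script (the adapter was trained with PEFT).

```python
# Illustrative mapping of the listed hyperparameters onto TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="nova_v1.5",            # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=48,
    gradient_accumulation_steps=8,     # effective train batch: 4 * 8 = 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=25,
)
```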

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.0436        | 0.1125 | 50   | 0.0401          |
| 0.0433        | 0.2249 | 100  | 0.0318          |
| 0.0326        | 0.3374 | 150  | 0.0277          |
| 0.0297        | 0.4498 | 200  | 0.0248          |
| 0.0318        | 0.5623 | 250  | 0.0222          |
| 0.0171        | 0.6747 | 300  | 0.0201          |
| 0.0313        | 0.7872 | 350  | 0.0188          |
| 0.0216        | 0.8996 | 400  | 0.0179          |
| 0.0157        | 1.0124 | 450  | 0.0164          |
| 0.0222        | 1.1248 | 500  | 0.0157          |
| 0.028         | 1.2373 | 550  | 0.0152          |
| 0.0152        | 1.3497 | 600  | 0.0141          |
| 0.0253        | 1.4622 | 650  | 0.0134          |
| 0.0196        | 1.5746 | 700  | 0.0131          |
| 0.0253        | 1.6871 | 750  | 0.0123          |
| 0.0127        | 1.7996 | 800  | 0.0116          |
| 0.0095        | 1.9120 | 850  | 0.0110          |
| 0.0209        | 2.0247 | 900  | 0.0102          |
| 0.0061        | 2.1372 | 950  | 0.0101          |
| 0.0111        | 2.2496 | 1000 | 0.0092          |
| 0.0095        | 2.3621 | 1050 | 0.0082          |
| 0.0066        | 2.4746 | 1100 | 0.0079          |
| 0.0117        | 2.5870 | 1150 | 0.0070          |
| 0.0041        | 2.6995 | 1200 | 0.0073          |
| 0.0094        | 2.8119 | 1250 | 0.0065          |
| 0.006         | 2.9244 | 1300 | 0.0061          |
| 0.0052        | 3.0371 | 1350 | 0.0057          |
| 0.0049        | 3.1496 | 1400 | 0.0053          |
| 0.0063        | 3.2620 | 1450 | 0.0039          |
| 0.0049        | 3.3745 | 1500 | 0.0039          |
| 0.0065        | 3.4869 | 1550 | 0.0037          |
| 0.0041        | 3.5994 | 1600 | 0.0034          |
| 0.0038        | 3.7118 | 1650 | 0.0033          |
| 0.0036        | 3.8243 | 1700 | 0.0033          |
| 0.0051        | 3.9367 | 1750 | 0.0031          |
| 0.0026        | 4.0495 | 1800 | 0.0027          |
| 0.002         | 4.1619 | 1850 | 0.0026          |
| 0.0024        | 4.2744 | 1900 | 0.0024          |
| 0.0023        | 4.3868 | 1950 | 0.0024          |
| 0.0034        | 4.4993 | 2000 | 0.0021          |
| 0.0019        | 4.6118 | 2050 | 0.0022          |
| 0.0017        | 4.7242 | 2100 | 0.0019          |
| 0.0017        | 4.8367 | 2150 | 0.0019          |
| 0.0025        | 4.9491 | 2200 | 0.0019          |
| 0.0018        | 5.0618 | 2250 | 0.0020          |
| 0.0016        | 5.1743 | 2300 | 0.0019          |
| 0.0014        | 5.2868 | 2350 | 0.0018          |
| 0.0014        | 5.3992 | 2400 | 0.0018          |
| 0.0012        | 5.5117 | 2450 | 0.0017          |
| 0.0011        | 5.6241 | 2500 | 0.0017          |
| 0.0008        | 5.7366 | 2550 | 0.0014          |
| 0.0018        | 5.8490 | 2600 | 0.0014          |
| 0.0017        | 5.9615 | 2650 | 0.0014          |
| 0.0009        | 6.0742 | 2700 | 0.0015          |
| 0.0009        | 6.1867 | 2750 | 0.0014          |
| 0.0014        | 6.2991 | 2800 | 0.0014          |
| 0.0012        | 6.4116 | 2850 | 0.0016          |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.4.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3