llm3br256

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on the akash_unifo_757 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0032

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 25.0
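
The batch-size and scheduler entries above are related by simple arithmetic. The following is a minimal sketch, not the training script itself. It assumes a single device (which is what 4 × 8 = 32 implies) and reproduces the general shape of a cosine schedule with linear warmup in plain Python. The function names and `total_steps` are hypothetical, chosen for illustration; the card does not state the actual number of optimizer steps.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_RATIO = 0.1


def effective_batch_size(per_device, accumulation, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accumulation * num_devices


def cosine_lr(step, total_steps, peak_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: learning rate grows linearly from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: progress runs from 0 (end of warmup) to 1 (last step).
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; not taken from the card
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 50, 100, 550, 1000):
        print(step, f"{cosine_lr(step, total_steps):.2e}")
```

With a warmup ratio of 0.1, the first tenth of training ramps the learning rate up to 0.0001, and the remainder decays it smoothly toward zero.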

Training results

Training Loss   Epoch    Step   Validation Loss
0.0133          0.0808   25     0.0123
0.0056          0.1616   50     0.0098
0.0113          0.2424   75     0.0084
0.0083          0.3232   100    0.0074
0.0086          0.4040   125    0.0065
0.0067          0.4848   150    0.0060
0.0038          0.5657   175    0.0058
0.0043          0.6465   200    0.0054
0.004           0.7273   225    0.0053
0.0095          0.8081   250    0.0050
0.0071          0.8889   275    0.0050
0.0056          0.9697   300    0.0047
0.0017          1.0505   325    0.0048
0.0024          1.1313   350    0.0046
0.0023          1.2121   375    0.0046
0.003           1.2929   400    0.0043
0.005           1.3737   425    0.0044
0.0039          1.4545   450    0.0040
0.0045          1.5354   475    0.0041
0.0032          1.6162   500    0.0042
0.0021          1.6970   525    0.0039
0.0028          1.7778   550    0.0038
0.0072          1.8586   575    0.0037
0.0091          1.9394   600    0.0041
0.0021          2.0202   625    0.0039
0.0026          2.1010   650    0.0040
0.0028          2.1818   675    0.0038
0.0027          2.2626   700    0.0038
0.0032          2.3434   725    0.0038
0.0022          2.4242   750    0.0038
0.0024          2.5051   775    0.0036
0.0015          2.5859   800    0.0034
0.0022          2.6667   825    0.0036
0.0045          2.7475   850    0.0034
0.004           2.8283   875    0.0035
0.0026          2.9091   900    0.0034
0.0019          2.9899   925    0.0033
0.0015          3.0707   950    0.0036
0.0018          3.1515   975    0.0034
0.0013          3.2323   1000   0.0036
0.0019          3.3131   1025   0.0034
0.0012          3.3939   1050   0.0033
0.0018          3.4747   1075   0.0034
0.0015          3.5556   1100   0.0034
0.0012          3.6364   1125   0.0034
0.0018          3.7172   1150   0.0032
0.002           3.7980   1175   0.0032
0.002           3.8788   1200   0.0032
0.0017          3.9596   1225   0.0032
0.0009          4.0404   1250   0.0033
0.0013          4.1212   1275   0.0033
0.0023          4.2020   1300   0.0033
0.0026          4.2828   1325   0.0031
0.0016          4.3636   1350   0.0031
0.0023          4.4444   1375   0.0031
0.0015          4.5253   1400   0.0033
0.0011          4.6061   1425   0.0031
0.0016          4.6869   1450   0.0032
0.0011          4.7677   1475   0.0033
0.0007          4.8485   1500   0.0032
0.0014          4.9293   1525   0.0033
0.0009          5.0101   1550   0.0033
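
The Epoch and Step columns above can be used to estimate how many optimizer steps make up one epoch, and the Validation Loss column to locate the best logged evaluation. The snippet below is a small illustrative sketch over a handful of rows copied from the table; the helper names are hypothetical and the subset is only a sample, so the "best" row it reports applies to that sample, not to the full log.

```python
# (training_loss, epoch, step, validation_loss) rows copied from the table above.
ROWS = [
    (0.0133, 0.0808, 25, 0.0123),
    (0.0056, 0.9697, 300, 0.0047),
    (0.0019, 2.9899, 925, 0.0033),
    (0.0026, 4.2828, 1325, 0.0031),
    (0.0009, 5.0101, 1550, 0.0033),
]


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    _, epoch, step, _ = rows[-1]
    return step / epoch


def best_row(rows):
    """Return the row with the lowest validation loss (first one on ties)."""
    return min(rows, key=lambda row: row[3])


if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch(ROWS):.1f}")
    _, epoch, step, val_loss = best_row(ROWS)
    print(f"lowest validation loss in sample: {val_loss} at step {step} (epoch {epoch})")
```

Multiplying the estimated steps per epoch by the total train batch size of 32 suggests a training set on the order of 9,900 examples, assuming a single device and no dropped batches; the card itself does not state the dataset size.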

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.4.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3