Built with Axolotl

f9654299-2910-4317-8a79-a3a3b4104211

This model is a fine-tuned version of katuni4ka/tiny-random-falcon-40b on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 9.9715

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a minimal TrainingArguments sketch follows the list):

  • learning_rate: 0.00021
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 100
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (8-bit, via bitsandbytes) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 500
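
Here is a minimal sketch of how these settings map onto transformers.TrainingArguments. This is not the original Axolotl configuration, and the output directory name is a hypothetical placeholder.

```python
from transformers import TrainingArguments

# Hedged sketch: mirrors the hyperparameters listed above, not the exact
# Axolotl config used for this run. "output" is a hypothetical path.
args = TrainingArguments(
    output_dir="output",                  # hypothetical
    learning_rate=0.00021,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,        # effective train batch: 4 * 2 = 8
    seed=100,
    optim="adamw_bnb_8bit",               # bitsandbytes 8-bit AdamW
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=500,
)
```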

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log        | 0.0008 | 1    | 11.0545         |
| 20.6914       | 0.0390 | 50   | 10.3889         |
| 19.7568       | 0.0781 | 100  | 10.1540         |
| 19.6019       | 0.1171 | 150  | 10.0781         |
| 19.5667       | 0.1562 | 200  | 10.0360         |
| 19.3438       | 0.1952 | 250  | 10.0107         |
| 19.4514       | 0.2343 | 300  | 9.9902          |
| 19.3682       | 0.2733 | 350  | 9.9803          |
| 19.5269       | 0.3124 | 400  | 9.9749          |
| 19.3169       | 0.3514 | 450  | 9.9752          |
| 19.3261       | 0.3905 | 500  | 9.9715          |

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1
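
Because this repository ships a PEFT adapter rather than full model weights, inference requires loading the base model first and attaching the adapter on top. A minimal loading sketch, assuming the adapter weights live in this repository:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach this repository's PEFT adapter.
base = AutoModelForCausalLM.from_pretrained("katuni4ka/tiny-random-falcon-40b")
model = PeftModel.from_pretrained(base, "lesso10/f9654299-2910-4317-8a79-a3a3b4104211")
tokenizer = AutoTokenizer.from_pretrained("katuni4ka/tiny-random-falcon-40b")
```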