Bitnet-Mistral.0.2-v5

This model is a fine-tuned version of liminerity/Bitnet-Mistral.0.2-v3 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.4220 (validation loss at step 1000, the final evaluation)

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • training_steps: 1000
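
The relationship between these values can be checked with a little arithmetic. The following is a minimal sketch, not the author's training script: it assumes a single training device and no warmup (the card lists neither a device count nor a warmup setting), and `cosine_lr` is a hypothetical helper that follows the usual half-cosine decay shape rather than calling any library.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 16
TRAINING_STEPS = 1000

# Assumption: one training device; the card does not list a device count.
NUM_DEVICES = 1
# Assumption: no warmup, since the card lists no warmup setting.
WARMUP_STEPS = 0


def total_train_batch_size(per_device, accumulation_steps, num_devices):
    """Number of sequences consumed per optimizer step."""
    return per_device * accumulation_steps * num_devices


def cosine_lr(step, base_lr=LEARNING_RATE, total_steps=TRAINING_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a given step: linear warmup, then half-cosine decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:",
          total_train_batch_size(TRAIN_BATCH_SIZE,
                                 GRADIENT_ACCUMULATION_STEPS,
                                 NUM_DEVICES))
    for step in (0, 250, 500, 750, 1000):
        print(step, f"{cosine_lr(step):.3e}")
```

Under these assumptions, 4 × 16 × 1 reproduces the listed total_train_batch_size of 64, and the learning rate decays from 5e-05 at step 0 to half that value at step 500 and to zero at step 1000.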

Training results

Training Loss  Epoch   Step  Validation Loss
4.6207         0.0226  25    4.5769
4.3648         0.0451  50    4.5716
4.4694         0.0677  75    4.5629
4.1782         0.0903  100   4.5555
4.4231         0.1128  125   4.5486
4.2572         0.1354  150   4.5546
4.3837         0.1580  175   4.5466
4.1915         0.1805  200   4.5419
4.3894         0.2031  225   4.5257
4.2447         0.2257  250   4.5235
4.0355         0.2483  275   4.5236
4.1873         0.2708  300   4.5211
4.3891         0.2934  325   4.5078
4.1322         0.3160  350   4.5019
4.0357         0.3385  375   4.5051
4.3401         0.3611  400   4.4921
4.3848         0.3837  425   4.4903
4.305          0.4062  450   4.4789
4.2776         0.4288  475   4.4765
4.1802         0.4514  500   4.4727
4.0785         0.4739  525   4.4674
4.0607         0.4965  550   4.4623
3.9385         0.5191  575   4.4611
4.194          0.5416  600   4.4565
4.277          0.5642  625   4.4478
4.1751         0.5868  650   4.4457
4.0422         0.6093  675   4.4428
4.1503         0.6319  700   4.4406
4.0552         0.6545  725   4.4366
4.4017         0.6770  750   4.4327
4.2394         0.6996  775   4.4300
4.1975         0.7222  800   4.4277
4.2378         0.7448  825   4.4279
4.078          0.7673  850   4.4256
4.4727         0.7899  875   4.4235
4.1667         0.8125  900   4.4224
4.4079         0.8350  925   4.4223
4.3179         0.8576  950   4.4221
4.0479         0.8802  975   4.4220
4.0943         0.9027  1000  4.4220
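
A few derived quantities help put the table in context. The sketch below is illustrative only: it assumes the reported loss is a mean per-token cross-entropy in nats (so that exponentiating it gives a perplexity), and it estimates the training-set size from the rounded Epoch column, so that figure is approximate. The helper name `perplexity` is introduced here for illustration.

```python
import math

# Values copied from the first and last rows of the table above.
FIRST_VAL_LOSS = 4.5769                                   # step 25
LAST_STEP, LAST_EPOCH, LAST_VAL_LOSS = 1000, 0.9027, 4.4220

# From the hyperparameter list (total_train_batch_size).
TOTAL_TRAIN_BATCH_SIZE = 64


def perplexity(loss):
    """Perplexity, assuming `loss` is mean per-token cross-entropy in nats."""
    return math.exp(loss)


# The Epoch column is rounded to four decimals, so these are estimates.
steps_per_epoch = LAST_STEP / LAST_EPOCH
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

# Improvement in validation loss between the first and last evaluations.
val_loss_drop = FIRST_VAL_LOSS - LAST_VAL_LOSS

if __name__ == "__main__":
    print(f"final validation perplexity: {perplexity(LAST_VAL_LOSS):.1f}")
    print(f"steps per epoch (approx.):   {steps_per_epoch:.0f}")
    print(f"training examples (approx.): {approx_train_examples:.0f}")
    print(f"validation loss drop:        {val_loss_drop:.4f}")
```

Read this way, the 1000 training steps cover roughly 90% of one epoch, and the validation loss falls by about 0.15 over the run while flattening out in the last 100 steps.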

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
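
To compare a local environment against these versions, something like the following can be used. It is a hedged sketch that relies only on the standard library; the distribution names are an assumption (for example, "Pytorch" is assumed to be installed under the name `torch`), and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the card. The distribution names are an assumption:
# "Pytorch" is assumed to be installed as "torch".
CARD_VERSIONS = {
    "transformers": "4.41.2",
    "torch": "2.3.0+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected):
    """Map each package to (wanted, installed or None, exact match?)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions(CARD_VERSIONS).items():
        status = "ok" if ok else "differs"
        print(f"{name}: card={wanted} installed={installed} ({status})")
```

An exact string match is deliberately strict; a different build tag (such as a CPU-only PyTorch build) will be reported as a difference even if it behaves the same for this model.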

Model size

  • Parameters: 32.1M
  • Tensor type: F32 (Safetensors)
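
The parameter count and tensor type above give a rough size for the stored weights. This is a back-of-the-envelope sketch: 32.1M is a rounded figure, and the estimate covers weights only, not activations, optimizer state, or file metadata.

```python
# Values from the model size listing above.
PARAM_COUNT = 32.1e6      # rounded parameter count
BYTES_PER_PARAM = 4       # F32 stores each parameter in 4 bytes

weight_bytes = PARAM_COUNT * BYTES_PER_PARAM
weight_megabytes = weight_bytes / 1e6        # decimal megabytes
weight_mebibytes = weight_bytes / 2**20      # binary mebibytes

if __name__ == "__main__":
    print(f"weights: about {weight_megabytes:.1f} MB "
          f"({weight_mebibytes:.1f} MiB)")
```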