pythia-70m_tatsu-lab_alpaca_farm_sftsd0_policy_pythia-6.9b_gold_internlm2-7b_noise0.25_rmsd4

This model is a fine-tuned version of RylanSchaeffer/EleutherAI_pythia-70m_tatsu-lab_alpaca_farm_sftseed0 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7699
  • Accuracy: 0.5050

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.025
  • num_epochs: 5
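
The batch-size and scheduler entries above can be cross-checked with a little arithmetic. The following is a minimal sketch in plain Python, not the training script itself. It assumes a single device (so that 16 × 2 gives the listed total of 32) and a hypothetical total of 7,700 optimizer steps, which is only roughly the last step logged in the training results and is not stated anywhere in this card. The warmup-then-decay shape mirrors what a linear scheduler with a warmup ratio typically does.

```python
import math

# Values copied from the hyperparameter list in this card.
HPARAMS = {
    "learning_rate": 1e-05,
    "train_batch_size": 16,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_warmup_ratio": 0.025,
    "num_epochs": 5,
}

# ASSUMPTION: not stated in the card; roughly the last logged step.
ASSUMED_TOTAL_STEPS = 7700
# ASSUMPTION: training ran on one device.
ASSUMED_NUM_DEVICES = 1


def effective_batch_size(hp, num_devices=ASSUMED_NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return hp["train_batch_size"] * hp["gradient_accumulation_steps"] * num_devices


def linear_lr(step, hp, total_steps=ASSUMED_TOTAL_STEPS):
    """Learning rate at `step`: linear warmup from 0, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * hp["lr_scheduler_warmup_ratio"])
    if step < warmup_steps:
        scale = step / max(1, warmup_steps)
    else:
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return hp["learning_rate"] * scale


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(HPARAMS))
    for step in (0, 100, 193, 4000, 7700):
        print(step, linear_lr(step, HPARAMS))
```

Under these assumptions the learning rate peaks at the listed 1e-05 once warmup ends and falls back to zero at the final step.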

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 0 | 0 | 0.8960 | 0.4803 |
| 0.9577 | 0.0648 | 100 | 0.8893 | 0.4823 |
| 0.81 | 0.1296 | 200 | 0.8674 | 0.4904 |
| 0.7537 | 0.1944 | 300 | 0.8427 | 0.4981 |
| 0.8347 | 0.2592 | 400 | 0.8380 | 0.4946 |
| 0.7976 | 0.3239 | 500 | 0.8229 | 0.4946 |
| 0.7834 | 0.3887 | 600 | 0.8168 | 0.4946 |
| 0.8027 | 0.4535 | 700 | 0.8099 | 0.4869 |
| 0.8188 | 0.5183 | 800 | 0.8065 | 0.4865 |
| 0.7637 | 0.5831 | 900 | 0.7988 | 0.4950 |
| 0.7926 | 0.6479 | 1000 | 0.7960 | 0.5042 |
| 0.7726 | 0.7127 | 1100 | 0.7893 | 0.5035 |
| 0.7945 | 0.7775 | 1200 | 0.7892 | 0.5019 |
| 0.81 | 0.8422 | 1300 | 0.7858 | 0.5058 |
| 0.7574 | 0.9070 | 1400 | 0.7858 | 0.4973 |
| 0.7841 | 0.9718 | 1500 | 0.7824 | 0.5093 |
| 0.8161 | 1.0366 | 1600 | 0.7836 | 0.5035 |
| 0.8016 | 1.1014 | 1700 | 0.7855 | 0.4992 |
| 0.7499 | 1.1662 | 1800 | 0.7831 | 0.4977 |
| 0.7906 | 1.2310 | 1900 | 0.7786 | 0.4985 |
| 0.7698 | 1.2958 | 2000 | 0.7800 | 0.4973 |
| 0.7582 | 1.3605 | 2100 | 0.7817 | 0.4977 |
| 0.7981 | 1.4253 | 2200 | 0.7840 | 0.4996 |
| 0.8067 | 1.4901 | 2300 | 0.7814 | 0.5008 |
| 0.7667 | 1.5549 | 2400 | 0.7764 | 0.5031 |
| 0.7847 | 1.6197 | 2500 | 0.7810 | 0.5004 |
| 0.7858 | 1.6845 | 2600 | 0.7790 | 0.5008 |
| 0.74 | 1.7493 | 2700 | 0.7762 | 0.5012 |
| 0.7837 | 1.8141 | 2800 | 0.7784 | 0.5023 |
| 0.7615 | 1.8788 | 2900 | 0.7793 | 0.5008 |
| 0.7623 | 1.9436 | 3000 | 0.7735 | 0.4992 |
| 0.7823 | 2.0084 | 3100 | 0.7762 | 0.4954 |
| 0.7797 | 2.0732 | 3200 | 0.7762 | 0.5012 |
| 0.7497 | 2.1380 | 3300 | 0.7728 | 0.5027 |
| 0.7806 | 2.2028 | 3400 | 0.7739 | 0.4973 |
| 0.7525 | 2.2676 | 3500 | 0.7724 | 0.5035 |
| 0.7927 | 2.3324 | 3600 | 0.7731 | 0.5027 |
| 0.8046 | 2.3971 | 3700 | 0.7749 | 0.5073 |
| 0.7185 | 2.4619 | 3800 | 0.7744 | 0.5089 |
| 0.7616 | 2.5267 | 3900 | 0.7752 | 0.4942 |
| 0.7214 | 2.5915 | 4000 | 0.7733 | 0.5004 |
| 0.7663 | 2.6563 | 4100 | 0.7715 | 0.4961 |
| 0.7572 | 2.7211 | 4200 | 0.7735 | 0.4985 |
| 0.7258 | 2.7859 | 4300 | 0.7739 | 0.4988 |
| 0.7932 | 2.8507 | 4400 | 0.7738 | 0.5023 |
| 0.7513 | 2.9155 | 4500 | 0.7739 | 0.5058 |
| 0.7583 | 2.9802 | 4600 | 0.7748 | 0.4973 |
| 0.7102 | 3.0450 | 4700 | 0.7762 | 0.5050 |
| 0.7628 | 3.1098 | 4800 | 0.7716 | 0.5023 |
| 0.7901 | 3.1746 | 4900 | 0.7751 | 0.5081 |
| 0.77 | 3.2394 | 5000 | 0.7746 | 0.5023 |
| 0.7504 | 3.3042 | 5100 | 0.7721 | 0.5023 |
| 0.7538 | 3.3690 | 5200 | 0.7732 | 0.5027 |
| 0.7029 | 3.4338 | 5300 | 0.7738 | 0.4950 |
| 0.7198 | 3.4985 | 5400 | 0.7716 | 0.5054 |
| 0.7726 | 3.5633 | 5500 | 0.7683 | 0.5050 |
| 0.7792 | 3.6281 | 5600 | 0.7746 | 0.4923 |
| 0.7268 | 3.6929 | 5700 | 0.7750 | 0.5008 |
| 0.7532 | 3.7577 | 5800 | 0.7722 | 0.5046 |
| 0.766 | 3.8225 | 5900 | 0.7715 | 0.5015 |
| 0.7876 | 3.8873 | 6000 | 0.7760 | 0.4938 |
| 0.8172 | 3.9521 | 6100 | 0.7728 | 0.4988 |
| 0.7625 | 4.0168 | 6200 | 0.7762 | 0.5 |
| 0.7819 | 4.0816 | 6300 | 0.7766 | 0.5042 |
| 0.7582 | 4.1464 | 6400 | 0.7733 | 0.5042 |
| 0.79 | 4.2112 | 6500 | 0.7715 | 0.5027 |
| 0.7344 | 4.2760 | 6600 | 0.7693 | 0.5012 |
| 0.8079 | 4.3408 | 6700 | 0.7730 | 0.5066 |
| 0.7391 | 4.4056 | 6800 | 0.7745 | 0.4992 |
| 0.763 | 4.4704 | 6900 | 0.7733 | 0.5039 |
| 0.7363 | 4.5351 | 7000 | 0.7727 | 0.5031 |
| 0.7584 | 4.5999 | 7100 | 0.7723 | 0.5015 |
| 0.7587 | 4.6647 | 7200 | 0.7707 | 0.4934 |
| 0.7168 | 4.7295 | 7300 | 0.7711 | 0.5081 |
| 0.7479 | 4.7943 | 7400 | 0.7718 | 0.4981 |
| 0.7739 | 4.8591 | 7500 | 0.7703 | 0.5073 |
| 0.7814 | 4.9239 | 7600 | 0.7719 | 0.5004 |
| 0.7487 | 4.9887 | 7700 | 0.7695 | 0.5054 |
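
The Step and Epoch columns above also allow a rough estimate of how many optimizer steps make up one epoch. The sketch below divides step by epoch for a few rows copied from the table. The conversion to training examples assumes the listed total_train_batch_size of 32 on a single device, which this card does not confirm, so treat the result as an order-of-magnitude figure only.

```python
# A few (step, epoch) pairs copied from the training results table.
LOGGED = [(100, 0.0648), (1500, 0.9718), (7700, 4.9887)]

# ASSUMPTION: 32 examples per optimizer step (single device).
ASSUMED_EXAMPLES_PER_STEP = 32


def steps_per_epoch(step, epoch):
    """Estimate optimizer steps per epoch from one logged row."""
    return step / epoch


estimates = [steps_per_epoch(step, epoch) for step, epoch in LOGGED]
approx_examples = [est * ASSUMED_EXAMPLES_PER_STEP for est in estimates]

if __name__ == "__main__":
    for (step, epoch), est, n in zip(LOGGED, estimates, approx_examples):
        print(f"step={step} epoch={epoch} steps/epoch~{est:.1f} examples/epoch~{n:.0f}")
```

All three rows give about 1,543 steps per epoch, which under the stated assumption corresponds to roughly 49,000 training examples per epoch.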

Framework versions

  • Transformers 4.42.4
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
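
To compare a local environment against the versions listed above, a small check along the following lines may help. It is a sketch using only the standard library; the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are an assumption here, as is the expectation that the installed PyTorch build reports the `+cu121` suffix.

```python
from importlib import metadata

# Versions copied from the framework list in this card.
# ASSUMPTION: keys are the usual PyPI distribution names.
EXPECTED = {
    "transformers": "4.42.4",
    "torch": "2.3.0+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Map each mismatching package to a (wanted, found) pair."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in version_mismatches(EXPECTED).items():
        print(f"{name}: card lists {wanted}, found {found}")
```

An exact string match is deliberately strict; a different CUDA build of the same PyTorch release would be reported as a mismatch even though it may behave identically.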

Safetensors checkpoint: model size 44.7M params, tensor type BF16