Llama-Instruct-8B

This model is a fine-tuned version of meta-llama/Meta-Llama-3.1-8B-Instruct. The training dataset is not specified. It achieves the following results on the evaluation set:

Loss: 0.2981

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 4
mixed_precision_training: Native AMP
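
The sketch below illustrates how two of the listed values relate to the others: the total train batch size as the product of per-device batch size and gradient accumulation steps, and the shape of a linear schedule with 100 warmup steps. The device count (1) and the total step count (1748, about 437 steps per epoch over 4 epochs) are assumptions inferred from the numbers in this card, not values reported by the author, and the function names are illustrative only.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 4      # per-device batch size
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 100

# Assumptions, not stated in the card:
NUM_DEVICES = 1           # with one device, 4 * 4 matches the reported total of 16
TOTAL_STEPS = 1748        # roughly 437 steps per epoch * 4 epochs, inferred from the results table


def effective_batch_size(per_device, accum_steps, devices):
    """Number of examples contributing to one optimizer update."""
    return per_device * accum_steps * devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for step in (0, 50, 100, 924, 1748):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at step 100, which falls between the second and third rows of the results table.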

Training results

Training Loss	Epoch	Step	Validation Loss
2.0291	0.1144	50	1.8063
1.3006	0.2288	100	0.6497
0.4710	0.3432	150	0.4071
0.3923	0.4577	200	0.3856
0.3784	0.5721	250	0.3746
0.3671	0.6865	300	0.3592
0.3515	0.8009	350	0.3436
0.3334	0.9153	400	0.3328
0.3292	1.0297	450	0.3275
0.3249	1.1442	500	0.3237
0.3213	1.2586	550	0.3215
0.3177	1.3730	600	0.3180
0.3152	1.4874	650	0.3171
0.3141	1.6018	700	0.3142
0.3108	1.7162	750	0.3130
0.3124	1.8307	800	0.3120
0.3112	1.9451	850	0.3104
0.3091	2.0595	900	0.3088
0.3077	2.1739	950	0.3079
0.3040	2.2883	1000	0.3065
0.3052	2.4027	1050	0.3054
0.3017	2.5172	1100	0.3046
0.3018	2.6316	1150	0.3039
0.3019	2.7460	1200	0.3030
0.3017	2.8604	1250	0.3021
0.3005	2.9748	1300	0.3017
0.2989	3.0892	1350	0.3009
0.2990	3.2037	1400	0.3007
0.2989	3.3181	1450	0.2999
0.2978	3.4325	1500	0.2995
0.2957	3.5469	1550	0.2993
0.2969	3.6613	1600	0.2989
0.2961	3.7757	1650	0.2983
0.2932	3.8902	1700	0.2981
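
As a rough reading aid for the table above, the following sketch estimates the number of steps per epoch from the logged step and epoch columns, and converts the final validation loss into a perplexity. It assumes the logged steps are optimizer updates and that the loss is a mean per-token cross-entropy in nats; neither is stated in this card, so the derived numbers are estimates only.

```python
import math

# (training_loss, epoch, step, validation_loss): a few rows copied from the table above.
ROWS = [
    (2.0291, 0.1144, 50, 1.8063),
    (0.3334, 0.9153, 400, 0.3328),
    (0.3112, 1.9451, 850, 0.3104),
    (0.3005, 2.9748, 1300, 0.3017),
    (0.2932, 3.8902, 1700, 0.2981),
]
TOTAL_TRAIN_BATCH_SIZE = 16  # from the hyperparameter list


def steps_per_epoch(rows):
    """Estimate steps per epoch from the last logged row (step / epoch)."""
    _, epoch, step, _ = rows[-1]
    return round(step / epoch)


def perplexity(loss):
    """exp(loss); only meaningful if loss is mean per-token cross-entropy in nats (assumed)."""
    return math.exp(loss)


if __name__ == "__main__":
    spe = steps_per_epoch(ROWS)
    print("estimated steps per epoch:", spe)
    print("estimated training examples:", spe * TOTAL_TRAIN_BATCH_SIZE)
    print("final validation perplexity:", round(perplexity(ROWS[-1][3]), 4))
```

The example count is an upper-bound style estimate, since the last batch of an epoch may be smaller than the others.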

Framework versions

PEFT 0.13.2
Transformers 4.44.2
Pytorch 2.5.0+cu121
Datasets 3.0.2
Tokenizers 0.19.1
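
To compare a local environment against the versions listed above, a minimal sketch using only the standard library might look like this. The dictionary keys are assumed to be the PyPI distribution names for each framework (for example `torch` for Pytorch), and the helper name is hypothetical.

```python
from importlib import metadata

# Versions copied from the list above; keys are assumed PyPI distribution names.
EXPECTED = {
    "peft": "0.13.2",
    "transformers": "4.44.2",
    "torch": "2.5.0+cu121",
    "datasets": "3.0.2",
    "tokenizers": "0.19.1",
}


def compare_versions(expected):
    """Return {package: (expected_version, installed_version_or_None)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions(EXPECTED).items():
        status = "ok" if installed == wanted else "differs"
        print(f"{name}: expected {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the check only reports where the environment differs from the one recorded here.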

Repository: Ahsan221/Llama-Instruct-8B