nova_v1.5

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on the publicis_c3b_ind dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0014
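
No usage section is filled in yet, so below is a minimal loading sketch using peft and transformers. The repo id "your-org/nova_v1.5" is a placeholder for wherever this adapter is published, and access to the gated base checkpoint is assumed; treat it as an illustration, not the author's documented workflow.

```python
# Minimal loading sketch; "your-org/nova_v1.5" is a placeholder adapter id.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "meta-llama/Llama-3.2-3B-Instruct"  # gated; requires accepted license

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.bfloat16,  # a 3B model fits comfortably on one GPU
    device_map="auto",           # requires the `accelerate` package
)

# Attach the fine-tuned PEFT adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, "your-org/nova_v1.5")
model.eval()

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```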

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 48
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 25
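
For readers who want to reproduce the setup, the sketch below shows one plausible mapping of these values onto transformers.TrainingArguments. It is an assumption: the output directory is a placeholder, and the original run may have wrapped this in a different trainer script (the adapter was trained with PEFT).

```python
# Illustrative mapping of the listed hyperparameters onto TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="nova_v1.5",            # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=48,
    gradient_accumulation_steps=8,     # effective train batch: 4 * 8 = 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=25,
)
```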

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.0436        | 0.1125 | 50   | 0.0401          |
| 0.0433        | 0.2249 | 100  | 0.0318          |
| 0.0326        | 0.3374 | 150  | 0.0277          |
| 0.0297        | 0.4498 | 200  | 0.0248          |
| 0.0318        | 0.5623 | 250  | 0.0222          |
| 0.0171        | 0.6747 | 300  | 0.0201          |
| 0.0313        | 0.7872 | 350  | 0.0188          |
| 0.0216        | 0.8996 | 400  | 0.0179          |
| 0.0157        | 1.0124 | 450  | 0.0164          |
| 0.0222        | 1.1248 | 500  | 0.0157          |
| 0.028         | 1.2373 | 550  | 0.0152          |
| 0.0152        | 1.3497 | 600  | 0.0141          |
| 0.0253        | 1.4622 | 650  | 0.0134          |
| 0.0196        | 1.5746 | 700  | 0.0131          |
| 0.0253        | 1.6871 | 750  | 0.0123          |
| 0.0127        | 1.7996 | 800  | 0.0116          |
| 0.0095        | 1.9120 | 850  | 0.0110          |
| 0.0209        | 2.0247 | 900  | 0.0102          |
| 0.0061        | 2.1372 | 950  | 0.0101          |
| 0.0111        | 2.2496 | 1000 | 0.0092          |
| 0.0095        | 2.3621 | 1050 | 0.0082          |
| 0.0066        | 2.4746 | 1100 | 0.0079          |
| 0.0117        | 2.5870 | 1150 | 0.0070          |
| 0.0041        | 2.6995 | 1200 | 0.0073          |
| 0.0094        | 2.8119 | 1250 | 0.0065          |
| 0.006         | 2.9244 | 1300 | 0.0061          |
| 0.0052        | 3.0371 | 1350 | 0.0057          |
| 0.0049        | 3.1496 | 1400 | 0.0053          |
| 0.0063        | 3.2620 | 1450 | 0.0039          |
| 0.0049        | 3.3745 | 1500 | 0.0039          |
| 0.0065        | 3.4869 | 1550 | 0.0037          |
| 0.0041        | 3.5994 | 1600 | 0.0034          |
| 0.0038        | 3.7118 | 1650 | 0.0033          |
| 0.0036        | 3.8243 | 1700 | 0.0033          |
| 0.0051        | 3.9367 | 1750 | 0.0031          |
| 0.0026        | 4.0495 | 1800 | 0.0027          |
| 0.002         | 4.1619 | 1850 | 0.0026          |
| 0.0024        | 4.2744 | 1900 | 0.0024          |
| 0.0023        | 4.3868 | 1950 | 0.0024          |
| 0.0034        | 4.4993 | 2000 | 0.0021          |
| 0.0019        | 4.6118 | 2050 | 0.0022          |
| 0.0017        | 4.7242 | 2100 | 0.0019          |
| 0.0017        | 4.8367 | 2150 | 0.0019          |
| 0.0025        | 4.9491 | 2200 | 0.0019          |
| 0.0018        | 5.0618 | 2250 | 0.0020          |
| 0.0016        | 5.1743 | 2300 | 0.0019          |
| 0.0014        | 5.2868 | 2350 | 0.0018          |
| 0.0014        | 5.3992 | 2400 | 0.0018          |
| 0.0012        | 5.5117 | 2450 | 0.0017          |
| 0.0011        | 5.6241 | 2500 | 0.0017          |
| 0.0008        | 5.7366 | 2550 | 0.0014          |
| 0.0018        | 5.8490 | 2600 | 0.0014          |
| 0.0017        | 5.9615 | 2650 | 0.0014          |
| 0.0009        | 6.0742 | 2700 | 0.0015          |
| 0.0009        | 6.1867 | 2750 | 0.0014          |
| 0.0014        | 6.2991 | 2800 | 0.0014          |
| 0.0012        | 6.4116 | 2850 | 0.0016          |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.4.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3