metadata

tags:
  - generated_from_trainer
model-index:
  - name: baseline
    results: []

baseline

This model is a fine-tuned version of an unspecified base model on an unknown dataset. After the final epoch (epoch 100, step 31300), it achieves the following results on the evaluation set:

Loss: 1.6338
Exact Match: 0.142
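
The card does not define how Exact Match is computed. A common definition is the fraction of predictions that are string-identical to their references, so 0.142 would mean roughly 14.2% of evaluation examples were reproduced exactly. The sketch below illustrates that definition only; the function name `exact_match`, the whitespace stripping, and the toy data are assumptions, not the evaluation code used for this model.

```python
def exact_match(predictions, references):
    """Fraction of predictions equal to their reference.

    Assumption: surrounding whitespace is stripped before comparison.
    The actual normalization used for this model is not documented.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(predictions)


# Hypothetical toy data, not taken from this model's evaluation set.
preds = ["paris", "4", "blue "]
refs = ["paris", "5", "blue"]
print(exact_match(preds, refs))
```

Two of the three toy predictions match after stripping, so the score here is two thirds.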

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 32
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
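
To make the schedule concrete, the sketch below reproduces a linear decay in plain Python and derives two quantities implied by the numbers in this card: the total step count and the approximate training set size. It assumes zero warmup steps (none are listed), a single device, and no gradient accumulation; the helper names `linear_lr` and `train_set_size_bounds` are hypothetical.

```python
BASE_LR = 1e-3          # learning_rate from the card
BATCH_SIZE = 32         # train_batch_size from the card
NUM_EPOCHS = 100        # num_epochs from the card
STEPS_PER_EPOCH = 313   # step count logged at epoch 1.0 in the results table
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    Assumption: warmup_steps=0, since the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def train_set_size_bounds(steps_per_epoch=STEPS_PER_EPOCH, batch_size=BATCH_SIZE):
    """Smallest and largest dataset sizes that yield this many steps per epoch.

    Assumption: one device, no gradient accumulation, last partial batch kept.
    """
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


print(TOTAL_STEPS)
print(linear_lr(0), linear_lr(TOTAL_STEPS // 2), linear_lr(TOTAL_STEPS))
print(train_set_size_bounds())
```

Under these assumptions the run covers 31300 optimizer steps, the learning rate falls from 0.001 to 0 over that span, and the training set holds between 9985 and 10016 examples.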

Training results

Training Loss	Epoch	Step	Validation Loss	Exact Match
2.6989	1.0	313	1.8586	0.0
1.834	2.0	626	1.5284	0.003
1.517	3.0	939	1.3632	0.005
1.2977	4.0	1252	1.2077	0.021
1.124	5.0	1565	1.1030	0.037
0.9885	6.0	1878	1.0607	0.05
0.8762	7.0	2191	1.0329	0.047
0.7698	8.0	2504	1.0087	0.063
0.6983	9.0	2817	0.9963	0.046
0.6297	10.0	3130	0.9754	0.076
0.5719	11.0	3443	0.9907	0.075
0.5247	12.0	3756	0.9777	0.069
0.4776	13.0	4069	0.9766	0.055
0.442	14.0	4382	0.9953	0.091
0.4081	15.0	4695	1.0005	0.098
0.3783	16.0	5008	1.0274	0.093
0.3545	17.0	5321	1.0516	0.087
0.3243	18.0	5634	1.0339	0.09
0.3045	19.0	5947	1.0419	0.078
0.2841	20.0	6260	1.0640	0.087
0.2692	21.0	6573	1.0839	0.105
0.2543	22.0	6886	1.1608	0.064
0.2346	23.0	7199	1.1046	0.113
0.2245	24.0	7512	1.1569	0.128
0.2135	25.0	7825	1.1242	0.108
0.2029	26.0	8138	1.1436	0.118
0.1902	27.0	8451	1.2023	0.095
0.1832	28.0	8764	1.1556	0.115
0.171	29.0	9077	1.2068	0.094
0.1639	30.0	9390	1.2101	0.151
0.1581	31.0	9703	1.2299	0.112
0.1504	32.0	10016	1.3153	0.1
0.1463	33.0	10329	1.2785	0.091
0.1405	34.0	10642	1.2662	0.111
0.1349	35.0	10955	1.2805	0.134
0.1291	36.0	11268	1.2516	0.137
0.126	37.0	11581	1.3312	0.141
0.1204	38.0	11894	1.2776	0.116
0.1163	39.0	12207	1.3203	0.11
0.114	40.0	12520	1.3212	0.129
0.1056	41.0	12833	1.3291	0.127
0.1033	42.0	13146	1.3010	0.125
0.1034	43.0	13459	1.3206	0.135
0.098	44.0	13772	1.3879	0.127
0.0951	45.0	14085	1.3693	0.111
0.089	46.0	14398	1.4261	0.124
0.0913	47.0	14711	1.3644	0.122
0.0863	48.0	15024	1.4392	0.108
0.0809	49.0	15337	1.3726	0.098
0.0795	50.0	15650	1.3791	0.084
0.0763	51.0	15963	1.3911	0.134
0.0768	52.0	16276	1.4202	0.104
0.076	53.0	16589	1.4594	0.122
0.0734	54.0	16902	1.4541	0.129
0.0714	55.0	17215	1.4032	0.133
0.0696	56.0	17528	1.4467	0.128
0.0674	57.0	17841	1.4952	0.103
0.0657	58.0	18154	1.4582	0.14
0.0658	59.0	18467	1.4619	0.121
0.061	60.0	18780	1.5447	0.111
0.0609	61.0	19093	1.4233	0.16
0.0596	62.0	19406	1.4705	0.134
0.058	63.0	19719	1.4721	0.144
0.0555	64.0	20032	1.4377	0.156
0.0532	65.0	20345	1.5016	0.125
0.0559	66.0	20658	1.5405	0.156
0.0517	67.0	20971	1.5166	0.133
0.0499	68.0	21284	1.4787	0.139
0.0477	69.0	21597	1.5063	0.124
0.0491	70.0	21910	1.5287	0.147
0.0464	71.0	22223	1.5428	0.131
0.0456	72.0	22536	1.5434	0.132
0.0449	73.0	22849	1.5364	0.116
0.0432	74.0	23162	1.5830	0.12
0.042	75.0	23475	1.5508	0.113
0.0403	76.0	23788	1.5146	0.134
0.0398	77.0	24101	1.5955	0.111
0.0412	78.0	24414	1.5759	0.132
0.0391	79.0	24727	1.5588	0.136
0.0383	80.0	25040	1.5580	0.141
0.0366	81.0	25353	1.5895	0.143
0.0365	82.0	25666	1.5637	0.148
0.035	83.0	25979	1.6012	0.155
0.0359	84.0	26292	1.6130	0.118
0.0343	85.0	26605	1.6038	0.158
0.0333	86.0	26918	1.6300	0.124
0.0318	87.0	27231	1.6259	0.145
0.0309	88.0	27544	1.6178	0.139
0.0303	89.0	27857	1.6166	0.143
0.0302	90.0	28170	1.6394	0.141
0.0293	91.0	28483	1.6408	0.154
0.0281	92.0	28796	1.6424	0.13
0.0288	93.0	29109	1.6426	0.136
0.0272	94.0	29422	1.6477	0.131
0.0278	95.0	29735	1.6288	0.142
0.0264	96.0	30048	1.6251	0.142
0.0268	97.0	30361	1.6340	0.142
0.0255	98.0	30674	1.6353	0.145
0.0263	99.0	30987	1.6333	0.143
0.0259	100.0	31300	1.6338	0.142
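
In the table above, validation loss reaches its minimum at epoch 10 (0.9754) and rises afterwards while training loss keeps falling, and Exact Match peaks at epoch 61 (0.16), so the final checkpoint is not the best one by either measure. The sketch below shows one way to locate such checkpoints from the tab-separated log. It embeds only a five-row excerpt of the table for brevity, and the helper name `parse_log` is hypothetical.

```python
# Five-row excerpt of the results table, copied verbatim (tab-separated).
EXCERPT = """\
Training Loss\tEpoch\tStep\tValidation Loss\tExact Match
2.6989\t1.0\t313\t1.8586\t0.0
0.6297\t10.0\t3130\t0.9754\t0.076
0.1639\t30.0\t9390\t1.2101\t0.151
0.0609\t61.0\t19093\t1.4233\t0.16
0.0259\t100.0\t31300\t1.6338\t0.142
"""


def parse_log(text):
    """Parse a tab-separated training log into a list of dicts of floats."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split("\t")
    return [
        dict(zip(header, (float(value) for value in ln.split("\t"))))
        for ln in lines[1:]
    ]


rows = parse_log(EXCERPT)
best_loss = min(rows, key=lambda r: r["Validation Loss"])
best_em = max(rows, key=lambda r: r["Exact Match"])
final = rows[-1]

print("lowest validation loss at epoch", best_loss["Epoch"])
print("highest exact match at epoch", best_em["Epoch"])
print("final epoch", final["Epoch"], "exact match", final["Exact Match"])
```

Pasting the full table into `EXCERPT` gives the same two epochs, which suggests that selecting a checkpoint by validation metric could be worth considering over using the last one.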

Framework versions

Transformers 4.35.2
PyTorch 2.1.0+cu118
Datasets 2.15.0
Tokenizers 0.15.0
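
When trying to reproduce the run, it may help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; it assumes the packages are published under the distribution names `transformers`, `torch`, `datasets`, and `tokenizers`, and the helper name `compare_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card; distribution names are assumptions.
EXPECTED = {
    "transformers": "4.35.2",
    "torch": "2.1.0+cu118",
    "datasets": "2.15.0",
    "tokenizers": "0.15.0",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (wanted, installed_or_None, versions_match)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, ok) in compare_versions().items():
    status = "match" if ok else "differs or missing"
    print(f"{package}: wanted {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily prevent loading the model; the check only reports where the environment differs from the one recorded here.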