best_model-sst-2-64-13

This model is a fine-tuned version of bert-base-uncased on an unknown dataset. It achieves the following results on the evaluation set, which match the final row (epoch 150, step 600) of the training results table:

Loss: 1.3339
Accuracy: 0.8438

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 150
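
As a rough illustration of how these settings interact, the sketch below computes the learning rate under a linear warmup followed by a linear decay. It assumes a total of 600 optimizer steps (the last step shown in the training results table) and the common formulation of a linear schedule with warmup; the card only states lr_scheduler_type: linear, so the exact scheduler implementation is an assumption. The names BASE_LR, WARMUP_STEPS, TOTAL_STEPS, and linear_schedule_lr are hypothetical and chosen for this example only.

```python
# Sketch of a linear warmup / linear decay schedule using values from this card.
# Assumption: the run has 600 optimizer steps in total (150 epochs x 4 steps),
# as suggested by the training results table.
BASE_LR = 1e-05       # learning_rate
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps
TOTAL_STEPS = 600     # assumed total number of optimizer steps


def linear_schedule_lr(step: int) -> float:
    """Return the learning rate at optimizer step `step`.

    The rate rises linearly from 0 to BASE_LR during warmup, then falls
    linearly back to 0 at TOTAL_STEPS.
    """
    if step < WARMUP_STEPS:
        factor = step / max(1, WARMUP_STEPS)
    else:
        remaining = TOTAL_STEPS - step
        factor = max(0.0, remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))
    return BASE_LR * factor


if __name__ == "__main__":
    for step in (0, 100, 250, 500, 550, 600):
        print(f"step {step:3d}: lr = {linear_schedule_lr(step):.2e}")
```

Under these assumptions the warmup covers 500 of the 600 steps, so the learning rate would only reach its nominal value of 1e-05 at step 500 (epoch 125) and would stay below it for the rest of the run.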

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	4	0.9854	0.8594
No log	2.0	8	0.9825	0.8594
0.3525	3.0	12	0.9791	0.8672
0.3525	4.0	16	0.9752	0.8672
0.2499	5.0	20	0.9700	0.8672
0.2499	6.0	24	0.9629	0.8594
0.2499	7.0	28	0.9588	0.8594
0.2671	8.0	32	0.9589	0.8516
0.2671	9.0	36	0.9573	0.8516
0.1578	10.0	40	0.9487	0.8516
0.1578	11.0	44	0.9428	0.8516
0.1578	12.0	48	0.9323	0.8516
0.1695	13.0	52	0.9192	0.8594
0.1695	14.0	56	0.9121	0.8516
0.106	15.0	60	0.9055	0.8516
0.106	16.0	64	0.8947	0.8594
0.106	17.0	68	0.8886	0.875
0.1075	18.0	72	0.8914	0.8672
0.1075	19.0	76	0.8882	0.8672
0.0226	20.0	80	0.8871	0.8672
0.0226	21.0	84	0.8825	0.8594
0.0226	22.0	88	0.8841	0.8594
0.0045	23.0	92	0.8858	0.8672
0.0045	24.0	96	0.8902	0.875
0.0108	25.0	100	0.8941	0.8672
0.0108	26.0	104	0.8965	0.8672
0.0108	27.0	108	0.9028	0.8594
0.0242	28.0	112	0.9052	0.8594
0.0242	29.0	116	0.9104	0.8594
0.0004	30.0	120	0.9156	0.8594
0.0004	31.0	124	0.9166	0.8594
0.0004	32.0	128	0.9117	0.8594
0.0004	33.0	132	0.9111	0.8594
0.0004	34.0	136	0.9245	0.875
0.0011	35.0	140	0.9451	0.8594
0.0011	36.0	144	0.9664	0.8516
0.0011	37.0	148	0.9794	0.8359
0.0002	38.0	152	0.9838	0.8359
0.0002	39.0	156	0.9680	0.8594
0.0003	40.0	160	0.9540	0.8516
0.0003	41.0	164	0.9479	0.8672
0.0003	42.0	168	0.9734	0.8516
0.0003	43.0	172	0.9954	0.8516
0.0003	44.0	176	1.0139	0.8594
0.0002	45.0	180	1.0285	0.8516
0.0002	46.0	184	1.0383	0.8359
0.0002	47.0	188	1.0443	0.8359
0.0002	48.0	192	1.0474	0.8359
0.0002	49.0	196	1.0490	0.8359
0.0004	50.0	200	1.0141	0.8516
0.0004	51.0	204	0.9861	0.8672
0.0004	52.0	208	0.9913	0.8672
0.0204	53.0	212	1.0418	0.8594
0.0204	54.0	216	1.0818	0.8438
0.0002	55.0	220	1.1084	0.8359
0.0002	56.0	224	1.1198	0.8438
0.0002	57.0	228	1.1048	0.8359
0.0002	58.0	232	1.0871	0.8516
0.0002	59.0	236	1.0756	0.8516
0.0002	60.0	240	1.0676	0.8516
0.0002	61.0	244	1.0631	0.8516
0.0002	62.0	248	1.0605	0.8516
0.0001	63.0	252	1.0594	0.8594
0.0001	64.0	256	1.0592	0.8516
0.0001	65.0	260	1.0597	0.8594
0.0001	66.0	264	1.0594	0.8594
0.0001	67.0	268	1.0597	0.8516
0.0001	68.0	272	1.0606	0.8516
0.0001	69.0	276	1.0794	0.8516
0.0003	70.0	280	1.1418	0.8438
0.0003	71.0	284	1.1868	0.8516
0.0003	72.0	288	1.2120	0.8516
0.0001	73.0	292	1.2064	0.8516
0.0001	74.0	296	1.1566	0.8438
0.0002	75.0	300	1.1006	0.8516
0.0002	76.0	304	1.0705	0.8516
0.0002	77.0	308	1.0654	0.8516
0.0001	78.0	312	1.0651	0.8594
0.0001	79.0	316	1.0659	0.8594
0.0001	80.0	320	1.0674	0.8516
0.0001	81.0	324	1.0691	0.8516
0.0001	82.0	328	1.0786	0.8516
0.0001	83.0	332	1.0875	0.8516
0.0001	84.0	336	1.0948	0.8438
0.0001	85.0	340	1.1004	0.8438
0.0001	86.0	344	1.1058	0.8438
0.0001	87.0	348	1.1103	0.8438
0.0001	88.0	352	1.1136	0.8438
0.0001	89.0	356	1.1162	0.8438
0.0001	90.0	360	1.1180	0.8438
0.0001	91.0	364	1.1119	0.8438
0.0001	92.0	368	1.1084	0.8438
0.0001	93.0	372	1.1066	0.8516
0.0001	94.0	376	1.1059	0.8516
0.0001	95.0	380	1.1059	0.8516
0.0001	96.0	384	1.1065	0.8516
0.0001	97.0	388	1.1084	0.8516
0.0064	98.0	392	1.1955	0.8438
0.0064	99.0	396	1.2544	0.8516
0.0001	100.0	400	1.3053	0.8359
0.0001	101.0	404	1.3606	0.8281
0.0001	102.0	408	1.3399	0.8281
0.0068	103.0	412	1.2648	0.8516
0.0068	104.0	416	1.1161	0.8516
0.0001	105.0	420	1.0830	0.8594
0.0001	106.0	424	1.1095	0.8672
0.0001	107.0	428	1.0817	0.8672
0.0139	108.0	432	1.1057	0.8516
0.0139	109.0	436	1.1392	0.8438
0.0001	110.0	440	1.1623	0.8438
0.0001	111.0	444	1.1707	0.8438
0.0001	112.0	448	1.1766	0.8438
0.0001	113.0	452	1.1808	0.8516
0.0001	114.0	456	1.1826	0.8516
0.0001	115.0	460	1.1809	0.8438
0.0001	116.0	464	1.1380	0.8438
0.0001	117.0	468	1.1289	0.8594
0.0001	118.0	472	1.1853	0.8594
0.0001	119.0	476	1.2030	0.8594
0.0001	120.0	480	1.1913	0.8594
0.0001	121.0	484	1.1660	0.8672
0.0001	122.0	488	1.1591	0.8594
0.0001	123.0	492	1.1678	0.8438
0.0001	124.0	496	1.1800	0.8516
0.0001	125.0	500	1.1896	0.8516
0.0001	126.0	504	1.1972	0.8516
0.0001	127.0	508	1.2034	0.8516
0.0001	128.0	512	1.2074	0.8438
0.0001	129.0	516	1.2104	0.8438
0.0	130.0	520	1.2126	0.8438
0.0	131.0	524	1.1920	0.8672
0.0	132.0	528	1.2214	0.8516
0.0007	133.0	532	1.2321	0.8516
0.0007	134.0	536	1.2382	0.8516
0.0001	135.0	540	1.2297	0.8516
0.0001	136.0	544	1.1786	0.8516
0.0001	137.0	548	1.2126	0.8516
0.0001	138.0	552	1.2706	0.8516
0.0001	139.0	556	1.2978	0.8516
0.0	140.0	560	1.3119	0.8516
0.0	141.0	564	1.3222	0.8438
0.0	142.0	568	1.3290	0.8438
0.0	143.0	572	1.3333	0.8438
0.0	144.0	576	1.3357	0.8438
0.0	145.0	580	1.3371	0.8438
0.0	146.0	584	1.3371	0.8438
0.0	147.0	588	1.3353	0.8438
0.0001	148.0	592	1.3344	0.8438
0.0001	149.0	596	1.3340	0.8438
0.0	150.0	600	1.3339	0.8438
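
The table above shows the lowest validation loss (0.8825) at epoch 21, while the final epoch reports 1.3339 with training loss near zero, a pattern that is consistent with overfitting. The sketch below shows one way to pick a checkpoint from such a log. The rows are a hand-copied excerpt of the table, not the full log, and EvalRow, best_by_loss, and best_by_accuracy are hypothetical names used only for this example.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    accuracy: float


# Excerpt copied from the training results table (not the complete log).
ROWS: List[EvalRow] = [
    EvalRow(1, 4, 0.9854, 0.8594),
    EvalRow(17, 68, 0.8886, 0.875),
    EvalRow(21, 84, 0.8825, 0.8594),
    EvalRow(24, 96, 0.8902, 0.875),
    EvalRow(34, 136, 0.9245, 0.875),
    EvalRow(101, 404, 1.3606, 0.8281),
    EvalRow(150, 600, 1.3339, 0.8438),
]


def best_by_loss(rows: List[EvalRow]) -> EvalRow:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def best_by_accuracy(rows: List[EvalRow]) -> EvalRow:
    """Return the row with the highest accuracy; ties go to the lower loss."""
    return max(rows, key=lambda r: (r.accuracy, -r.val_loss))


if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss(ROWS))
    print("highest accuracy:", best_by_accuracy(ROWS))
    print("final epoch:", ROWS[-1])
```

On this excerpt, selecting by validation loss points to epoch 21 and selecting by accuracy points to epoch 17, both well before the final epoch whose metrics are reported at the top of the card.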

Framework versions

Transformers 4.32.0.dev0
PyTorch 2.0.1+cu118
Datasets 2.4.0
Tokenizers 0.13.3

Model repository: simonycl/best_model-sst-2-64-13
