best_model-sst-2-32-13

This model is a fine-tuned version of bert-base-uncased on an unknown dataset (the model name suggests the GLUE SST-2 sentiment task). It achieves the following results on the evaluation set; a brief usage sketch follows the list:

  • Loss: 1.2039
  • Accuracy: 0.8281
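
Given these numbers, a minimal inference sketch may be useful. It assumes the checkpoint is published on the Hub as a standard sequence-classification model under the ID simonycl/best_model-sst-2-32-13 (the ID shown on the model page); the label mapping is not documented in this card.

```python
# Minimal usage sketch (assumptions: Hub ID and sequence-classification head;
# the id2label mapping is not documented in this card).
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "simonycl/best_model-sst-2-32-13"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("a gripping, beautifully acted film", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities
```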

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
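
Nothing is documented here, but two clues are worth noting: the model name points at SST-2, and the training log below shows only 2 optimization steps per epoch at batch size 32, i.e. a training set of at most 64 examples (consistent with a 32-shot-per-class subset). A purely hypothetical loading sketch, assuming GLUE SST-2:

```python
# Hypothetical: the card does not name the dataset; GLUE SST-2 is inferred
# from the model name only, and the 64-example subset is a guess.
from datasets import load_dataset

sst2 = load_dataset("glue", "sst2")
few_shot = sst2["train"].shuffle(seed=13).select(range(64))  # 32 per class is not guaranteed
print(few_shot[0])
```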

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Trainer-style sketch reproducing them follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 150
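
For reference, here is how these settings map onto the Hugging Face Trainer API. This is a sketch, not the author's script: output_dir and the evaluation/logging cadence are assumptions (the log reports eval metrics once per epoch and a training loss every 10 steps), and the Adam betas/epsilon listed above are the TrainingArguments defaults, so they need no explicit flags.

```python
# Hedged reconstruction of the listed hyperparameters (Trainer API).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="best_model-sst-2-32-13",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=150,
    evaluation_strategy="epoch",  # the log reports eval metrics per epoch
    logging_steps=10,             # matches the training-loss cadence in the log
)
```

Note that warmup_steps (500) exceeds the total number of optimization steps in this run (150 epochs × 2 steps = 300), so the learning rate was still ramping up linearly when training ended and never reached the full 1e-05.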

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 2 | 0.7856 | 0.8125 |
| No log | 2.0 | 4 | 0.7856 | 0.8125 |
| No log | 3.0 | 6 | 0.7857 | 0.8125 |
| No log | 4.0 | 8 | 0.7862 | 0.8125 |
| 0.5036 | 5.0 | 10 | 0.7867 | 0.8125 |
| 0.5036 | 6.0 | 12 | 0.7873 | 0.8125 |
| 0.5036 | 7.0 | 14 | 0.7883 | 0.8125 |
| 0.5036 | 8.0 | 16 | 0.7908 | 0.8125 |
| 0.5036 | 9.0 | 18 | 0.7955 | 0.8281 |
| 0.4185 | 10.0 | 20 | 0.8014 | 0.8125 |
| 0.4185 | 11.0 | 22 | 0.8066 | 0.8125 |
| 0.4185 | 12.0 | 24 | 0.8128 | 0.8125 |
| 0.4185 | 13.0 | 26 | 0.8208 | 0.8125 |
| 0.4185 | 14.0 | 28 | 0.8292 | 0.8125 |
| 0.2904 | 15.0 | 30 | 0.8390 | 0.8281 |
| 0.2904 | 16.0 | 32 | 0.8441 | 0.8125 |
| 0.2904 | 17.0 | 34 | 0.8451 | 0.8125 |
| 0.2904 | 18.0 | 36 | 0.8484 | 0.8125 |
| 0.2904 | 19.0 | 38 | 0.8510 | 0.8125 |
| 0.257 | 20.0 | 40 | 0.8506 | 0.8125 |
| 0.257 | 21.0 | 42 | 0.8471 | 0.8125 |
| 0.257 | 22.0 | 44 | 0.8397 | 0.8125 |
| 0.257 | 23.0 | 46 | 0.8311 | 0.8281 |
| 0.257 | 24.0 | 48 | 0.8248 | 0.8281 |
| 0.2216 | 25.0 | 50 | 0.8175 | 0.8281 |
| 0.2216 | 26.0 | 52 | 0.8108 | 0.8281 |
| 0.2216 | 27.0 | 54 | 0.8012 | 0.8281 |
| 0.2216 | 28.0 | 56 | 0.7907 | 0.8281 |
| 0.2216 | 29.0 | 58 | 0.7851 | 0.8281 |
| 0.1811 | 30.0 | 60 | 0.7800 | 0.8281 |
| 0.1811 | 31.0 | 62 | 0.7713 | 0.8281 |
| 0.1811 | 32.0 | 64 | 0.7620 | 0.8281 |
| 0.1811 | 33.0 | 66 | 0.7502 | 0.8125 |
| 0.1811 | 34.0 | 68 | 0.7386 | 0.8125 |
| 0.1015 | 35.0 | 70 | 0.7320 | 0.8125 |
| 0.1015 | 36.0 | 72 | 0.7296 | 0.8125 |
| 0.1015 | 37.0 | 74 | 0.7315 | 0.8125 |
| 0.1015 | 38.0 | 76 | 0.7371 | 0.8281 |
| 0.1015 | 39.0 | 78 | 0.7442 | 0.8281 |
| 0.0725 | 40.0 | 80 | 0.7475 | 0.8281 |
| 0.0725 | 41.0 | 82 | 0.7474 | 0.8281 |
| 0.0725 | 42.0 | 84 | 0.7479 | 0.8281 |
| 0.0725 | 43.0 | 86 | 0.7501 | 0.8281 |
| 0.0725 | 44.0 | 88 | 0.7523 | 0.8281 |
| 0.0249 | 45.0 | 90 | 0.7491 | 0.8281 |
| 0.0249 | 46.0 | 92 | 0.7537 | 0.8281 |
| 0.0249 | 47.0 | 94 | 0.7615 | 0.8281 |
| 0.0249 | 48.0 | 96 | 0.7767 | 0.8281 |
| 0.0249 | 49.0 | 98 | 0.7909 | 0.8281 |
| 0.0071 | 50.0 | 100 | 0.8011 | 0.8281 |
| 0.0071 | 51.0 | 102 | 0.8145 | 0.8281 |
| 0.0071 | 52.0 | 104 | 0.8286 | 0.8281 |
| 0.0071 | 53.0 | 106 | 0.8415 | 0.8281 |
| 0.0071 | 54.0 | 108 | 0.8451 | 0.8281 |
| 0.0057 | 55.0 | 110 | 0.8438 | 0.8281 |
| 0.0057 | 56.0 | 112 | 0.8368 | 0.8281 |
| 0.0057 | 57.0 | 114 | 0.8340 | 0.8281 |
| 0.0057 | 58.0 | 116 | 0.8431 | 0.8281 |
| 0.0057 | 59.0 | 118 | 0.8509 | 0.8281 |
| 0.0052 | 60.0 | 120 | 0.8579 | 0.8281 |
| 0.0052 | 61.0 | 122 | 0.8640 | 0.8281 |
| 0.0052 | 62.0 | 124 | 0.8691 | 0.8281 |
| 0.0052 | 63.0 | 126 | 0.8733 | 0.8281 |
| 0.0052 | 64.0 | 128 | 0.8767 | 0.8281 |
| 0.0027 | 65.0 | 130 | 0.8800 | 0.8281 |
| 0.0027 | 66.0 | 132 | 0.8826 | 0.8281 |
| 0.0027 | 67.0 | 134 | 0.8865 | 0.8281 |
| 0.0027 | 68.0 | 136 | 0.8929 | 0.8281 |
| 0.0027 | 69.0 | 138 | 0.9006 | 0.8281 |
| 0.0025 | 70.0 | 140 | 0.9079 | 0.8281 |
| 0.0025 | 71.0 | 142 | 0.9226 | 0.8281 |
| 0.0025 | 72.0 | 144 | 0.9417 | 0.8281 |
| 0.0025 | 73.0 | 146 | 0.9560 | 0.8281 |
| 0.0025 | 74.0 | 148 | 0.9663 | 0.8281 |
| 0.0028 | 75.0 | 150 | 0.9737 | 0.8125 |
| 0.0028 | 76.0 | 152 | 0.9761 | 0.8281 |
| 0.0028 | 77.0 | 154 | 0.9724 | 0.8281 |
| 0.0028 | 78.0 | 156 | 0.9675 | 0.8281 |
| 0.0028 | 79.0 | 158 | 0.9602 | 0.8281 |
| 0.0029 | 80.0 | 160 | 0.9534 | 0.8281 |
| 0.0029 | 81.0 | 162 | 0.9478 | 0.8281 |
| 0.0029 | 82.0 | 164 | 0.9437 | 0.8281 |
| 0.0029 | 83.0 | 166 | 0.9400 | 0.8281 |
| 0.0029 | 84.0 | 168 | 0.9366 | 0.8281 |
| 0.0016 | 85.0 | 170 | 0.9346 | 0.8281 |
| 0.0016 | 86.0 | 172 | 0.9343 | 0.8281 |
| 0.0016 | 87.0 | 174 | 0.9353 | 0.8281 |
| 0.0016 | 88.0 | 176 | 0.9367 | 0.8281 |
| 0.0016 | 89.0 | 178 | 0.9386 | 0.8281 |
| 0.0015 | 90.0 | 180 | 0.9413 | 0.8281 |
| 0.0015 | 91.0 | 182 | 0.9439 | 0.8281 |
| 0.0015 | 92.0 | 184 | 0.9472 | 0.8281 |
| 0.0015 | 93.0 | 186 | 0.9510 | 0.8281 |
| 0.0015 | 94.0 | 188 | 0.9552 | 0.8281 |
| 0.0013 | 95.0 | 190 | 0.9596 | 0.8281 |
| 0.0013 | 96.0 | 192 | 0.9641 | 0.8281 |
| 0.0013 | 97.0 | 194 | 0.9684 | 0.8281 |
| 0.0013 | 98.0 | 196 | 0.9725 | 0.8281 |
| 0.0013 | 99.0 | 198 | 0.9777 | 0.8281 |
| 0.0012 | 100.0 | 200 | 0.9881 | 0.8281 |
| 0.0012 | 101.0 | 202 | 0.9981 | 0.8281 |
| 0.0012 | 102.0 | 204 | 1.0066 | 0.8281 |
| 0.0012 | 103.0 | 206 | 1.0043 | 0.8281 |
| 0.0012 | 104.0 | 208 | 1.0029 | 0.8281 |
| 0.0011 | 105.0 | 210 | 1.0022 | 0.8281 |
| 0.0011 | 106.0 | 212 | 1.0017 | 0.8281 |
| 0.0011 | 107.0 | 214 | 1.0021 | 0.8281 |
| 0.0011 | 108.0 | 216 | 1.0029 | 0.8281 |
| 0.0011 | 109.0 | 218 | 1.0048 | 0.8281 |
| 0.001 | 110.0 | 220 | 1.0069 | 0.8281 |
| 0.001 | 111.0 | 222 | 1.0114 | 0.8281 |
| 0.001 | 112.0 | 224 | 1.0171 | 0.8281 |
| 0.001 | 113.0 | 226 | 1.0225 | 0.8281 |
| 0.001 | 114.0 | 228 | 1.0273 | 0.8281 |
| 0.0009 | 115.0 | 230 | 1.0325 | 0.8281 |
| 0.0009 | 116.0 | 232 | 1.0375 | 0.8281 |
| 0.0009 | 117.0 | 234 | 1.0419 | 0.8281 |
| 0.0009 | 118.0 | 236 | 1.0460 | 0.8281 |
| 0.0009 | 119.0 | 238 | 1.0500 | 0.8281 |
| 0.0009 | 120.0 | 240 | 1.0538 | 0.8281 |
| 0.0009 | 121.0 | 242 | 1.0572 | 0.8281 |
| 0.0009 | 122.0 | 244 | 1.0611 | 0.8281 |
| 0.0009 | 123.0 | 246 | 1.0650 | 0.8281 |
| 0.0009 | 124.0 | 248 | 1.0664 | 0.8281 |
| 0.0015 | 125.0 | 250 | 1.1047 | 0.8281 |
| 0.0015 | 126.0 | 252 | 1.1348 | 0.8281 |
| 0.0015 | 127.0 | 254 | 1.1568 | 0.8125 |
| 0.0015 | 128.0 | 256 | 1.1730 | 0.8125 |
| 0.0015 | 129.0 | 258 | 1.1849 | 0.8125 |
| 0.0007 | 130.0 | 260 | 1.1937 | 0.8125 |
| 0.0007 | 131.0 | 262 | 1.2006 | 0.8125 |
| 0.0007 | 132.0 | 264 | 1.2057 | 0.8125 |
| 0.0007 | 133.0 | 266 | 1.2096 | 0.8125 |
| 0.0007 | 134.0 | 268 | 1.2120 | 0.8125 |
| 0.0007 | 135.0 | 270 | 1.2140 | 0.8125 |
| 0.0007 | 136.0 | 272 | 1.2134 | 0.8125 |
| 0.0007 | 137.0 | 274 | 1.2122 | 0.8125 |
| 0.0007 | 138.0 | 276 | 1.2105 | 0.8125 |
| 0.0007 | 139.0 | 278 | 1.2089 | 0.8125 |
| 0.0006 | 140.0 | 280 | 1.2075 | 0.8125 |
| 0.0006 | 141.0 | 282 | 1.2063 | 0.8125 |
| 0.0006 | 142.0 | 284 | 1.2054 | 0.8125 |
| 0.0006 | 143.0 | 286 | 1.2049 | 0.8125 |
| 0.0006 | 144.0 | 288 | 1.2039 | 0.8125 |
| 0.0005 | 145.0 | 290 | 1.2032 | 0.8281 |
| 0.0005 | 146.0 | 292 | 1.2029 | 0.8281 |
| 0.0005 | 147.0 | 294 | 1.2028 | 0.8281 |
| 0.0005 | 148.0 | 296 | 1.2029 | 0.8281 |
| 0.0005 | 149.0 | 298 | 1.2032 | 0.8281 |
| 0.0005 | 150.0 | 300 | 1.2039 | 0.8281 |
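
The curve above is a textbook overfitting pattern: validation loss bottoms out at 0.7296 around epoch 36 and then climbs past 1.2 while training loss falls below 0.001, and accuracy only ever alternates between 0.8125 and 0.8281 (a difference of one example out of what is evidently a 64-example eval set, since 1/64 ≈ 0.0156). If the goal is the best checkpoint rather than the last one, the standard Trainer knobs below would keep it; a hedged sketch, since the original run evidently trained through all 150 epochs.

```python
# Sketch: keep/restore the best checkpoint and stop once eval loss stops improving.
from transformers import TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="best_model-sst-2-32-13",  # assumed
    evaluation_strategy="epoch",
    save_strategy="epoch",                # must match evaluation_strategy
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
# Then pass callbacks=[EarlyStoppingCallback(early_stopping_patience=10)]
# to the Trainer alongside these arguments.
```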

Framework versions

  • Transformers 4.32.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3