models

This model is based on gpt2 and was trained from scratch on a dataset that includes 40% artificial variation sets. It achieves the following results on the evaluation set:

Loss: 3.4132
Accuracy: 0.1055
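
For readers who prefer perplexity, the reported loss can be converted with a single exponential. This is a minimal sketch and assumes the loss is the mean per-token cross-entropy in nats, which the card does not state explicitly.

```python
import math

# Evaluation loss reported in this card.
# Assumption: mean cross-entropy per token, measured in nats.
eval_loss = 3.4132

# Perplexity is the exponential of the mean per-token cross-entropy.
perplexity = math.exp(eval_loss)

print(f"perplexity = {perplexity:.2f}")
```

Under that assumption the perplexity is roughly 30.4.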

Model description

More information needed

Intended uses & limitations

More information needed
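
A minimal sketch of loading the checkpoint for text generation. It assumes the repository akari000/gpt-2-artificial-vss-40 ships both weights and a tokenizer that the Transformers Auto classes can load; the card does not confirm this. The helper name `generate` and its defaults are illustrative only.

```python
MODEL_ID = "akari000/gpt-2-artificial-vss-40"


def generate(prompt: str, max_new_tokens: int = 20) -> str:
    """Greedy-decode a continuation of `prompt` (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository contains a tokenizer next to the model weights.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("The dog")` downloads the checkpoint on first use and returns the prompt followed by the greedy continuation.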

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 64
eval_batch_size: 256
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 3.0

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
5.4919	0.0221	100	4.9232	0.0666
4.381	0.0442	200	4.5587	0.0778
4.1011	0.0663	300	4.3706	0.0836
3.9359	0.0884	400	4.2434	0.0910
3.8161	0.1105	500	4.1663	0.0884
3.713	0.1326	600	4.0792	0.0939
3.6528	0.1547	700	4.0379	0.0925
3.5841	0.1768	800	3.9787	0.0936
3.5107	0.1989	900	3.9410	0.0946
3.4819	0.2210	1000	3.9099	0.0937
3.4388	0.2431	1100	3.8965	0.0940
3.4286	0.2653	1200	3.8627	0.0947
3.39	0.2874	1300	3.8378	0.0951
3.3659	0.3095	1400	3.8112	0.0960
3.3106	0.3316	1500	3.7943	0.0961
3.289	0.3537	1600	3.7917	0.0963
3.2774	0.3758	1700	3.7344	0.0981
3.2522	0.3979	1800	3.7512	0.0966
3.2242	0.4200	1900	3.7253	0.0980
3.23	0.4421	2000	3.7178	0.0977
3.193	0.4642	2100	3.6704	0.1013
3.1785	0.4863	2200	3.6979	0.0978
3.1548	0.5084	2300	3.6605	0.0998
3.1462	0.5305	2400	3.6843	0.0993
3.1432	0.5526	2500	3.6521	0.0995
3.1122	0.5747	2600	3.6481	0.0992
3.099	0.5968	2700	3.6302	0.1003
3.0936	0.6189	2800	3.6259	0.1008
3.1073	0.6410	2900	3.6341	0.0999
3.0484	0.6631	3000	3.6255	0.0998
3.0754	0.6852	3100	3.6538	0.1006
3.0563	0.7073	3200	3.5784	0.1017
3.0552	0.7294	3300	3.6309	0.1007
3.042	0.7515	3400	3.6018	0.1011
3.0203	0.7737	3500	3.5722	0.1010
3.0342	0.7958	3600	3.6028	0.1007
3.0306	0.8179	3700	3.5744	0.1017
3.0146	0.8400	3800	3.5778	0.1020
2.9996	0.8621	3900	3.5687	0.1015
3.0084	0.8842	4000	3.5571	0.1021
3.0052	0.9063	4100	3.5482	0.1023
2.9913	0.9284	4200	3.5543	0.1021
2.9684	0.9505	4300	3.5561	0.1022
2.9816	0.9726	4400	3.5141	0.1026
2.9628	0.9947	4500	3.5097	0.1031
2.9465	1.0168	4600	3.5310	0.1024
2.9349	1.0389	4700	3.5224	0.1033
2.9144	1.0610	4800	3.5388	0.1031
2.9476	1.0831	4900	3.5327	0.1033
2.9228	1.1052	5000	3.5370	0.1032
2.9122	1.1273	5100	3.5189	0.1033
2.9151	1.1494	5200	3.5119	0.1037
2.907	1.1715	5300	3.5090	0.1032
2.9189	1.1936	5400	3.5097	0.1037
2.9065	1.2157	5500	3.5006	0.1038
2.9075	1.2378	5600	3.4733	0.1042
2.8725	1.2599	5700	3.4937	0.1040
2.884	1.2821	5800	3.4992	0.1036
2.918	1.3042	5900	3.4763	0.1040
2.8647	1.3263	6000	3.5051	0.1041
2.8706	1.3484	6100	3.4771	0.1040
2.881	1.3705	6200	3.5170	0.1039
2.8788	1.3926	6300	3.5088	0.1040
2.8865	1.4147	6400	3.4944	0.1040
2.8605	1.4368	6500	3.5082	0.1042
2.8764	1.4589	6600	3.4666	0.1041
2.8828	1.4810	6700	3.5027	0.1041
2.8522	1.5031	6800	3.4695	0.1044
2.8674	1.5252	6900	3.4941	0.1041
2.8239	1.5473	7000	3.4779	0.1043
2.8633	1.5694	7100	3.5005	0.1046
2.8383	1.5915	7200	3.5013	0.1046
2.8555	1.6136	7300	3.4846	0.1046
2.8497	1.6357	7400	3.4165	0.1071
2.857	1.6578	7500	3.4531	0.1054
2.8239	1.6799	7600	3.4938	0.1048
2.8145	1.7020	7700	3.4814	0.1050
2.8429	1.7241	7800	3.4734	0.1043
2.8146	1.7462	7900	3.4483	0.1048
2.8285	1.7683	8000	3.4382	0.1051
2.8254	1.7905	8100	3.4824	0.1049
2.8318	1.8126	8200	3.4698	0.1053
2.8299	1.8347	8300	3.4737	0.1045
2.8332	1.8568	8400	3.4688	0.1051
2.8274	1.8789	8500	3.4308	0.1054
2.8171	1.9010	8600	3.4647	0.1053
2.8355	1.9231	8700	3.4586	0.1047
2.8031	1.9452	8800	3.4529	0.1049
2.8234	1.9673	8900	3.4379	0.1053
2.8097	1.9894	9000	3.4536	0.1055
2.7828	2.0115	9100	3.4409	0.1055
2.8027	2.0336	9200	3.4506	0.1055
2.7836	2.0557	9300	3.4617	0.1053
2.7874	2.0778	9400	3.4509	0.1050
2.7894	2.0999	9500	3.4132	0.1055
2.7863	2.1220	9600	3.4198	0.1055
2.7663	2.1441	9700	3.4524	0.1054
2.7846	2.1662	9800	3.4518	0.1056
2.7985	2.1883	9900	3.4453	0.1054
2.7947	2.2104	10000	3.4307	0.1056
2.7946	2.2325	10100	3.4598	0.1055
2.783	2.2546	10200	3.4523	0.1055
2.7763	2.2767	10300	3.4441	0.1056
2.7786	2.2989	10400	3.4659	0.1052
2.7672	2.3210	10500	3.4527	0.1053
2.767	2.3431	10600	3.4608	0.1053
2.7972	2.3652	10700	3.4277	0.1060
2.7958	2.3873	10800	3.4488	0.1053
2.774	2.4094	10900	3.4499	0.1056
2.7802	2.4315	11000	3.4281	0.1056
2.7576	2.4536	11100	3.4363	0.1058
2.76	2.4757	11200	3.4393	0.1059
2.7792	2.4978	11300	3.4389	0.1056
2.7804	2.5199	11400	3.4378	0.1060
2.7804	2.5420	11500	3.4236	0.1062
2.7835	2.5641	11600	3.4372	0.1060
2.7444	2.5862	11700	3.4518	0.1058
2.7636	2.6083	11800	3.4181	0.1060
2.7675	2.6304	11900	3.4290	0.1057
2.7487	2.6525	12000	3.4279	0.1058
2.7529	2.6746	12100	3.4300	0.1058
2.7819	2.6967	12200	3.4153	0.1062
2.7595	2.7188	12300	3.4477	0.1058
2.7585	2.7409	12400	3.4171	0.1059
2.7367	2.7630	12500	3.4297	0.1059
2.7701	2.7851	12600	3.4184	0.1058
2.7811	2.8073	12700	3.4334	0.1059
2.768	2.8294	12800	3.4295	0.1062
2.7715	2.8515	12900	3.4443	0.1058
2.7479	2.8736	13000	3.4344	0.1057
2.7479	2.8957	13100	3.4395	0.1059
2.7688	2.9178	13200	3.4270	0.1058
2.7708	2.9399	13300	3.4311	0.1059
2.7443	2.9620	13400	3.4314	0.1059
2.7428	2.9841	13500	3.4300	0.1059
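
The reported evaluation loss (3.4132) matches the row at step 9500, which is the lowest validation loss in the table, rather than the final row. This suggests, though the card does not confirm it, that the best checkpoint was the one evaluated. The sketch below shows how such a log can be scanned; it embeds only four rows copied from the table, and the steps-per-epoch figure is an estimate.

```python
import csv
import io

# Four rows copied from the table above (tab-separated), not the full log.
LOG = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy\n"
    "2.8497\t1.6357\t7400\t3.4165\t0.1071\n"
    "2.7894\t2.0999\t9500\t3.4132\t0.1055\n"
    "2.7819\t2.6967\t12200\t3.4153\t0.1062\n"
    "2.7428\t2.9841\t13500\t3.4300\t0.1059\n"
)

rows = list(csv.DictReader(io.StringIO(LOG), delimiter="\t"))

# Row with the lowest validation loss.
best = min(rows, key=lambda r: float(r["Validation Loss"]))

# Rough steps per epoch, estimated from the last logged row.
last = rows[-1]
steps_per_epoch = int(last["Step"]) / float(last["Epoch"])

print("best step:", best["Step"], "validation loss:", best["Validation Loss"])
print("estimated steps per epoch:", round(steps_per_epoch))
```

Note that the highest accuracy in the table (0.1071) occurs at step 7400, so the lowest-loss row and the highest-accuracy row differ.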

Framework versions

Transformers 4.41.2
Pytorch 2.3.1+cu121
Datasets 2.20.0
Tokenizers 0.19.1
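
To compare a local environment against these versions, a small check along the following lines may help. It is a sketch: the helper name `check_versions` is hypothetical, and the PyPI distribution name `torch` is assumed for the Pytorch entry.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.41.2",
    "torch": "2.3.1+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def check_versions(expected=EXPECTED):
    """Return {package: (wanted, installed or None, matches)}."""
    report = {}
    for pkg, want in expected.items():
        try:
            have = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed
        report[pkg] = (want, have, have == want)
    return report


for pkg, (want, have, ok) in check_versions().items():
    print(f"{pkg}: wanted {want}, found {have}, match={ok}")
```

A mismatch does not necessarily mean the model will fail to load; it only flags a difference from the training environment.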

akari000/gpt-2-artificial-vss-40
