scenario-KD-SCR-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_all66sss

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cardiffnlp_tweet_sentiment_multilingual_all on the tweet_sentiment_multilingual dataset. On the evaluation set it achieves the following results, which correspond to the final logged evaluation (step 22500):

Loss: 98.0968
Accuracy: 0.4425
F1: 0.4324

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 32
seed: 66
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
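
The relationship between the batch-size settings and the linear schedule can be illustrated with a minimal sketch. It assumes a single training device and no warmup, since neither a device count nor warmup steps are listed above. The `linear_lr` helper and its `total_steps` argument are hypothetical names used for illustration only and are not taken from the training script.

```python
# Values copied from the hyperparameter list above.
learning_rate = 5e-05
train_batch_size = 8
gradient_accumulation_steps = 4

# Assumption: a single device, so there is no extra device multiplier.
num_devices = 1

# Gradients from 4 mini-batches of 8 are accumulated before each optimizer step.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(effective_batch_size)  # 32


def linear_lr(step, total_steps, base_lr=learning_rate, warmup_steps=0):
    """Hypothetical helper: optional linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the learning rate starts at 5e-05 and falls linearly to zero at the last optimizer step, which matches the reported total_train_batch_size of 32.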

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
208.3506	1.0875	500	154.6242	0.3353	0.2392
144.3425	2.1751	1000	136.8450	0.3592	0.2854
130.8787	3.2626	1500	127.5179	0.3449	0.2781
121.607	4.3502	2000	120.6836	0.3441	0.2003
114.9253	5.4377	2500	116.0772	0.3391	0.1845
110.2614	6.5253	3000	112.7433	0.3638	0.3057
106.3691	7.6128	3500	110.4269	0.3368	0.1864
103.447	8.7004	4000	108.3496	0.3457	0.2213
101.1211	9.7879	4500	106.8976	0.3515	0.2545
99.1339	10.8755	5000	105.7209	0.3526	0.2304
97.4903	11.9630	5500	104.6693	0.3611	0.2644
95.9649	13.0506	6000	104.2819	0.3762	0.3144
94.5941	14.1381	6500	103.7839	0.3522	0.2629
93.4504	15.2257	7000	103.1094	0.3634	0.2659
92.4911	16.3132	7500	102.6213	0.3708	0.2965
91.3827	17.4008	8000	101.7905	0.3553	0.2367
90.5461	18.4883	8500	101.6541	0.3920	0.3622
89.7497	19.5759	9000	101.0513	0.3893	0.3294
89.0327	20.6634	9500	100.9607	0.3819	0.3518
88.2506	21.7510	10000	100.4227	0.4024	0.3914
87.6247	22.8385	10500	100.3990	0.3688	0.2909
86.9539	23.9260	11000	100.0497	0.3916	0.3222
86.4517	25.0136	11500	100.0023	0.3816	0.3035
85.8318	26.1011	12000	99.8234	0.4086	0.3865
85.3791	27.1887	12500	99.6990	0.4066	0.3778
84.9131	28.2762	13000	99.4542	0.3947	0.3581
84.3986	29.3638	13500	99.2787	0.4151	0.4025
84.0077	30.4513	14000	99.1953	0.4140	0.3797
83.7463	31.5389	14500	99.0662	0.4225	0.3824
83.4825	32.6264	15000	99.0083	0.3954	0.3351
82.892	33.7140	15500	98.7523	0.4244	0.4113
82.6493	34.8015	16000	98.6114	0.4090	0.3895
82.4001	35.8891	16500	98.6116	0.4213	0.3906
82.1674	36.9766	17000	98.5904	0.4348	0.4099
81.9281	38.0642	17500	98.3501	0.4128	0.3718
81.6301	39.1517	18000	98.3880	0.4294	0.4104
81.4927	40.2393	18500	98.2517	0.4279	0.4222
81.3458	41.3268	19000	98.1944	0.4317	0.4149
81.1125	42.4144	19500	98.1906	0.4155	0.3674
80.9588	43.5019	20000	98.2918	0.4336	0.4209
80.907	44.5895	20500	98.1503	0.4317	0.4038
80.8419	45.6770	21000	98.1176	0.4221	0.4068
80.5292	46.7645	21500	98.1451	0.4437	0.4328
80.5741	47.8521	22000	98.0707	0.4344	0.4210
80.5482	48.9396	22500	98.0968	0.4425	0.4324
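
The final row is not the best one on every metric. As a small sketch, the last four rows of the table (which contain the extremes of the full log) can be compared directly. The tuple layout and variable names below are illustrative choices, not part of the original training code.

```python
# Rows copied from the table above: (step, validation loss, accuracy, F1).
rows = [
    (21000, 98.1176, 0.4221, 0.4068),
    (21500, 98.1451, 0.4437, 0.4328),
    (22000, 98.0707, 0.4344, 0.4210),
    (22500, 98.0968, 0.4425, 0.4324),
]

# Pick the checkpoint step that is best under each criterion.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_accuracy = max(rows, key=lambda r: r[2])
best_by_f1 = max(rows, key=lambda r: r[3])

print(best_by_loss[0])      # 22000
print(best_by_accuracy[0])  # 21500
print(best_by_f1[0])        # 21500
```

Which checkpoint counts as "best" therefore depends on the selection metric. The reported results use the last logged step, 22500.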

Framework versions

Transformers 4.44.2
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.19.1
