	---
	license: apache-2.0
	base_model: DmitryPogrebnoy/MedRuRobertaLarge
	tags:
	- generated_from_trainer
	metrics:
	- precision
	- recall
	- f1
	- accuracy
	model-index:
	- name: MedRuRobertaLarge_pos
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# MedRuRobertaLarge_pos

	This model is a fine-tuned version of [DmitryPogrebnoy/MedRuRobertaLarge](https://huggingface.co/DmitryPogrebnoy/MedRuRobertaLarge) on an unspecified dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.4867
	- Precision: 0.5088
	- Recall: 0.5257
	- F1: 0.5171
	- Accuracy: 0.8997
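
As a quick consistency check on the figures above, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported values. The function name `f1_score` is a hypothetical helper written for this illustration, not part of the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the evaluation set.
reported_f1 = f1_score(0.5088, 0.5257)
print(round(reported_f1, 4))  # 0.5171, matching the reported F1
```

The zero guard covers a case such as epoch 1 in the training results, where precision and recall are both 0.0.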

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 4
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 100
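
To make the `linear` scheduler setting concrete, here is a minimal sketch of a linear-decay learning-rate rule. It rests on assumptions not stated in this card: no warmup steps, and a total of 5000 optimizer steps (100 epochs at the 50 steps per epoch seen in the results table). `linear_lr` is a hypothetical helper, not a library function.

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 5e-05, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


STEPS_PER_EPOCH = 50   # from the results table: step 50 at epoch 1.0
NUM_EPOCHS = 100       # num_epochs listed above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # assumed schedule length: 5000

for step in (0, 1000, 2100, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate at step 2100 (epoch 42) would be about 2.9e-05, that is, 58% of the initial value.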

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
	|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
	| No log        | 1.0   | 50   | 0.6663          | 0.0       | 0.0    | 0.0    | 0.7639   |
	| No log        | 2.0   | 100  | 0.5206          | 0.0178    | 0.0154 | 0.0165 | 0.8015   |
	| No log        | 3.0   | 150  | 0.4083          | 0.0409    | 0.0617 | 0.0492 | 0.8346   |
	| No log        | 4.0   | 200  | 0.3900          | 0.1300    | 0.2139 | 0.1617 | 0.8368   |
	| No log        | 5.0   | 250  | 0.3372          | 0.1893    | 0.2987 | 0.2317 | 0.8598   |
	| No log        | 6.0   | 300  | 0.2828          | 0.2713    | 0.3622 | 0.3102 | 0.8907   |
	| No log        | 7.0   | 350  | 0.3583          | 0.3625    | 0.4066 | 0.3833 | 0.8890   |
	| No log        | 8.0   | 400  | 0.2786          | 0.3638    | 0.4605 | 0.4065 | 0.8995   |
	| No log        | 9.0   | 450  | 0.3000          | 0.3224    | 0.4181 | 0.3641 | 0.8981   |
	| 0.3576        | 10.0  | 500  | 0.3055          | 0.4872    | 0.5145 | 0.5005 | 0.9085   |
	| 0.3576        | 11.0  | 550  | 0.2949          | 0.4633    | 0.5106 | 0.4858 | 0.9123   |
	| 0.3576        | 12.0  | 600  | 0.3481          | 0.4407    | 0.5723 | 0.4979 | 0.9054   |
	| 0.3576        | 13.0  | 650  | 0.3636          | 0.4814    | 0.5241 | 0.5018 | 0.9054   |
	| 0.3576        | 14.0  | 700  | 0.3186          | 0.4981    | 0.5010 | 0.4995 | 0.9132   |
	| 0.3576        | 15.0  | 750  | 0.3472          | 0.4329    | 0.5780 | 0.4950 | 0.9084   |
	| 0.3576        | 16.0  | 800  | 0.3664          | 0.4843    | 0.5665 | 0.5222 | 0.9177   |
	| 0.3576        | 17.0  | 850  | 0.3666          | 0.4371    | 0.6089 | 0.5089 | 0.9085   |
	| 0.3576        | 18.0  | 900  | 0.4685          | 0.4894    | 0.5356 | 0.5115 | 0.9167   |
	| 0.3576        | 19.0  | 950  | 0.3722          | 0.4309    | 0.5703 | 0.4909 | 0.9154   |
	| 0.0824        | 20.0  | 1000 | 0.3861          | 0.5327    | 0.5645 | 0.5482 | 0.9097   |
	| 0.0824        | 21.0  | 1050 | 0.6866          | 0.5201    | 0.4239 | 0.4671 | 0.8853   |
	| 0.0824        | 22.0  | 1100 | 0.5474          | 0.4616    | 0.6493 | 0.5396 | 0.8934   |
	| 0.0824        | 23.0  | 1150 | 0.4203          | 0.5714    | 0.5857 | 0.5785 | 0.9168   |
	| 0.0824        | 24.0  | 1200 | 0.4038          | 0.3748    | 0.5568 | 0.4481 | 0.8989   |
	| 0.0824        | 25.0  | 1250 | 0.4873          | 0.5564    | 0.5414 | 0.5488 | 0.9123   |
	| 0.0824        | 26.0  | 1300 | 0.4516          | 0.5306    | 0.5838 | 0.5560 | 0.9170   |
	| 0.0824        | 27.0  | 1350 | 0.4349          | 0.5738    | 0.5915 | 0.5825 | 0.9110   |
	| 0.0824        | 28.0  | 1400 | 0.4042          | 0.5250    | 0.5857 | 0.5537 | 0.9083   |
	| 0.0824        | 29.0  | 1450 | 0.4187          | 0.6107    | 0.6166 | 0.6136 | 0.9103   |
	| 0.0475        | 30.0  | 1500 | 0.3910          | 0.4615    | 0.6127 | 0.5265 | 0.9060   |
	| 0.0475        | 31.0  | 1550 | 0.4171          | 0.5541    | 0.6416 | 0.5946 | 0.9133   |
	| 0.0475        | 32.0  | 1600 | 0.4948          | 0.5730    | 0.6127 | 0.5922 | 0.9109   |
	| 0.0475        | 33.0  | 1650 | 0.4637          | 0.5048    | 0.6089 | 0.5520 | 0.9118   |
	| 0.0475        | 34.0  | 1700 | 0.3740          | 0.5431    | 0.6185 | 0.5784 | 0.9213   |
	| 0.0475        | 35.0  | 1750 | 0.4047          | 0.5280    | 0.5992 | 0.5614 | 0.9129   |
	| 0.0475        | 36.0  | 1800 | 0.4010          | 0.5352    | 0.6301 | 0.5788 | 0.9150   |
	| 0.0475        | 37.0  | 1850 | 0.5743          | 0.5905    | 0.5530 | 0.5711 | 0.9108   |
	| 0.0475        | 38.0  | 1900 | 0.4936          | 0.5110    | 0.4913 | 0.5010 | 0.9102   |
	| 0.0475        | 39.0  | 1950 | 0.4450          | 0.4537    | 0.5857 | 0.5114 | 0.9119   |
	| 0.0424        | 40.0  | 2000 | 0.4611          | 0.4983    | 0.5588 | 0.5268 | 0.9130   |
	| 0.0424        | 41.0  | 2050 | 0.4748          | 0.5199    | 0.5279 | 0.5239 | 0.9075   |
	| 0.0424        | 42.0  | 2100 | 0.5121          | 0.5264    | 0.5568 | 0.5412 | 0.9126   |
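
The validation metrics fluctuate from epoch to epoch, so the last logged row is not necessarily the strongest checkpoint. The sketch below shows one way to pick a checkpoint from such a log, using four rows copied from the table. The `EvalRow` type and the choice of rows are illustrative assumptions. The card does not say which selection rule, if any, was used during training.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    val_loss: float
    f1: float


# A subset of rows copied from the results table (epochs 8, 20, 29, 42).
rows = [
    EvalRow(8, 0.2786, 0.4065),
    EvalRow(20, 0.3861, 0.5482),
    EvalRow(29, 0.4187, 0.6136),
    EvalRow(42, 0.5121, 0.5412),
]

# Two common selection rules: highest F1 or lowest validation loss.
best_by_f1 = max(rows, key=lambda r: r.f1)
best_by_loss = min(rows, key=lambda r: r.val_loss)

print("best F1:", best_by_f1)
print("lowest validation loss:", best_by_loss)
```

The two rules disagree here: the lowest validation loss occurs at epoch 8, while the highest F1 occurs at epoch 29. With the Trainer, a rule like this is typically configured through `load_best_model_at_end` and `metric_for_best_model` in `TrainingArguments`.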


	### Framework versions

	- Transformers 4.38.2
	- PyTorch 2.1.2
	- Datasets 2.1.0
	- Tokenizers 0.15.2