dfurman/deberta-v3-large-imdb-v0.2

	---
	license: mit
	base_model: microsoft/deberta-v3-large
	datasets:
	- imdb
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	model-index:
	- name: deberta-v3-large-imdb-v0.2
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# deberta-v3-large-imdb-v0.2

	This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) on the `imdb` dataset.
	The checkpoint loaded here is the best model, taken at epoch 9 of 10. It achieves the following results on the evaluation set:
	- Accuracy: 0.9656
	- F1: 0.9657
	- Precision: 0.9640
	- Recall: 0.9673
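
A minimal inference sketch, assuming the checkpoint is available on the Hub under the repo id `dfurman/deberta-v3-large-imdb-v0.2` and that the `transformers` library is installed. The label names in the output come from the checkpoint's config, which this card does not document, so the sketch returns them unmodified. `classify` is a hypothetical helper name.

```python
from typing import Dict, List

# Repo id as shown in the page header of this card.
MODEL_ID = "dfurman/deberta-v3-large-imdb-v0.2"


def classify(texts: List[str]) -> List[Dict]:
    """Run text classification on a batch of review texts.

    The import is deferred so that loading this module neither requires
    transformers nor triggers a model download.
    """
    from transformers import pipeline

    clf = pipeline("text-classification", model=MODEL_ID)
    # truncation=True clips reviews longer than the model's maximum length.
    return clf(texts, truncation=True)


# Example call (downloads the model weights on first use):
# classify(["A moving, beautifully shot film.",
#           "Two hours I will never get back."])
```

Each returned item is a dict with a `label` and a `score` key.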

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.2
	- num_epochs: 10
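
The schedule implied by `lr_scheduler_type: cosine` and `lr_scheduler_warmup_ratio: 0.2` can be sketched in plain Python. This is an illustration, not the Trainer's own code: it assumes a linear warmup followed by a half-cosine decay to zero, and it takes the total of 31,250 optimizer steps from the final row of the training results. `lr_at_step` is a hypothetical helper.

```python
import math

BASE_LR = 2e-05        # learning_rate
TOTAL_STEPS = 31250    # final step in the training results (10 epochs x 3125)
WARMUP_RATIO = 0.2     # lr_scheduler_warmup_ratio
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)  # 6250 steps


def lr_at_step(step: int) -> float:
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the base learning rate.
        return BASE_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from the base learning rate down to 0.
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 3125, 6250, 18750, 31250):
    print(step, f"{lr_at_step(step):.2e}")
```

Under these assumptions the warmup spans the first 6,250 steps, which corresponds to the first two epochs, and the learning rate peaks at 2e-05 before decaying.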

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss | Accuracy | F1     | Precision | Recall |
	|:-------------:|:-----:|:-----:|:---------------:|:--------:|:------:|:---------:|:------:|
	| 0.2279        | 1.0   | 3125  | 0.1466          | 0.9603   | 0.9599 | 0.9693    | 0.9506 |
	| 0.2689        | 2.0   | 6250  | 0.1929          | 0.9550   | 0.9546 | 0.9626    | 0.9467 |
	| 0.1728        | 3.0   | 9375  | 0.1807          | 0.9584   | 0.9579 | 0.9697    | 0.9463 |
	| 0.1937        | 4.0   | 12500 | 0.1734          | 0.9435   | 0.9457 | 0.9102    | 0.9841 |
	| 0.2044        | 5.0   | 15625 | 0.2102          | 0.9510   | 0.9523 | 0.9272    | 0.9788 |
	| 0.0484        | 6.0   | 18750 | 0.2134          | 0.9593   | 0.9599 | 0.9448    | 0.9756 |
	| 0.0336        | 7.0   | 21875 | 0.2278          | 0.9610   | 0.9614 | 0.9524    | 0.9706 |
	| 0.0704        | 8.0   | 25000 | 0.2039          | 0.9648   | 0.9651 | 0.9581    | 0.9721 |
	| 0.0004        | 9.0   | 28125 | 0.2241          | 0.9656   | 0.9657 | 0.9640    | 0.9673 |
	| 0.0004        | 10.0  | 31250 | 0.2233          | 0.9653   | 0.9654 | 0.9637    | 0.9670 |
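
As a sanity check on the table, the sketch below recomputes F1 as the harmonic mean of precision and recall for each epoch and picks the epoch with the highest accuracy. The values are copied from the rows above; the names `ROWS` and `f1_from` are illustrative only.

```python
# (epoch, accuracy, f1, precision, recall), copied from the results table.
ROWS = [
    (1, 0.9603, 0.9599, 0.9693, 0.9506),
    (2, 0.9550, 0.9546, 0.9626, 0.9467),
    (3, 0.9584, 0.9579, 0.9697, 0.9463),
    (4, 0.9435, 0.9457, 0.9102, 0.9841),
    (5, 0.9510, 0.9523, 0.9272, 0.9788),
    (6, 0.9593, 0.9599, 0.9448, 0.9756),
    (7, 0.9610, 0.9614, 0.9524, 0.9706),
    (8, 0.9648, 0.9651, 0.9581, 0.9721),
    (9, 0.9656, 0.9657, 0.9640, 0.9673),
    (10, 0.9653, 0.9654, 0.9637, 0.9670),
]


def f1_from(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Largest gap between the reported F1 and the recomputed one.
max_gap = max(abs(f1 - f1_from(p, r)) for _, _, f1, p, r in ROWS)

# Epoch with the highest evaluation accuracy.
best_epoch = max(ROWS, key=lambda row: row[1])[0]

print(f"max F1 gap: {max_gap:.1e}")
print(f"best epoch by accuracy: {best_epoch}")
```

The gap stays below 1e-4, which is consistent with the inputs being rounded to four decimals, and epoch 9 has the highest accuracy, matching the checkpoint described at the top of the card.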


	### Framework versions

	- Transformers 4.39.2
	- PyTorch 2.2.0+cu121
	- Datasets 2.18.0
	- Tokenizers 0.15.2
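
To compare a local environment against the versions listed above, a small check such as the following could be used. It is a sketch: the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed, and `installed_versions` and `find_mismatches` are hypothetical helpers.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Tuple

# Versions listed in this card, keyed by assumed distribution name.
CARD_VERSIONS = {
    "transformers": "4.39.2",
    "torch": "2.2.0+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def installed_versions() -> Dict[str, Optional[str]]:
    """Look up the installed version of each package, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in CARD_VERSIONS:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(
    installed: Dict[str, Optional[str]],
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (expected, installed)} for every differing version."""
    return {
        name: (expected, installed.get(name))
        for name, expected in CARD_VERSIONS.items()
        if installed.get(name) != expected
    }


for name, (expected, actual) in find_mismatches(installed_versions()).items():
    print(f"{name}: card lists {expected}, found {actual}")
```

A differing version does not necessarily prevent the model from loading; the check only reports where the environment departs from the one recorded here.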