---
license: apache-2.0
base_model: facebook/bart-large
tags:
- generated_from_trainer
metrics:
- rouge
- wer
model-index:
- name: bart_extractive_1024_750
  results: []
---

# bart_extractive_1024_750

This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co./facebook/bart-large) on an unknown dataset.
It achieves the following results on the evaluation set (a short usage sketch follows the list):

- Loss: 0.8901
- ROUGE-1: 0.7176
- ROUGE-2: 0.4726
- ROUGE-L: 0.6632
- ROUGE-Lsum: 0.6633
- WER: 0.4177
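
The card does not include a usage example, so here is a minimal inference sketch using the standard Transformers seq2seq API. The repo id, the sample text, and the generation settings are illustrative assumptions; the 1024-token input and 750-token output limits are only inferred from the model name.

```python
# Minimal usage sketch. Assumptions: the hub repo id, the beam-search settings,
# and the input/output lengths (inferred from "1024_750" in the model name).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "bart_extractive_1024_750"  # replace with the full hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

document = "Long input document to summarize ..."
inputs = tokenizer(document, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(**inputs, max_new_tokens=750, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```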

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

- learning_rate: 2e-05
- train_batch_size: 6
- eval_batch_size: 6
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP
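
The original training script is not shown in this card; the following is a hedged sketch of how these values map onto `Seq2SeqTrainingArguments`. The `output_dir`, the evaluation and logging cadence, and `predict_with_generate` are assumptions read off the results table below, not stated in the card.

```python
# Hedged sketch: the listed hyperparameters expressed as Seq2SeqTrainingArguments
# (Transformers 4.38 API). Values marked "assumed" are not stated in the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart_extractive_1024_750",  # assumed
    learning_rate=2e-05,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    seed=42,
    num_train_epochs=2,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    fp16=True,                    # "Native AMP" mixed precision
    evaluation_strategy="steps",  # assumed from the 250-step eval rows below
    eval_steps=250,
    logging_steps=500,            # assumed: training loss appears every 500 steps
    predict_with_generate=True,   # assumed: needed for ROUGE/WER at eval time
)
```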

### Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | WER    |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:----------:|:------:|
| No log        | 0.13  | 250  | 1.1639          | 0.6758  | 0.4064  | 0.6138  | 0.6136     | 0.4827 |
| 2.0440        | 0.27  | 500  | 1.0693          | 0.6853  | 0.4267  | 0.6258  | 0.6256     | 0.4594 |
| 2.0440        | 0.40  | 750  | 1.0210          | 0.6982  | 0.4409  | 0.6399  | 0.6399     | 0.4520 |
| 1.1195        | 0.53  | 1000 | 0.9865          | 0.6989  | 0.4442  | 0.6400  | 0.6400     | 0.4449 |
| 1.1195        | 0.66  | 1250 | 0.9697          | 0.7007  | 0.4476  | 0.6430  | 0.6429     | 0.4407 |
| 1.0531        | 0.80  | 1500 | 0.9680          | 0.7009  | 0.4495  | 0.6451  | 0.6450     | 0.4384 |
| 1.0531        | 0.93  | 1750 | 0.9346          | 0.7099  | 0.4587  | 0.6538  | 0.6539     | 0.4323 |
| 1.0109        | 1.06  | 2000 | 0.9249          | 0.7066  | 0.4589  | 0.6519  | 0.6518     | 0.4295 |
| 1.0109        | 1.20  | 2250 | 0.9221          | 0.7092  | 0.4627  | 0.6541  | 0.6540     | 0.4270 |
| 0.9199        | 1.33  | 2500 | 0.9117          | 0.7134  | 0.4668  | 0.6583  | 0.6582     | 0.4240 |
| 0.9199        | 1.46  | 2750 | 0.9064          | 0.7147  | 0.4676  | 0.6593  | 0.6592     | 0.4225 |
| 0.9164        | 1.60  | 3000 | 0.8996          | 0.7164  | 0.4701  | 0.6612  | 0.6611     | 0.4212 |
| 0.9164        | 1.73  | 3250 | 0.9006          | 0.7140  | 0.4695  | 0.6602  | 0.6601     | 0.4201 |
| 0.8861        | 1.86  | 3500 | 0.8893          | 0.7176  | 0.4735  | 0.6635  | 0.6635     | 0.4176 |
| 0.8861        | 1.99  | 3750 | 0.8901          | 0.7176  | 0.4726  | 0.6632  | 0.6633     | 0.4177 |
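
The `compute_metrics` function that produced these columns is not included in the card. As a hedged sketch, comparable ROUGE and WER scores can be computed with the `evaluate` library (which needs the `rouge_score` and `jiwer` packages installed); the prediction and reference strings here are placeholders.

```python
# Sketch of metric computation with the `evaluate` library; the original
# compute_metrics function is not shown in this card. Placeholder strings only.
import evaluate

rouge = evaluate.load("rouge")  # requires `pip install rouge_score`
wer = evaluate.load("wer")      # requires `pip install jiwer`

predictions = ["the cat sat on the mat"]
references = ["the cat lay on the mat"]

rouge_scores = rouge.compute(predictions=predictions, references=references)
wer_score = wer.compute(predictions=predictions, references=references)
print({**rouge_scores, "wer": wer_score})
```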

### Framework versions

- Transformers 4.38.2
- PyTorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
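
For reproducibility, the versions above can be checked at runtime. This is an optional sketch, not part of the original card; the expected versions are copied from the list above.

```python
# Optional environment check: compares installed package versions against the
# versions listed in this card's "Framework versions" section.
import datasets, tokenizers, torch, transformers

expected = {
    "transformers": "4.38.2",
    "torch": "2.2.1+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "OK" if installed[name] == want else f"MISMATCH (have {installed[name]})"
    print(f"{name}=={want}: {status}")
```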