
flan-xl-gen6

This model is a fine-tuned version of ybelkada/flan-t5-xl-sharded-bf16 on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.4978
  • Rouge1: 29.5362
  • Rouge2: 20.6621
  • RougeL: 25.7689
  • RougeLsum: 26.2351
  • Gen Len (mean generated length): 12.7388
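
ROUGE scores above are on the 0–100 scale. Below is a minimal inference sketch, assuming the checkpoint loads as a standard Hugging Face seq2seq model; the `summarize:` prefix and generation settings are illustrative placeholders, since the training task is not documented here.

```python
# Minimal inference sketch for this checkpoint.
# Assumption: the repo loads as a standard T5-style seq2seq model; the
# "summarize:" prefix is illustrative, since the training data is unspecified.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "devvanshhh/flan-xl-gen6"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer(
    "summarize: The quick brown fox jumped over the lazy dog.",
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```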

Model description

A sequence-to-sequence model fine-tuned from ybelkada/flan-t5-xl-sharded-bf16, with roughly 2.85B parameters (per the repository metadata). No further description has been provided.

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 800
  • num_epochs: 8
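
For reference, here is a sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments` in Transformers. The output directory and `predict_with_generate` flag are assumptions, not stated in the card; Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer's default optimizer, so it needs no explicit argument.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch: the reported hyperparameters expressed as training arguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-xl-gen6",      # placeholder, not from the card
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=800,
    num_train_epochs=8,
    predict_with_generate=True,     # assumed, since ROUGE/Gen Len are reported
)
```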

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 328  | 0.6921          | 34.9112 | 26.7503 | 31.4124 | 31.7295   | 10.0172 |
| 6.8746        | 2.0   | 656  | 0.6025          | 33.9134 | 25.3236 | 30.1968 | 30.472    | 10.8454 |
| 6.8746        | 3.0   | 984  | 0.5687          | 31.6178 | 22.9463 | 27.8758 | 28.3572   | 11.8729 |
| 0.6462        | 4.0   | 1312 | 0.5355          | 30.8157 | 22.1783 | 27.1641 | 27.569    | 12.1306 |
| 0.5618        | 5.0   | 1640 | 0.5160          | 29.9183 | 21.0842 | 26.1671 | 26.5965   | 12.5017 |
| 0.5618        | 6.0   | 1968 | 0.5025          | 29.7823 | 21.1443 | 26.0286 | 26.5215   | 12.5086 |
| 0.498         | 7.0   | 2296 | 0.4978          | 29.1043 | 20.2391 | 25.3347 | 25.804    | 12.8969 |
| 0.4551        | 8.0   | 2624 | 0.4978          | 29.5362 | 20.6621 | 25.7689 | 26.2351   | 12.7388 |
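
ROUGE scores like those in the table above are conventionally computed with the `evaluate` library; the exact metric configuration used for this card is not stated, so the call below is only a sketch with toy inputs.

```python
import evaluate

# Sketch: computing ROUGE the way Transformers summarization examples do.
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["a cat was sitting on the mat"],
    use_stemmer=True,  # assumed; the card does not state stemming settings
)
# `evaluate` returns fractions in [0, 1]; the card reports them scaled by 100.
print({name: round(value * 100, 4) for name, value in scores.items()})
```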

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.1.0+cu118
  • Datasets 2.15.0
  • Tokenizers 0.15.0