---
license: mit
datasets:
- abisee/cnn_dailymail
language:
- en
metrics:
- rouge
- bleu
base_model:
- google-t5/t5-small
pipeline_tag: summarization
library_name: transformers
---
|
# Model Card for t5_small Summarization Model
|
|
|
## Model Details

- Model Architecture: T5 (Text-to-Text Transfer Transformer)
- Variant: t5-small (about 60M parameters)
- Task: Abstractive text summarization
- Framework: Hugging Face Transformers
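
For orientation, the variant's dimensions can be read from its configuration. This is a minimal sketch using the public `t5-small` checkpoint; substitute this model's Hub repository ID if you want to inspect the fine-tuned weights instead:

```python
from transformers import AutoConfig, AutoModelForSeq2SeqLM

# t5-small layout: 6 layers per encoder/decoder stack, 8 attention heads, d_model = 512
config = AutoConfig.from_pretrained("t5-small")
print(config.num_layers, config.num_heads, config.d_model)

# Roughly 60M trainable parameters
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
print(sum(p.numel() for p in model.parameters()))
```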
|
|
|
## Training Data

- Dataset: CNN/DailyMail (see the loading sketch below)
- Content: News articles paired with human-written highlight summaries
- Size: Approximately 300,000 article-summary pairs
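
The dataset can be loaded directly with the `datasets` library (`pip install datasets`). A minimal sketch, assuming the standard `3.0.0` configuration and its `article`/`highlights` fields:

```python
from datasets import load_dataset

# CNN/DailyMail: news articles paired with highlight summaries
dataset = load_dataset("abisee/cnn_dailymail", "3.0.0")

print(dataset)                    # train/validation/test splits (~287k/13k/11k examples)
example = dataset["train"][0]
print(example["article"][:200])   # source news article
print(example["highlights"])      # reference summary
```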
|
|
|
## Training Procedure

- Fine-tuning method: Sequence-to-sequence fine-tuning with the Hugging Face Transformers library (a sketch follows this list)
- Hyperparameters:
  - Learning rate: 5e-5
  - Batch size: 8
  - Number of epochs: 3
  - Optimizer: AdamW
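
The exact training script is not included in this card. The following is a hedged sketch of how a comparable run could be set up with `Seq2SeqTrainer` using the hyperparameters above; the preprocessing details (the `summarize:` prefix and the 512/150 token limits) are assumptions, not a record of the actual run:

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq,
                          Seq2SeqTrainingArguments, Seq2SeqTrainer)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
dataset = load_dataset("abisee/cnn_dailymail", "3.0.0")

def preprocess(batch):
    # T5 uses a task prefix; truncate articles and summaries (assumed limits)
    inputs = tokenizer(["summarize: " + a for a in batch["article"]],
                       max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["highlights"], max_length=150, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-small-cnn-dailymail",
    learning_rate=5e-5,                 # hyperparameters from the list above
    per_device_train_batch_size=8,
    num_train_epochs=3,
    optim="adamw_torch",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```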
|
|
|
## How to Use

1. Install the Hugging Face Transformers library:

```bash
pip install transformers
```
|
|
|
2. Load the model:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# "t5-small" is the base checkpoint; replace it with this model's Hub repository ID
# to load the fine-tuned summarization weights.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
```
|
|
|
3. Generate a summary:

```python
input_text = "Your input text here"

# T5 expects the "summarize:" task prefix; inputs are truncated to the 512-token limit.
inputs = tokenizer("summarize: " + input_text, return_tensors="pt", max_length=512, truncation=True)
summary_ids = model.generate(inputs["input_ids"], max_length=150, min_length=40, length_penalty=2.0, num_beams=4, early_stopping=True)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```
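
Alternatively, the same steps can be wrapped in the high-level `pipeline` API. A minimal sketch; as above, substitute this model's Hub ID for `t5-small` to use the fine-tuned weights:

```python
from transformers import pipeline

# The summarization pipeline applies the T5 task prefix automatically
summarizer = pipeline("summarization", model="t5-small", tokenizer="t5-small")
result = summarizer("Your input text here", max_length=150, min_length=40, truncation=True)
print(result[0]["summary_text"])
```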
|
|
|
## Evaluation

- Metric: ROUGE (Recall-Oriented Understudy for Gisting Evaluation); see the evaluation sketch below
- Exact scores are not reported here, but summarization models on this dataset are typically evaluated on:
  - ROUGE-1 (unigram overlap)
  - ROUGE-2 (bigram overlap)
  - ROUGE-L (longest common subsequence)
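
If you want to compute ROUGE yourself, the `evaluate` library (`pip install evaluate rouge_score`) provides it. This is a hedged sketch over a small slice of the test split; the generation settings mirror the usage code above and are not the exact evaluation script:

```python
import evaluate
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
rouge = evaluate.load("rouge")

# Small test slice for illustration; use the full split for a real evaluation
test_set = load_dataset("abisee/cnn_dailymail", "3.0.0", split="test[:16]")

predictions = []
for article in test_set["article"]:
    inputs = tokenizer("summarize: " + article, return_tensors="pt", max_length=512, truncation=True)
    ids = model.generate(inputs["input_ids"], max_length=150, min_length=40, num_beams=4)
    predictions.append(tokenizer.decode(ids[0], skip_special_tokens=True))

scores = rouge.compute(predictions=predictions, references=test_set["highlights"])
print(scores)  # rouge1, rouge2, rougeL, rougeLsum F-measures
```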
|
|
|
## Limitations

- Performance may be lower than that of larger T5 variants
- Optimized for news article summarization; may not perform as well on other text types
- Limited to input sequences of 512 tokens
- Generated summaries may sometimes contain factual inaccuracies
|
|
|
## Ethical Considerations

- May inherit biases present in the CNN/DailyMail dataset
- Not suitable for summarizing sensitive or critical information without human review
- Users should be aware of potential biases and inaccuracies in generated summaries
- Should not be used as a sole source of information for decision-making processes