license: mit
datasets:
  - abisee/cnn_dailymail
language:
  - en
metrics:
  - rouge
  - bleu
base_model:
  - google-t5/t5-small
pipeline_tag: summarization
library_name: transformers

Model Card for t5_small Summarization Model

Model Details

  • Model Architecture: T5 (Text-to-Text Transfer Transformer)
  • Variant: t5-small
  • Task: Text Summarization
  • Framework: Hugging Face Transformers

Training Data

  • Dataset: CNN/DailyMail
  • Content: News articles and their summaries
  • Size: Approximately 300,000 article-summary pairs

Training Procedure

  • Fine-tuning method: Fine-tuned from the t5-small checkpoint with the Hugging Face Transformers library
  • Hyperparameters:
    • Learning rate: 5e-5
    • Batch size: 8
    • Number of epochs: 3
  • Optimizer: AdamW
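
As a rough sketch, the hyperparameters above could be mapped onto a Transformers training configuration as shown below. The original training script is not included in this card, so everything beyond the three listed values is an assumption: the names `HPARAMS`, `total_optimizer_steps`, and `build_training_args` are hypothetical, the batch size is assumed to be per device, and the step count assumes a single device, no gradient accumulation, and all of the roughly 300,000 pairs being used for training.

```python
import math

# Hyperparameters as listed in this model card.
HPARAMS = {"learning_rate": 5e-5, "batch_size": 8, "num_epochs": 3}


def total_optimizer_steps(num_examples, batch_size, num_epochs):
    """Optimizer steps for a run on one device with no gradient accumulation."""
    return math.ceil(num_examples / batch_size) * num_epochs


def build_training_args(output_dir):
    """Map the listed hyperparameters onto Seq2SeqTrainingArguments (a sketch)."""
    # Imported lazily so the arithmetic above runs without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        learning_rate=HPARAMS["learning_rate"],
        # Assumption: the card's "batch size" is the per-device batch size.
        per_device_train_batch_size=HPARAMS["batch_size"],
        num_train_epochs=HPARAMS["num_epochs"],
        optim="adamw_torch",  # AdamW, as listed above
        predict_with_generate=True,  # generate summaries during evaluation
    )


# Under the stated assumptions: 37,500 steps per epoch, 3 epochs.
print(total_optimizer_steps(300_000, HPARAMS["batch_size"], HPARAMS["num_epochs"]))  # 112500
```

The resulting arguments would typically be passed to a `Seq2SeqTrainer` together with the tokenized dataset and a `DataCollatorForSeq2Seq`.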

How to Use

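  1. Install the Hugging Face Transformers library (the example below also needs PyTorch):
pip install transformers torch
  2. Load the tokenizer and model:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Note: "t5-small" loads the base checkpoint; replace it with this
# repository's model ID to load the fine-tuned weights.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
  3. Generate a summary:
input_text = "Your input text here"
# T5 expects a task prefix; inputs longer than 512 tokens are truncated.
inputs = tokenizer("summarize: " + input_text, return_tensors="pt", max_length=512, truncation=True)
# Beam search with 4 beams; the summary is constrained to 40 to 150 tokens.
summary_ids = model.generate(**inputs, max_length=150, min_length=40, length_penalty=2.0, num_beams=4, early_stopping=True)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)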

Evaluation

  • Metric: ROUGE scores (Recall-Oriented Understudy for Gisting Evaluation)
  • Exact scores are not available; summarization models of this kind are typically evaluated on:
    • ROUGE-1 (unigram overlap)
    • ROUGE-2 (bigram overlap)
    • ROUGE-L (longest common subsequence)
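
To make the three variants concrete, here is a minimal pure-Python sketch of how they can be computed as F1 scores. It is a simplification for illustration, not the scorer used for this model: it assumes lowercased whitespace tokenization with no stemming, and the function names are hypothetical. Published results normally come from a dedicated ROUGE package, whose numbers will differ slightly.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/candidate) and recall (overlap/reference)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # & keeps the minimum count per n-gram
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


candidate = "the cat sat on the mat"
reference = "the cat lay on the mat"
print(round(rouge_n(candidate, reference, 1), 4))  # 0.8333
print(round(rouge_n(candidate, reference, 2), 4))  # 0.6
print(round(rouge_l(candidate, reference), 4))     # 0.8333
```

In the example, five of six unigrams and three of five bigrams match, and the longest common subsequence ("the cat on the mat") has five tokens.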

Limitations

  • Performance may be lower than that of larger T5 variants
  • Optimized for news article summarization; may not perform as well on other text types
  • Limited to input sequences of 512 tokens; longer inputs are truncated
  • Generated summaries may sometimes contain factual inaccuracies
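
One common workaround for the 512-token limit is to split a long document into overlapping windows, summarize each window, and join the partial summaries. The sketch below shows only the windowing step on a plain list of token IDs. It is an assumption about how a user might handle long inputs, not part of this model's pipeline, and `chunk_token_ids` with its default `overlap` is a hypothetical helper.

```python
def chunk_token_ids(token_ids, max_tokens=512, overlap=32):
    """Split a list of token IDs into windows of at most max_tokens.

    Consecutive windows share `overlap` tokens so that sentences cut at a
    boundary still appear whole in at least one window.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must satisfy 0 <= overlap < max_tokens")

    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_tokens])
        if start + max_tokens >= len(token_ids):
            break  # the last window already reaches the end of the input
    return chunks


# Toy example with small numbers so the windows are easy to read.
print(chunk_token_ids(list(range(10)), max_tokens=4, overlap=1))
# [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

In practice the IDs would come from `tokenizer(text, add_special_tokens=False)["input_ids"]`, and `max_tokens` should be set somewhat below 512 to leave room for the "summarize: " prefix and the end-of-sequence token. Each window can be turned back into text with `tokenizer.decode` and summarized as in the usage section.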

Ethical Considerations

  • May inherit biases present in the CNN/DailyMail dataset
  • Not suitable for summarizing sensitive or critical information without human review
  • Users should be aware of potential biases and inaccuracies in generated summaries
  • Should not be used as a sole source of information for decision-making processes