---
datasets:
- EdinburghNLP/xsum
language:
- en
metrics:
- rouge
base_model:
- facebook/bart-base
pipeline_tag: summarization
library_name: transformers
---

# BART-Base XSum Summarization Model

## Model Description

This model is a sequence-to-sequence transformer based on the BART architecture. It was created by fine-tuning `facebook/bart-base` on the [XSum](https://huggingface.co./datasets/EdinburghNLP/xsum) dataset, which consists of BBC news articles paired with single-sentence summaries.

## Model Training Details

### Training Dataset

- **Dataset:** [XSum](https://huggingface.co./datasets/EdinburghNLP/xsum)
- **Splits:**
  - **Train:** 204,045 examples (filtered to 203,966 examples)
  - **Validation:** 11,332 examples (filtered to 11,326 examples)
  - **Test:** 11,334 examples (filtered to 11,331 examples)
- **Preprocessing** (a hedged code sketch appears at the end of this card):
  - Tokenization of documents and summaries using the `facebook/bart-base` tokenizer.
  - Filtering out examples with very short documents or summaries.
  - Truncation of inputs to a maximum of 1024 tokens for documents and 512 tokens for summaries.

### Training Configuration

The model was fine-tuned with the `Seq2SeqTrainer` from the Hugging Face Transformers library using the following training arguments (a code sketch of this configuration also appears at the end of this card):

- **Evaluation Strategy:** evaluation at the end of each epoch
- **Learning Rate:** 3e-5
- **Batch Size:**
  - **Training:** 16 per device
  - **Evaluation:** 32 per device
- **Gradient Accumulation Steps:** 1
- **Weight Decay:** 0.01
- **Number of Epochs:** 5
- **Warmup Steps:** 1000
- **Learning Rate Scheduler:** cosine
- **Label Smoothing Factor:** 0.1
- **Mixed Precision:** FP16 enabled
- **Prediction:** `predict_with_generate` is used to compute summaries during evaluation
- **Metric for Best Model:** `rougeL`

## Model Results

### Evaluation Metrics

After fine-tuning, the model achieved the following scores:

| Metric     | Validation | Test    |
|------------|-----------:|--------:|
| Eval Loss  | 3.0508     | 3.0607  |
| ROUGE-1    | 39.2079    | 39.2149 |
| ROUGE-2    | 17.8686    | 17.7573 |
| ROUGE-L    | 32.4777    | 32.4190 |
| ROUGE-Lsum | 32.4734    | 32.4020 |

### Final Training Loss

- **Final Training Loss:** 2.9226
- **Final Validation Loss:** 3.0508

## Model Usage

You can use the model for summarization with the Hugging Face `pipeline`. Below is an example:

```python
from transformers import pipeline

# Load the summarization pipeline using the fine-tuned model
summarizer = pipeline("summarization", model="Prikshit7766/bart-base-xsum")

# Input text for summarization
text = (
    "In a significant breakthrough in renewable energy, scientists have developed "
    "a novel solar panel technology that promises to dramatically reduce costs and "
    "increase efficiency. The new panels are lighter, more durable, and easier to install "
    "than conventional models, marking a major advancement in sustainable energy solutions. "
    "Experts believe this innovation could lead to wider adoption of solar power across residential "
    "and commercial sectors, ultimately reducing global reliance on fossil fuels."
)

# Generate summary
summary = summarizer(text)[0]["summary_text"]
print("Generated Summary:", summary)
```

**Example Output:**

```
Generated Summary: Scientists at the University of California, Berkeley, have developed a new type of solar panel.
```
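Note that the example output shows a known limitation of abstractive models fine-tuned on XSum: the summary introduces a specific detail ("University of California, Berkeley") that does not appear in the input, so generated summaries should be checked against the source text.

You can also pass generation parameters through the pipeline call. The values below are illustrative assumptions, not the settings used to produce the example output above:

```python
# Illustrative generation settings (assumptions, not values from this card)
summary = summarizer(
    text,
    max_length=64,   # XSum reference summaries are short, single sentences
    min_length=10,
    num_beams=4,     # beam search often improves ROUGE over greedy decoding
)[0]["summary_text"]
```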
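## Appendix: Preprocessing Sketch

The exact preprocessing script is not included with this card. The following is a minimal sketch consistent with the steps listed under *Training Dataset*; the filtering thresholds (`min_doc_words`, `min_summary_words`) are assumptions, since the card only says that very short documents and summaries were removed.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
raw = load_dataset("EdinburghNLP/xsum")

# Assumed cutoffs -- the card does not state the exact filtering thresholds.
min_doc_words, min_summary_words = 10, 3

def keep(example):
    # Drop examples with very short documents or summaries.
    return (
        len(example["document"].split()) >= min_doc_words
        and len(example["summary"].split()) >= min_summary_words
    )

def preprocess(batch):
    # Tokenize documents (inputs, up to 1024 tokens) and summaries (labels, up to 512 tokens).
    model_inputs = tokenizer(batch["document"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=512, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.filter(keep).map(
    preprocess, batched=True, remove_columns=["document", "summary", "id"]
)
```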
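## Appendix: Training Arguments Sketch

Continuing from the preprocessing sketch above, a `Seq2SeqTrainingArguments` configuration matching the values listed under *Training Configuration* might look like the following. The `output_dir`, `save_strategy`, and `load_best_model_at_end` values are assumptions: tracking `rougeL` as the best-model metric requires checkpointing on the same schedule as evaluation, and the card does not state these settings.

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-xsum",   # assumption: output path is not stated in the card
    evaluation_strategy="epoch",
    save_strategy="epoch",         # assumption: required for load_best_model_at_end
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=1,
    weight_decay=0.01,
    num_train_epochs=5,
    warmup_steps=1000,
    lr_scheduler_type="cosine",
    label_smoothing_factor=0.1,
    fp16=True,
    predict_with_generate=True,
    metric_for_best_model="rougeL",
    load_best_model_at_end=True,   # assumption: implied by metric_for_best_model
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    # compute_metrics would wrap the ROUGE metric; omitted here for brevity.
)
trainer.train()
```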