BART-Base XSum Summarization Model

Model Description

This is a sequence-to-sequence transformer based on the BART architecture, fine-tuned from the facebook/bart-base checkpoint on the XSum dataset, which pairs BBC news articles with single-sentence summaries. The fine-tuned model has roughly 139M parameters.

Model Training Details

Training Dataset

  • Dataset: XSum
  • Splits:
    • Train: 204,045 examples (filtered to 203,966 examples)
    • Validation: 11,332 examples (filtered to 11,326 examples)
    • Test: 11,334 examples (filtered to 11,331 examples)
  • Preprocessing (a code sketch follows this list):
    • Tokenization of documents and summaries using the facebook/bart-base tokenizer.
    • Filtering out examples with very short documents or summaries.
    • Truncating inputs to a maximum length of 1024 tokens for documents and 512 tokens for summaries.
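
The exact preprocessing script is not part of this card, but a minimal sketch of the filtering and tokenization steps, assuming the standard datasets/Transformers APIs and hypothetical word-count thresholds for the length filter, could look like this:

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
dataset = load_dataset("xsum")  # newer datasets versions may require the "EdinburghNLP/xsum" repo id

MIN_DOC_WORDS, MIN_SUMMARY_WORDS = 10, 3  # hypothetical thresholds for "very short" examples

def keep_example(example):
    # Drop examples whose document or summary is very short
    return (len(example["document"].split()) >= MIN_DOC_WORDS
            and len(example["summary"].split()) >= MIN_SUMMARY_WORDS)

def preprocess(batch):
    # Tokenize documents (inputs) and summaries (labels) with truncation
    model_inputs = tokenizer(batch["document"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=512, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

dataset = dataset.filter(keep_example)
tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset["train"].column_names)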

Training Configuration

The model was fine-tuned using the Seq2SeqTrainer from the Hugging Face Transformers library with the following training arguments (reproduced as a code sketch after the list):

  • Evaluation Strategy: Evaluation at the end of each epoch
  • Learning Rate: 3e-5
  • Batch Size:
    • Training: 16 per device
    • Evaluation: 32 per device
  • Gradient Accumulation Steps: 1
  • Weight Decay: 0.01
  • Number of Epochs: 5
  • Warmup Steps: 1000
  • Learning Rate Scheduler: Cosine scheduler
  • Label Smoothing Factor: 0.1
  • Mixed Precision: FP16 enabled
  • Prediction: Uses predict_with_generate to compute summaries during evaluation
  • Metric for Best Model: rougeL
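
Expressed as Seq2SeqTrainingArguments, these settings correspond roughly to the sketch below; the output directory, the load_best_model_at_end flag, and any option not listed above are assumptions rather than the exact training script:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-xsum",          # assumed output directory
    eval_strategy="epoch",                # "evaluation_strategy" in older Transformers versions
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=1,
    weight_decay=0.01,
    num_train_epochs=5,
    warmup_steps=1000,
    lr_scheduler_type="cosine",
    label_smoothing_factor=0.1,
    fp16=True,
    predict_with_generate=True,
    metric_for_best_model="rougeL",
    load_best_model_at_end=True,          # assumed, so the best rougeL checkpoint is kept
)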

Model Results

Evaluation Metrics

After fine-tuning, the model achieved the following scores (ROUGE values are reported on a 0–100 scale; a short metric-computation sketch follows the list):

  • Validation Set:
    • Eval Loss: 3.0508
    • ROUGE-1: 39.2079
    • ROUGE-2: 17.8686
    • ROUGE-L: 32.4777
    • ROUGE-Lsum: 32.4734
  • Test Set:
    • Eval Loss: 3.0607
    • ROUGE-1: 39.2149
    • ROUGE-2: 17.7573
    • ROUGE-L: 32.4190
    • ROUGE-Lsum: 32.4020
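
ROUGE scores like these are typically computed with the evaluate library's rouge metric over generated and reference summaries; a minimal sketch (with toy strings in place of the real model outputs) is:

import evaluate

rouge = evaluate.load("rouge")

# predictions: model-generated summaries; references: gold XSum summaries
scores = rouge.compute(
    predictions=["Scientists have developed a new type of solar panel."],
    references=["Scientists develop a cheaper, more efficient solar panel."],
)
# evaluate returns values in [0, 1]; multiply by 100 to match the scale above
print({k: round(v * 100, 4) for k, v in scores.items()})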

Final Training Loss

  • Final Training Loss: 2.9226
  • Final Validation Loss: 3.0508

Model Usage

You can use the model for summarization with the Hugging Face pipeline. Below is an example:

from transformers import pipeline

# Load the summarization pipeline using the fine-tuned model
summarizer = pipeline("summarization", model="Prikshit7766/bart-base-xsum")

# Input text for summarization
text = (
    "In a significant breakthrough in renewable energy, scientists have developed "
    "a novel solar panel technology that promises to dramatically reduce costs and "
    "increase efficiency. The new panels are lighter, more durable, and easier to install "
    "than conventional models, marking a major advancement in sustainable energy solutions. "
    "Experts believe this innovation could lead to wider adoption of solar power across residential "
    "and commercial sectors, ultimately reducing global reliance on fossil fuels."
)

# Generate summary
summary = summarizer(text)[0]["summary_text"]
print("Generated Summary:", summary)

Example Output:

Generated Summary: Scientists at the University of California, Berkeley, have developed a new type of solar panel.
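
For more control over decoding (beam width, summary length, and so on), you can also load the model and tokenizer directly; the generation parameters below are illustrative choices, not the settings used during training:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Prikshit7766/bart-base-xsum")
model = AutoModelForSeq2SeqLM.from_pretrained("Prikshit7766/bart-base-xsum")

# `text` is the same input article as in the pipeline example above
inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(
    **inputs,
    num_beams=4,         # illustrative decoding settings
    max_length=64,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))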