BART-Base XSum Summarization Model

Model Description

This is a sequence-to-sequence transformer based on the BART architecture, fine-tuned from the facebook/bart-base checkpoint on the XSum dataset, which pairs BBC news articles with single-sentence summaries. The fine-tuned model has roughly 139M parameters.

Model Training Details

Training Dataset

  • Dataset: XSum
  • Splits:
    • Train: 204,045 examples (filtered to 203,966 examples)
    • Validation: 11,332 examples (filtered to 11,326 examples)
    • Test: 11,334 examples (filtered to 11,331 examples)
  • Preprocessing (a code sketch follows this list):
    • Tokenization of documents and summaries using the facebook/bart-base tokenizer.
    • Filtering out examples with very short documents or summaries.
    • Truncating inputs to a maximum length of 1024 tokens for documents and 512 tokens for summaries.
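
The exact preprocessing script is not part of this card, but a minimal sketch of the filtering and tokenization steps, assuming the standard datasets/Transformers APIs and hypothetical word-count thresholds for the length filter, could look like this:

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
dataset = load_dataset("xsum")  # newer datasets versions may require the "EdinburghNLP/xsum" repo id

MIN_DOC_WORDS, MIN_SUMMARY_WORDS = 10, 3  # hypothetical thresholds for "very short" examples

def keep_example(example):
    # Drop examples whose document or summary is very short
    return (len(example["document"].split()) >= MIN_DOC_WORDS
            and len(example["summary"].split()) >= MIN_SUMMARY_WORDS)

def preprocess(batch):
    # Tokenize documents (inputs) and summaries (labels) with truncation
    model_inputs = tokenizer(batch["document"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=512, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

dataset = dataset.filter(keep_example)
tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset["train"].column_names)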

Training Configuration

The model was fine-tuned using the Seq2SeqTrainer from the Hugging Face Transformers library with the following training arguments (reproduced as a code sketch after the list):

  • Evaluation Strategy: Evaluation at the end of each epoch
  • Learning Rate: 3e-5
  • Batch Size:
    • Training: 16 per device
    • Evaluation: 32 per device
  • Gradient Accumulation Steps: 1
  • Weight Decay: 0.01
  • Number of Epochs: 5
  • Warmup Steps: 1000
  • Learning Rate Scheduler: Cosine scheduler
  • Label Smoothing Factor: 0.1
  • Mixed Precision: FP16 enabled
  • Prediction: Uses predict_with_generate to compute summaries during evaluation
  • Metric for Best Model: rougeL
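
Expressed as Seq2SeqTrainingArguments, these settings correspond roughly to the sketch below; the output directory, the load_best_model_at_end flag, and any option not listed above are assumptions rather than the exact training script:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-xsum",          # assumed output directory
    eval_strategy="epoch",                # "evaluation_strategy" in older Transformers versions
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=1,
    weight_decay=0.01,
    num_train_epochs=5,
    warmup_steps=1000,
    lr_scheduler_type="cosine",
    label_smoothing_factor=0.1,
    fp16=True,
    predict_with_generate=True,
    metric_for_best_model="rougeL",
    load_best_model_at_end=True,          # assumed, so the best rougeL checkpoint is kept
)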

Model Results

Evaluation Metrics

After fine-tuning, the model achieved the following scores (ROUGE values are reported on a 0–100 scale; a short metric-computation sketch follows the list):

  • Validation Set:
    • Eval Loss: 3.0508
    • ROUGE-1: 39.2079
    • ROUGE-2: 17.8686
    • ROUGE-L: 32.4777
    • ROUGE-Lsum: 32.4734
  • Test Set:
    • Eval Loss: 3.0607
    • ROUGE-1: 39.2149
    • ROUGE-2: 17.7573
    • ROUGE-L: 32.4190
    • ROUGE-Lsum: 32.4020
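
ROUGE scores like these are typically computed with the evaluate library's rouge metric over generated and reference summaries; a minimal sketch (with toy strings in place of the real model outputs) is:

import evaluate

rouge = evaluate.load("rouge")

# predictions: model-generated summaries; references: gold XSum summaries
scores = rouge.compute(
    predictions=["Scientists have developed a new type of solar panel."],
    references=["Scientists develop a cheaper, more efficient solar panel."],
)
# evaluate returns values in [0, 1]; multiply by 100 to match the scale above
print({k: round(v * 100, 4) for k, v in scores.items()})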

Final Training Loss

  • Final Training Loss: 2.9226
  • Final Validation Loss: 3.0508

Model Usage

You can use the model for summarization with the Hugging Face pipeline. Below is an example:

from transformers import pipeline

# Load the summarization pipeline using the fine-tuned model
summarizer = pipeline("summarization", model="Prikshit7766/bart-base-xsum")

# Input text for summarization
text = (
    "In a significant breakthrough in renewable energy, scientists have developed "
    "a novel solar panel technology that promises to dramatically reduce costs and "
    "increase efficiency. The new panels are lighter, more durable, and easier to install "
    "than conventional models, marking a major advancement in sustainable energy solutions. "
    "Experts believe this innovation could lead to wider adoption of solar power across residential "
    "and commercial sectors, ultimately reducing global reliance on fossil fuels."
)

# Generate summary
summary = summarizer(text)[0]["summary_text"]
print("Generated Summary:", summary)

Example Output:

Generated Summary: Scientists at the University of California, Berkeley, have developed a new type of solar panel.
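
For more control over decoding (beam width, summary length, and so on), you can also load the model and tokenizer directly; the generation parameters below are illustrative choices, not the settings used during training:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Prikshit7766/bart-base-xsum")
model = AutoModelForSeq2SeqLM.from_pretrained("Prikshit7766/bart-base-xsum")

# `text` is the same input article as in the pipeline example above
inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(
    **inputs,
    num_beams=4,         # illustrative decoding settings
    max_length=64,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))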