# BART-Base XSum Summarization Model
## Model Description

This model is a sequence-to-sequence transformer based on the BART architecture. It was fine-tuned from the `facebook/bart-base` checkpoint on the XSum dataset, which consists of news articles paired with short, single-sentence summaries.
## Model Training Details

### Training Dataset

- **Dataset:** XSum
- **Splits:**
  - Train: 204,045 examples (filtered to 203,966 examples)
  - Validation: 11,332 examples (filtered to 11,326 examples)
  - Test: 11,334 examples (filtered to 11,331 examples)
- **Preprocessing** (a sketch of these steps appears after this list):
  - Tokenization of documents and summaries using the `facebook/bart-base` tokenizer.
  - Filtering out examples with very short documents or summaries.
  - Truncating inputs to a maximum length of 1024 tokens for documents and 512 tokens for summaries.
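The exact preprocessing code is not included in this card, but the steps above can be reproduced roughly as sketched below. The dataset identifier, the length thresholds for filtering, and the column names are assumptions, not values taken from the original training script.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Assumed dataset identifier; the XSum dataset on the Hub exposes
# "document", "summary", and "id" columns.
dataset = load_dataset("EdinburghNLP/xsum")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")

MAX_INPUT_LENGTH = 1024   # maximum document length in tokens
MAX_TARGET_LENGTH = 512   # maximum summary length in tokens
MIN_WORDS = 10            # assumed cutoff for "very short" documents/summaries

def keep_example(example):
    # Drop examples whose document or summary is very short
    # (the card does not state the exact thresholds used).
    return (
        len(example["document"].split()) >= MIN_WORDS
        and len(example["summary"].split()) >= 2
    )

def tokenize(batch):
    # Tokenize documents as model inputs and summaries as labels,
    # truncating to the maximum lengths used during fine-tuning.
    model_inputs = tokenizer(
        batch["document"], max_length=MAX_INPUT_LENGTH, truncation=True
    )
    labels = tokenizer(
        text_target=batch["summary"], max_length=MAX_TARGET_LENGTH, truncation=True
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

dataset = dataset.filter(keep_example)
tokenized = dataset.map(
    tokenize, batched=True, remove_columns=["document", "summary", "id"]
)
```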
### Training Configuration

The model was fine-tuned using the `Seq2SeqTrainer` from the Hugging Face Transformers library with the following training arguments (a configuration sketch follows the list):
- **Evaluation Strategy:** evaluation at the end of each epoch
- **Learning Rate:** 3e-5
- **Batch Size:**
  - Training: 16 per device
  - Evaluation: 32 per device
- **Gradient Accumulation Steps:** 1
- **Weight Decay:** 0.01
- **Number of Epochs:** 5
- **Warmup Steps:** 1000
- **Learning Rate Scheduler:** cosine scheduler
- **Label Smoothing Factor:** 0.1
- **Mixed Precision:** FP16 enabled
- **Prediction:** uses `predict_with_generate` to compute summaries during evaluation
- **Metric for Best Model:** `rougeL`
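As a rough illustration, these hyperparameters map onto `Seq2SeqTrainingArguments` as sketched below. The output directory, saving strategy, and `load_best_model_at_end` are assumptions; older Transformers releases name the first strategy argument `evaluation_strategy` instead of `eval_strategy`.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-xsum",      # placeholder output directory
    eval_strategy="epoch",            # evaluate at the end of each epoch
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=1,
    weight_decay=0.01,
    num_train_epochs=5,
    warmup_steps=1000,
    lr_scheduler_type="cosine",
    label_smoothing_factor=0.1,
    fp16=True,
    predict_with_generate=True,       # generate summaries during evaluation
    metric_for_best_model="rougeL",
    load_best_model_at_end=True,      # assumed, so metric_for_best_model takes effect
    save_strategy="epoch",            # assumed to match the evaluation strategy
)
```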
## Model Results

### Evaluation Metrics

After fine-tuning, the model achieved the following scores (a sketch of how such ROUGE scores can be computed follows the results):
- **Validation Set:**
  - Eval Loss: 3.0508
  - ROUGE-1: 39.2079
  - ROUGE-2: 17.8686
  - ROUGE-L: 32.4777
  - ROUGE-Lsum: 32.4734
- **Test Set:**
  - Eval Loss: 3.0607
  - ROUGE-1: 39.2149
  - ROUGE-2: 17.7573
  - ROUGE-L: 32.4190
  - ROUGE-Lsum: 32.4020
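The ROUGE values above are F-measures scaled to 0–100. A minimal sketch of reproducing such scores with the `evaluate` library is shown below; the example predictions and references are placeholders, and the use of stemming is an assumption.

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder model outputs and reference summaries.
predictions = ["Scientists develop cheaper, more efficient solar panels."]
references = ["A new solar panel technology promises lower costs and higher efficiency."]

# evaluate's ROUGE returns F-measures in [0, 1]; multiply by 100
# to match the scale reported in this card.
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({name: round(value * 100, 4) for name, value in scores.items()})
```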
### Final Training Loss

- **Final Training Loss:** 2.9226
- **Final Validation Loss:** 3.0508
## Model Usage

You can use the model for summarization tasks with the Hugging Face `pipeline`. Below is an example:
```python
from transformers import pipeline

# Load the summarization pipeline using the fine-tuned model
summarizer = pipeline("summarization", model="Prikshit7766/bart-base-xsum")

# Input text for summarization
text = (
    "In a significant breakthrough in renewable energy, scientists have developed "
    "a novel solar panel technology that promises to dramatically reduce costs and "
    "increase efficiency. The new panels are lighter, more durable, and easier to install "
    "than conventional models, marking a major advancement in sustainable energy solutions. "
    "Experts believe this innovation could lead to wider adoption of solar power across residential "
    "and commercial sectors, ultimately reducing global reliance on fossil fuels."
)

# Generate summary
summary = summarizer(text)[0]["summary_text"]
print("Generated Summary:", summary)
```
**Example Output:**

```
Generated Summary: Scientists at the University of California, Berkeley, have developed a new type of solar panel.
```
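Generation behavior can also be tuned through keyword arguments that the pipeline forwards to `generate`. The values below are illustrative only and are not the settings used for the reported results.

```python
# Illustrative generation settings; not the configuration used for the reported scores.
summary = summarizer(
    text,
    max_length=60,   # cap the summary length in tokens
    min_length=10,   # avoid overly short outputs
    num_beams=4,     # beam search instead of greedy decoding
)[0]["summary_text"]
print("Generated Summary:", summary)
```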