Model Description

  • Model Name: fewshot-xsum-bart
  • Base Model: facebook/bart-large
  • Task: Summarization (Few-Shot Learning)

Dataset: XSUM (Extreme Summarization Dataset)

  • Few-Shot Setup: Trained on 100 samples from the XSUM training set and validated on 50 samples from the XSUM validation set.
  • This model is a few-shot learning variant of the BART-large model, fine-tuned on a very small subset of the XSUM dataset.
  • The purpose of this model is to demonstrate the effectiveness of few-shot learning in summarization tasks where only a limited amount of labeled data is available.

Purpose

The goal of this model is to explore how well a large pre-trained language model like BART can perform on abstractive summarization when fine-tuned with very limited data (few-shot learning). By training on only 100 samples and validating on 50 samples, this model serves as a proof of concept for few-shot summarization tasks.

  • Training Set: 100 samples (randomly selected from the XSUM training set).
  • Validation Set: 50 samples (randomly selected from the XSUM validation set).

The small dataset size is intentional, as the focus is on few-shot learning rather than large-scale training.

  • Base Model: facebook/bart-large (pre-trained on large corpora).
  • Fine-Tuning:
    • Epochs: 3
    • Batch Size: 8
    • Learning Rate: 5e-5
    • Max Input Length: 512 tokens
    • Max Output Length: 64 tokens
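
The setup above can be reproduced with the Hugging Face Trainer API. The snippet below is a minimal sketch, not the exact training script used for this card: the sampling seed, preprocessing, and output directory are assumptions.

from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

# Load the base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")

# Few-shot subsets: 100 training and 50 validation samples from XSUM
# (the sampling seed used for this card is not published; 42 is an assumption)
xsum = load_dataset("xsum")
train_ds = xsum["train"].shuffle(seed=42).select(range(100))
val_ds = xsum["validation"].shuffle(seed=42).select(range(50))

def preprocess(batch):
    # XSUM stores articles in "document" and reference summaries in "summary"
    inputs = tokenizer(batch["document"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=64, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

train_ds = train_ds.map(preprocess, batched=True, remove_columns=train_ds.column_names)
val_ds = val_ds.map(preprocess, batched=True, remove_columns=val_ds.column_names)

# Hyperparameters from the list above
args = Seq2SeqTrainingArguments(
    output_dir="fewshot-xsum-bart",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
    evaluation_strategy="epoch",
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()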

Performance

Due to the few-shot nature of this model, its performance is not directly comparable to models trained on the full XSUM dataset. However, it demonstrates the potential of few-shot learning for summarization tasks. Key metrics on the validation set (50 samples) include:

Few-shot learning model

  • ROUGE Scores:
    • ROUGE-1: 0.3498
    • ROUGE-2: 0.1308
    • ROUGE-L: 0.2745
  • BLEU Score: 6.18

Zero-shot/Baseline model

  • ROUGE Scores:
    • ROUGE-1: 0.1560
    • ROUGE-2: 0.0174
    • ROUGE-L: 0.1204
  • BLEU Score: 0.62
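
These metrics can be recomputed with the Hugging Face evaluate library. The snippet below is a minimal sketch under assumptions: the sampling seed, generation settings, and the use of sacreBLEU (which reports on a 0-100 scale, matching the numbers above) were not published with this card.

import evaluate
from datasets import load_dataset
from transformers import pipeline

# Generate summaries for the 50-sample validation subset
# (the sampling seed is not published; 42 is an assumption)
summarizer = pipeline("summarization", model="bhargavis/fewshot-xsum-bart")
val_ds = load_dataset("xsum", split="validation").shuffle(seed=42).select(range(50))

predictions = [
    out["summary_text"]
    for out in summarizer(val_ds["document"], max_length=64, min_length=30, truncation=True)
]
references = val_ds["summary"]

# ROUGE-1/2/L and sacreBLEU against the reference summaries
rouge = evaluate.load("rouge")
bleu = evaluate.load("sacrebleu")
print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=[[r] for r in references])["score"])

Swapping the model id for "facebook/bart-large" gives the zero-shot baseline comparison.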

Usage

Use this model for few-shot abstractive summarization tasks. Below is an example of how to load and use the model:

from transformers import pipeline

# Load the few-shot model
summarizer = pipeline("summarization", model="bhargavis/fewshot-xsum-bart")

# Provide input text
input_text = """
Authorities have issued a warning after multiple sightings of a large brown bear in the woods. The bear is known to become aggressive if disturbed, and residents are urged to exercise caution. Last week, a group of hikers reported a close encounter with the animal. While no injuries were sustained, the bear displayed defensive behavior when approached. Wildlife officials advise keeping a safe distance and avoiding the area if possible. Those encountering the bear should remain calm, back away slowly, and refrain from making sudden movements. Officials continue to monitor the situation.
"""

# Generate summary
summary = summarizer(input_text, max_length=64, min_length=30, do_sample=False)
print(summary[0]["summary_text"])

Limitations

  • The model is trained on a very small dataset, so its performance may not generalize well to all types of text.
  • This model was built to compare few-shot performance against zero-shot and full-shot learning models.
  • It is best suited for tasks where only limited labeled data is available.
  • The model is fine-tuned on BBC articles from the XSUM dataset. Its performance may vary on text from other domains.
  • The model may overfit to the training data due to the small dataset size.

Full-shot learning model: For a more general-purpose summarization model, check out the full model trained on the entire XSUM dataset: [WIP].

Citation

If you use this model in your research, please cite it as follows:

@misc{fewshot-xsum-bart,
  author = {Bhargavi Sriram},
  title = {Few-Shot Abstractive Summarization with BART-Large},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/bhargavis/fewshot-xsum-bart}},
}