---
datasets:
- EdinburghNLP/xsum
language:
- en
metrics:
- rouge
base_model:
- facebook/bart-base
pipeline_tag: summarization
library_name: transformers
---
# BART-Base XSum Summarization Model

## Model Description

This model is a sequence-to-sequence transformer based on the BART architecture. It was created by fine-tuning `facebook/bart-base` on the [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum) dataset, which consists of BBC news articles paired with single-sentence summaries.

## Model Training Details

### Training Dataset

- **Dataset:** [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum)
- **Splits:**
  - **Train:** 204,045 examples (203,966 after filtering)
  - **Validation:** 11,332 examples (11,326 after filtering)
  - **Test:** 11,334 examples (11,331 after filtering)
- **Preprocessing:**
  - Documents and summaries are tokenized with the `facebook/bart-base` tokenizer.
  - Examples with very short documents or summaries are filtered out.
  - Documents are truncated to a maximum of 1024 tokens and summaries to a maximum of 512 tokens.

A minimal sketch of this preprocessing is shown below.
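
Since the exact filtering thresholds are not documented, the `MIN_DOC_TOKENS` and `MIN_SUMMARY_TOKENS` values in this sketch are illustrative assumptions; the tokenizer, column names, and truncation lengths follow the details above.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")

# Assumed thresholds: the actual "very short" cutoffs are not documented.
MIN_DOC_TOKENS = 10
MIN_SUMMARY_TOKENS = 3

def keep_example(example):
    # Drop examples whose document or summary is very short.
    doc_len = len(tokenizer(example["document"]).input_ids)
    sum_len = len(tokenizer(example["summary"]).input_ids)
    return doc_len >= MIN_DOC_TOKENS and sum_len >= MIN_SUMMARY_TOKENS

def preprocess(batch):
    # Truncate documents to 1024 tokens and summaries to 512 tokens.
    model_inputs = tokenizer(batch["document"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=512, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

dataset = load_dataset("EdinburghNLP/xsum")
dataset = dataset.filter(keep_example)
tokenized = dataset.map(
    preprocess, batched=True, remove_columns=["document", "summary", "id"]
)
```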

### Training Configuration

The model was fine-tuned with `Seq2SeqTrainer` from the Hugging Face Transformers library using the following training arguments:

- **Evaluation Strategy:** at the end of each epoch
- **Learning Rate:** 3e-5
- **Batch Size:**
  - **Training:** 16 per device
  - **Evaluation:** 32 per device
- **Gradient Accumulation Steps:** 1
- **Weight Decay:** 0.01
- **Number of Epochs:** 5
- **Warmup Steps:** 1000
- **Learning Rate Scheduler:** cosine
- **Label Smoothing Factor:** 0.1
- **Mixed Precision:** FP16 enabled
- **Prediction:** `predict_with_generate` enabled, so summaries are generated during evaluation
- **Metric for Best Model:** `rougeL`

The sketch after this list shows how these settings map onto `Seq2SeqTrainingArguments`.
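
This mapping is a reconstruction, not the author's training script: `output_dir`, the save strategy, and `load_best_model_at_end` are assumptions (the latter two are implied by selecting a best model by `rougeL`).

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-xsum",      # assumption: not documented
    eval_strategy="epoch",            # `evaluation_strategy` on older transformers versions
    save_strategy="epoch",            # assumption: needed to restore the best checkpoint
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=1,
    weight_decay=0.01,
    num_train_epochs=5,
    warmup_steps=1000,
    lr_scheduler_type="cosine",
    label_smoothing_factor=0.1,
    fp16=True,
    predict_with_generate=True,
    metric_for_best_model="rougeL",
    load_best_model_at_end=True,      # assumption: implied by metric_for_best_model
)
```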

## Model Results

### Evaluation Metrics

After fine-tuning, the model achieved the following scores:

| Metric     | Validation | Test    |
|------------|------------|---------|
| Eval Loss  | 3.0508     | 3.0607  |
| ROUGE-1    | 39.2079    | 39.2149 |
| ROUGE-2    | 17.8686    | 17.7573 |
| ROUGE-L    | 32.4777    | 32.4190 |
| ROUGE-Lsum | 32.4734    | 32.4020 |
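
The scale of these ROUGE scores is consistent with the `evaluate` library's `rouge` metric multiplied by 100. A `compute_metrics` function along those lines might look like this sketch (an assumption, not the author's exact code):

```python
import evaluate
import numpy as np
from transformers import AutoTokenizer

rouge = evaluate.load("rouge")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")

def compute_metrics(eval_pred):
    # With predict_with_generate=True, predictions are generated token IDs.
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace the -100 padding used for loss masking before decoding references.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    scores = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    return {k: round(v * 100, 4) for k, v in scores.items()}
```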

### Final Losses

- **Training Loss:** 2.9226
- **Validation Loss:** 3.0508

## Model Usage

You can use the model for summarization through the Hugging Face `pipeline`:

```python
from transformers import pipeline

# Load the summarization pipeline with the fine-tuned model
summarizer = pipeline("summarization", model="Prikshit7766/bart-base-xsum")

# Input text to summarize
text = (
    "In a significant breakthrough in renewable energy, scientists have developed "
    "a novel solar panel technology that promises to dramatically reduce costs and "
    "increase efficiency. The new panels are lighter, more durable, and easier to install "
    "than conventional models, marking a major advancement in sustainable energy solutions. "
    "Experts believe this innovation could lead to wider adoption of solar power across residential "
    "and commercial sectors, ultimately reducing global reliance on fossil fuels."
)

# Generate the summary
summary = summarizer(text)[0]["summary_text"]
print("Generated Summary:", summary)
```

**Example Output:**

```
Generated Summary: Scientists at the University of California, Berkeley, have developed a new type of solar panel.
```
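
The pipeline call above relies on the generation defaults saved with the model. If you want explicit control over decoding, you can pass generation arguments directly; reusing `summarizer` and `text` from above, the values below are illustrative, not the settings used in training or evaluation.

```python
# Illustrative decoding settings; tune them for your use case.
summary = summarizer(
    text,
    max_length=64,    # XSum-style summaries are short, single sentences
    min_length=10,
    num_beams=4,
    truncation=True,  # truncate inputs beyond the model's 1024-token limit
)[0]["summary_text"]
print("Generated Summary:", summary)
```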