Training with Drishtant's new summaries

Files changed (3)

README.md +84 -0
generation_config.json +6 -0
runs/Jan06_16-01-28_8f683b5e421f/events.out.tfevents.1736180877.8f683b5e421f.344.1 +3 -0

README.md ADDED

	@@ -0,0 +1,84 @@

+---
+library_name: transformers
+license: apache-2.0
+base_model: google/mt5-small
+tags:
+- summarization
+- generated_from_trainer
+metrics:
+- rouge
+model-index:
+- name: mt5-small-finetuned-Drishtants-summaries
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# mt5-small-finetuned-Drishtants-summaries
+This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
+It achieves the following results on the evaluation set:
+- Loss: 2.7354
+- Rouge1: 0.0368
+- Rouge2: 0.0186
+- Rougel: 0.0344
+- Rougelsum: 0.0335
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5.6e-05
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- optimizer: adamw_torch with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
+- lr_scheduler_type: linear
+- num_epochs: 20
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
+| 21.5975       | 1.0   | 18   | 11.3745         | 0.0    | 0.0    | 0.0    | 0.0       |
+| 18.236        | 2.0   | 36   | 10.6384         | 0.0102 | 0.0    | 0.0102 | 0.0102    |
+| 15.5742       | 3.0   | 54   | 9.9438          | 0.0201 | 0.0084 | 0.0160 | 0.0160    |
+| 13.9098       | 4.0   | 72   | 8.4129          | 0.0250 | 0.0084 | 0.0209 | 0.0245    |
+| 12.3694       | 5.0   | 90   | 6.1433          | 0.0250 | 0.0084 | 0.0209 | 0.0245    |
+| 10.7687       | 6.0   | 108  | 6.1726          | 0.0102 | 0.0    | 0.0102 | 0.0102    |
+| 9.4084        | 7.0   | 126  | 5.0390          | 0.0102 | 0.0    | 0.0102 | 0.0102    |
+| 8.532         | 8.0   | 144  | 4.2376          | 0.0135 | 0.0    | 0.0139 | 0.0127    |
+| 7.7273        | 9.0   | 162  | 3.8524          | 0.0268 | 0.0    | 0.0259 | 0.0258    |
+| 6.9872        | 10.0  | 180  | 3.6113          | 0.0512 | 0.0070 | 0.0462 | 0.0425    |
+| 6.4007        | 11.0  | 198  | 3.3596          | 0.0489 | 0.0118 | 0.0500 | 0.0496    |
+| 6.021         | 12.0  | 216  | 3.2024          | 0.0441 | 0.0173 | 0.0403 | 0.0392    |
+| 5.6179        | 13.0  | 234  | 3.1161          | 0.0498 | 0.0173 | 0.0466 | 0.0454    |
+| 5.2275        | 14.0  | 252  | 3.0076          | 0.0411 | 0.0217 | 0.0396 | 0.0382    |
+| 4.9888        | 15.0  | 270  | 2.9215          | 0.0449 | 0.0217 | 0.0434 | 0.0433    |
+| 4.7543        | 16.0  | 288  | 2.8484          | 0.0368 | 0.0186 | 0.0344 | 0.0335    |
+| 4.5961        | 17.0  | 306  | 2.7913          | 0.0368 | 0.0186 | 0.0344 | 0.0335    |
+| 4.4748        | 18.0  | 324  | 2.7587          | 0.0368 | 0.0186 | 0.0344 | 0.0335    |
+| 4.4339        | 19.0  | 342  | 2.7405          | 0.0368 | 0.0186 | 0.0344 | 0.0335    |
+| 4.4859        | 20.0  | 360  | 2.7354          | 0.0368 | 0.0186 | 0.0344 | 0.0335    |
+### Framework versions
+- Transformers 4.47.1
+- Pytorch 2.5.1+cu121
+- Datasets 3.2.0
+- Tokenizers 0.21.0
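
The step counts in the results table and the `lr_scheduler_type: linear` setting can be related with a little arithmetic. The sketch below is illustrative only and is not part of the commit. It assumes zero warmup steps, a single device, no gradient accumulation, and that the last partial batch is kept. None of these are stated in the card. The function names `linear_lr` and `train_size_bounds` are hypothetical helpers.

```python
# Values copied from the model card above.
LEARNING_RATE = 5.6e-05
TRAIN_BATCH_SIZE = 8
NUM_EPOCHS = 20
STEPS_PER_EPOCH = 18  # the Step column reads 18 at epoch 1.0

# 18 steps per epoch over 20 epochs matches the final Step value of 360.
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=LEARNING_RATE, total=total_steps, warmup=0):
    """Linear warmup followed by linear decay to zero.

    warmup=0 is an assumption; the card does not list warmup steps.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


def train_size_bounds(steps_per_epoch, batch_size):
    """Range of training-set sizes that yield this many steps per epoch.

    Assumes one optimizer step per batch and that a final partial
    batch still counts as a step.
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


if __name__ == "__main__":
    for step in (0, 180, 360):
        print(f"step {step:3d}: lr = {linear_lr(step):.3e}")
    print("training examples (assumed bounds):",
          train_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE))
```

Under these assumptions the training set would hold between 137 and 144 examples, which may help explain why the ROUGE scores in the table stay low and repeat identical values over the last five epochs.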

generation_config.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "decoder_start_token_id": 0,
+  "eos_token_id": 1,
+  "pad_token_id": 0,
+  "transformers_version": "4.47.1"
+}
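
In this config, `decoder_start_token_id` and `pad_token_id` are both 0, which follows the T5-family convention of starting decoding from the padding token, and `eos_token_id` is 1. The following is a minimal sketch, not part of the commit, of how those three ids might be used to clean a raw sequence of generated ids. The helper `trim_generated` and the sample ids are hypothetical, and real decoding would normally go through the tokenizer instead.

```python
import json

# The JSON added by this commit, copied verbatim.
GENERATION_CONFIG = """
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.47.1"
}
"""

config = json.loads(GENERATION_CONFIG)


def trim_generated(token_ids, cfg):
    """Drop the decoder start token, cut at the first EOS, remove padding.

    Hypothetical helper for illustration; it only uses the three ids
    defined in the config above.
    """
    ids = list(token_ids)
    if ids and ids[0] == cfg["decoder_start_token_id"]:
        ids = ids[1:]
    if cfg["eos_token_id"] in ids:
        ids = ids[: ids.index(cfg["eos_token_id"])]
    return [t for t in ids if t != cfg["pad_token_id"]]


if __name__ == "__main__":
    # Made-up ids: start token, two content tokens, EOS, then padding.
    raw = [0, 259, 1234, 1, 0, 0]
    print(trim_generated(raw, config))
```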

runs/Jan06_16-01-28_8f683b5e421f/events.out.tfevents.1736180877.8f683b5e421f.344.1 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3b41f320b8b3353a08313a0fef0e9b2558488ca11bd99b0de91ddc0d2a0ed80c
+size 562
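
The three added lines are a Git LFS pointer, not the TensorBoard event file itself: they record the spec version, the SHA-256 of the real content, and its size in bytes. As a rough sketch (not part of the commit), such a pointer can be parsed with the standard library alone. The function name `parse_lfs_pointer` is a hypothetical helper.

```python
# Pointer text copied from the diff above.
POINTER_TEXT = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:3b41f320b8b3353a08313a0fef0e9b2558488ca11bd99b0de91ddc0d2a0ed80c\n"
    "size 562\n"
)


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash, and size fields."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    print(pointer["hash_algo"], pointer["size"], "bytes")
```

After downloading the actual file, its byte length and `hashlib.sha256` digest could be compared against `size` and `digest` to confirm the download is intact.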