Update README.md
README.md
CHANGED
@@ -111,6 +111,32 @@ The model has been fine-tuned on a mix of authentic human and synthetic speech a
|
|
111 |
- num_train_epochs: 3 (975 training steps)
|
112 |
- warmup_steps: 0
|
113 |
|
114 |
## License
|
115 |
|
116 |
The fine-tuned model is licensed under the same Apache-2.0 license agreement as the original `openai/whisper-small` checkpoint.
|
|
|
111 |
- num_train_epochs: 3 (975 training steps)
|
112 |
- warmup_steps: 0
|
113 |
|
114 |
+
The table below shows the training loss, the validation loss, and the BLEU score on the development set during fine-tuning. The validation loss reached its minimum (1.199629) and BLEU its maximum (24.9469) at step 900, late in the third epoch. Both metrics were slightly worse at steps 1000 and 1100, which suggests the model had started to overfit the dataset.
|
115 |
+
| Step | Training loss | Validation loss | BLEU |
|
116 |
+
| :---: | :---: | :---: | :---: |
|
117 |
+
| 100 | 2.491100 | 2.007935 | 21.813000 |
|
118 |
+
| 200 | 1.600800 | 1.383696 | 23.344800 |
|
119 |
+
| 300 | 1.430900 | 1.309672 | 23.846300 |
|
120 |
+
| 400 | 1.320600 | 1.268230 | 23.911000 |
|
121 |
+
| 500 | 1.289200 | 1.248684 | 24.192300 |
|
122 |
+
| 600 | 1.243800 | 1.239911 | 24.385900 |
|
123 |
+
| 700 | 1.194200 | 1.207502 | 23.941100 |
|
124 |
+
| 800 | 1.170800 | 1.211733 | 24.888100 |
|
125 |
+
| 900 | 1.143800 | 1.199629 | 24.946900 |
|
126 |
+
| 1000 | 1.153400 | 1.206929 | 24.919100 |
|
127 |
+
| 1100 | 1.119200 | 1.201825 | 24.597300 |
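
The checkpoint choice can be reproduced from the logged values. The following is a minimal sketch, not necessarily the selection procedure used for this model: the `history` list copies the rows of the table, and `best_checkpoint` is a hypothetical helper that picks the step with the lowest validation loss.

```python
# Rows copied from the table: (step, training loss, validation loss, BLEU).
history = [
    (100, 2.4911, 2.007935, 21.8130),
    (200, 1.6008, 1.383696, 23.3448),
    (300, 1.4309, 1.309672, 23.8463),
    (400, 1.3206, 1.268230, 23.9110),
    (500, 1.2892, 1.248684, 24.1923),
    (600, 1.2438, 1.239911, 24.3859),
    (700, 1.1942, 1.207502, 23.9411),
    (800, 1.1708, 1.211733, 24.8881),
    (900, 1.1438, 1.199629, 24.9469),
    (1000, 1.1534, 1.206929, 24.9191),
    (1100, 1.1192, 1.201825, 24.5973),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda row: row[2])


best_by_loss = best_checkpoint(history)
best_by_bleu = max(history, key=lambda row: row[3])

print(f"lowest validation loss at step {best_by_loss[0]}")
print(f"highest BLEU at step {best_by_bleu[0]}")
```

Both criteria point to the same step here, so the choice does not depend on which of the two is used.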
|
128 |
+
|
129 |
+
## Evaluation
|
130 |
+
|
131 |
+
Both the original and the fine-tuned checkpoints were evaluated on the test split of the dataset. The evaluation metrics are BLEU and ChrF++, as implemented in the `sacrebleu` library.
|
132 |
+
|
133 |
+
| Model | BLEU | ChrF++ |
|
134 |
+
| :---: | :---: | :---: |
|
135 |
+
| `whisper-small` | 16.36 | 43.81 |
|
136 |
+
| `checkpoint-900` | 22.34 | 48.1 |
|
137 |
+
|
138 |
+
Fine-tuning improved on the baseline by 5.98 BLEU points (16.36 to 22.34) and 4.29 ChrF++ points (43.81 to 48.1).
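
The gain over the baseline can be checked directly from the evaluation table. This is a small illustrative sketch: the `scores` dictionary copies the table values, and `improvement` is a hypothetical helper that returns the absolute gain in points and the relative gain in percent.

```python
# Test-set scores copied from the evaluation table.
scores = {
    "whisper-small": {"BLEU": 16.36, "ChrF++": 43.81},
    "checkpoint-900": {"BLEU": 22.34, "ChrF++": 48.1},
}


def improvement(metric, baseline="whisper-small", tuned="checkpoint-900"):
    """Return (absolute gain in points, relative gain in percent)."""
    base = scores[baseline][metric]
    new = scores[tuned][metric]
    absolute = round(new - base, 2)
    relative = round(100 * (new - base) / base, 1)
    return absolute, relative


for metric in ("BLEU", "ChrF++"):
    absolute, relative = improvement(metric)
    print(f"{metric}: +{absolute} points ({relative}% relative)")
```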
|
139 |
+
|
140 |
## License
|
141 |
|
142 |
The fine-tuned model is licensed under the same Apache-2.0 license agreement as the original `openai/whisper-small` checkpoint.
|