metadata
library_name: transformers
language:
  - fa
license: apache-2.0
base_model: openai/whisper-base
tags:
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_20_0
metrics:
  - wer
model-index:
  - name: whisper-base-fa - Sadegh Karimi
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 20.0
          type: mozilla-foundation/common_voice_20_0
          args: 'config: fa, split: train, test'
        metrics:
          - name: Wer
            type: wer
            value: 18.008111901053315

whisper-base-fa - Sadegh Karimi

This model is a fine-tuned version of openai/whisper-base on the Persian (fa) configuration of the Common Voice 20.0 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1400
  • Wer: 18.0081
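The card does not say how the Wer value was computed. As a minimal sketch, assuming the common corpus-level definition (total word-level edits divided by total reference words, reported in percent), the metric can be reproduced in plain Python. The function name `word_error_rate` and the example sentences are hypothetical and not taken from the training code or the dataset.

```python
def word_error_rate(references, hypotheses):
    """Corpus-level WER in percent: total word edits / total reference words.

    Assumption: whitespace tokenization and no text normalization; the
    evaluation behind this card may have normalized text differently.
    """
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words, hyp_words = ref.split(), hyp.split()
        # Levenshtein distance over words, keeping only the previous row.
        prev = list(range(len(hyp_words) + 1))
        for i, ref_word in enumerate(ref_words, start=1):
            curr = [i]
            for j, hyp_word in enumerate(hyp_words, start=1):
                substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
                deletion = prev[j] + 1
                insertion = curr[j - 1] + 1
                curr.append(min(substitution, deletion, insertion))
            prev = curr
        total_edits += prev[-1]
        total_words += len(ref_words)
    return 100.0 * total_edits / total_words


# Made-up example: one word of a six-word reference is dropped.
example_wer = word_error_rate(["the cat sat on the mat"], ["the cat sat on mat"])
```

Under this definition a Wer of 18.0081 would mean roughly 18 word-level errors per 100 reference words.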

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

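Common Voice 20.0 (mozilla-foundation/common_voice_20_0), fa configuration; the card metadata lists the splits as train and test. More information needed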

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 40000
  • mixed_precision_training: Native AMP
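To make the schedule concrete, the following sketch computes the learning rate at a given step from the values listed above (learning_rate 1e-05, 500 warmup steps, 40000 training steps). It assumes the usual shape of a linear scheduler with warmup: a linear ramp from zero to the peak rate, then a linear decay back to zero at the final step. The function name `linear_lr` is hypothetical, and the sketch is not the training script itself.

```python
BASE_LR = 1e-05       # learning_rate from the list above
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps
TOTAL_STEPS = 40000   # training_steps


def linear_lr(step):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < WARMUP_STEPS:
        # Ramp up linearly from 0 to BASE_LR over the warmup steps.
        return BASE_LR * step / WARMUP_STEPS
    # Decay linearly from BASE_LR at the end of warmup to 0 at TOTAL_STEPS.
    remaining = (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, remaining)


# Rate at each evaluation point reported in the training results (every 1000 steps).
lr_at_eval_steps = {step: linear_lr(step) for step in range(1000, TOTAL_STEPS + 1, 1000)}
```

Under these assumptions the rate peaks at step 500 and is already decaying at the first reported evaluation (step 1000).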

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Wer     |
|---------------|--------|-------|-----------------|---------|
| 0.524         | 0.0493 | 1000  | 0.5244          | 54.4099 |
| 0.4158        | 0.0986 | 2000  | 0.4063          | 45.3387 |
| 0.3568        | 0.1479 | 3000  | 0.3515          | 39.9380 |
| 0.3243        | 0.1972 | 4000  | 0.3176          | 36.2121 |
| 0.2978        | 0.2465 | 5000  | 0.2894          | 34.1671 |
| 0.2703        | 0.2958 | 6000  | 0.2691          | 32.6126 |
| 0.2591        | 0.3451 | 7000  | 0.2522          | 30.4674 |
| 0.2728        | 0.3944 | 8000  | 0.2388          | 29.0826 |
| 0.2299        | 0.4437 | 9000  | 0.2297          | 27.9737 |
| 0.2368        | 0.4930 | 10000 | 0.2186          | 26.9358 |
| 0.1997        | 0.5423 | 11000 | 0.2116          | 26.3267 |
| 0.2082        | 0.5916 | 12000 | 0.2052          | 25.6820 |
| 0.2131        | 0.6409 | 13000 | 0.2000          | 25.1361 |
| 0.1955        | 0.6902 | 14000 | 0.1966          | 24.4390 |
| 0.1945        | 0.7395 | 15000 | 0.1949          | 24.3110 |
| 0.2332        | 0.7888 | 16000 | 0.1985          | 25.1515 |
| 0.2037        | 0.8381 | 17000 | 0.1915          | 24.7845 |
| 0.2151        | 0.8874 | 18000 | 0.1869          | 24.0242 |
| 0.1982        | 0.9367 | 19000 | 0.1822          | 23.0002 |
| 0.1643        | 0.9860 | 20000 | 0.1776          | 22.7580 |
| 0.1388        | 1.0353 | 21000 | 0.1745          | 22.5051 |
| 0.1521        | 1.0846 | 22000 | 0.1715          | 22.1026 |
| 0.1404        | 1.1339 | 23000 | 0.1694          | 21.8158 |
| 0.1561        | 1.1832 | 24000 | 0.1680          | 21.8574 |
| 0.1349        | 1.2325 | 25000 | 0.1671          | 21.8960 |
| 0.1409        | 1.2818 | 26000 | 0.1728          | 22.0903 |
| 0.1587        | 1.3311 | 27000 | 0.1707          | 22.5329 |
| 0.1415        | 1.3804 | 28000 | 0.1658          | 21.4672 |
| 0.1553        | 1.4297 | 29000 | 0.1616          | 21.4503 |
| 0.1313        | 1.4790 | 30000 | 0.1589          | 20.6576 |
| 0.1358        | 1.5283 | 31000 | 0.1559          | 20.1471 |
| 0.1435        | 1.5776 | 32000 | 0.1521          | 19.7323 |
| 0.1341        | 1.6269 | 33000 | 0.1501          | 19.6027 |
| 0.1376        | 1.6762 | 34000 | 0.1481          | 18.8748 |
| 0.1232        | 1.7255 | 35000 | 0.1462          | 18.8486 |
| 0.1137        | 1.7748 | 36000 | 0.1441          | 18.6250 |
| 0.1149        | 1.8241 | 37000 | 0.1425          | 18.4122 |
| 0.1173        | 1.8734 | 38000 | 0.1415          | 18.2502 |
| 0.1253        | 1.9227 | 39000 | 0.1404          | 17.9233 |
| 0.1136        | 1.9720 | 40000 | 0.1400          | 18.0081 |
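Two things can be read off these results with a little arithmetic: the approximate number of optimizer steps per epoch, and which of the late checkpoints scores best on each metric. The sketch below copies the last five evaluation rows from the results above. The estimate of training-set size assumes a single device and no gradient accumulation, which the card does not state, so treat it as a rough figure only.

```python
# (step, epoch, validation_loss, wer) for the last five evaluations in the results.
rows = [
    (36000, 1.7748, 0.1441, 18.6250),
    (37000, 1.8241, 0.1425, 18.4122),
    (38000, 1.8734, 0.1415, 18.2502),
    (39000, 1.9227, 0.1404, 17.9233),
    (40000, 1.9720, 0.1400, 18.0081),
]
TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameters

# Steps per epoch, derived from the final row's step and epoch values.
final_step, final_epoch, _, _ = rows[-1]
steps_per_epoch = final_step / final_epoch

# Assumption: one device, no gradient accumulation, so one step = 16 examples.
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Best of these checkpoints by each metric (lower is better for both).
best_by_wer = min(rows, key=lambda row: row[3])
best_by_loss = min(rows, key=lambda row: row[2])

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"training examples ~ {approx_train_examples:.0f} (under the stated assumption)")
print(f"lowest Wer at step {best_by_wer[0]}, lowest validation loss at step {best_by_loss[0]}")
```

The final checkpoint (step 40000) has the lowest validation loss, while step 39000 has a slightly lower Wer (17.9233 versus 18.0081), so the reported result is the last checkpoint rather than the best one by Wer.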

Framework versions

  • Transformers 4.48.2
  • PyTorch 2.1.0+cu118
  • Datasets 3.2.0
  • Tokenizers 0.21.0