whisper-base-fa - Sadegh Karimi

This model is a fine-tuned version of SadeghK/whisper-base on the Common Voice 20.0 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0813
  • WER: 10.3712
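WER here is the word error rate, reported in percent. A minimal sketch of how such a score is typically computed for Whisper fine-tunes, using the 🤗 `evaluate` library; the transcript lists below are placeholders, not data from this card:

```python
# Sketch: computing word error rate (WER) with the `evaluate` library.
# The prediction/reference lists are placeholders.
import evaluate

wer_metric = evaluate.load("wer")
predictions = ["transcript produced by the model"]
references = ["ground-truth transcript"]

# The card reports WER as a percentage, hence the factor of 100.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```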

Model description

This is a Whisper base checkpoint (about 72.6M parameters) fine-tuned for Persian (Farsi) automatic speech recognition on Common Voice 20.0. No further description has been provided.
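A minimal transcription sketch using the 🤗 Transformers ASR pipeline. The hub id below (SadeghK/whisper-base, as implied by the base-model reference above) and the audio path are assumptions; adjust both to your setup:

```python
# Minimal usage sketch with the Transformers ASR pipeline.
# Hub id and audio path are assumptions -- adjust to your setup.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="SadeghK/whisper-base",
)

# Pin the language and task so Whisper transcribes Persian instead of
# auto-detecting the language or translating to English.
result = asr(
    "sample.wav",
    generate_kwargs={"language": "persian", "task": "transcribe"},
)
print(result["text"])
```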

Intended uses & limitations

The model is intended for Persian speech-to-text transcription. Its limitations have not been documented.

Training and evaluation data

The model was fine-tuned and evaluated on the Common Voice 20.0 dataset (Persian); details on splits and preprocessing have not been documented.

To run the model faster with whisper.cpp, convert it to ggml format with the convert-to-ggml.ipynb notebook. A converted copy is already provided as "ggml-base-fa.bin".
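For example, a hedged whisper.cpp invocation with the converted model; the paths are assumptions, the input should be a 16 kHz mono WAV, and the binary is named `main` in older whisper.cpp builds and `whisper-cli` in newer ones:

```bash
# Transcribe Persian audio with the converted ggml model (paths assumed).
./main -m ggml-base-fa.bin -l fa -f sample.wav
```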

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 50000
  • mixed_precision_training: Native AMP
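A minimal sketch of how these settings map onto transformers.Seq2SeqTrainingArguments; the output directory is a placeholder, and the model/data wiring is omitted:

```python
# Sketch: the hyperparameters above expressed as Seq2SeqTrainingArguments.
# output_dir is a placeholder; dataset and model setup are omitted.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-base-fa",   # placeholder path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",              # AdamW; the betas/epsilon above are its defaults
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=50000,
    fp16=True,                        # "Native AMP" mixed precision
)
```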

Training results

| Training Loss | Epoch  | Step  | Validation Loss | WER     |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 0.1234        | 0.0493 | 1000  | 0.1698          | 21.8312 |
| 0.1303        | 0.0986 | 2000  | 0.1663          | 22.9153 |
| 0.1241        | 0.1479 | 3000  | 0.1623          | 20.8843 |
| 0.1223        | 0.1972 | 4000  | 0.1616          | 20.7470 |
| 0.1281        | 0.2465 | 5000  | 0.1522          | 19.3606 |
| 0.1111        | 0.2958 | 6000  | 0.1483          | 20.0901 |
| 0.1097        | 0.3451 | 7000  | 0.1452          | 19.0445 |
| 0.1439        | 0.3944 | 8000  | 0.1367          | 18.0251 |
| 0.1053        | 0.4437 | 9000  | 0.1347          | 17.5902 |
| 0.1248        | 0.4930 | 10000 | 0.1281          | 16.9486 |
| 0.1081        | 0.5423 | 11000 | 0.1252          | 15.9200 |
| 0.1062        | 0.5916 | 12000 | 0.1222          | 15.8167 |
| 0.1139        | 0.6409 | 13000 | 0.1181          | 15.6038 |
| 0.1011        | 0.6902 | 14000 | 0.1145          | 15.0918 |
| 0.098         | 0.7395 | 15000 | 0.1141          | 15.0194 |
| 0.1176        | 0.7888 | 16000 | 0.1091          | 14.1048 |
| 0.0933        | 0.8381 | 17000 | 0.1067          | 13.9028 |
| 0.0981        | 0.8874 | 18000 | 0.1042          | 13.6391 |
| 0.0909        | 0.9367 | 19000 | 0.1012          | 13.2119 |
| 0.0714        | 0.9860 | 20000 | 0.1001          | 13.1826 |
| 0.0491        | 1.0353 | 21000 | 0.0985          | 12.9251 |
| 0.059         | 1.0846 | 22000 | 0.0966          | 12.6799 |
| 0.0492        | 1.1339 | 23000 | 0.0959          | 12.4501 |
| 0.0625        | 1.1832 | 24000 | 0.0943          | 12.5241 |
| 0.0429        | 1.2325 | 25000 | 0.0946          | 12.4424 |
| 0.0403        | 1.2818 | 26000 | 0.0931          | 12.1370 |
| 0.0474        | 1.3311 | 27000 | 0.0921          | 11.7330 |
| 0.0484        | 1.3804 | 28000 | 0.0910          | 11.5710 |
| 0.0585        | 1.4297 | 29000 | 0.0896          | 11.7067 |
| 0.0431        | 1.4790 | 30000 | 0.0890          | 11.3875 |
| 0.045         | 1.5283 | 31000 | 0.0875          | 11.2842 |
| 0.0494        | 1.5776 | 32000 | 0.0862          | 11.5433 |
| 0.0448        | 1.6269 | 33000 | 0.0854          | 11.0282 |
| 0.0508        | 1.6762 | 34000 | 0.0849          | 11.0498 |
| 0.0432        | 1.7255 | 35000 | 0.0837          | 10.7583 |
| 0.0356        | 1.7748 | 36000 | 0.0826          | 10.8339 |
| 0.0353        | 1.8241 | 37000 | 0.0819          | 10.5300 |
| 0.043         | 1.8734 | 38000 | 0.0815          | 10.4838 |
| 0.0434        | 1.9227 | 39000 | 0.0812          | 10.5038 |
| 0.0382        | 1.9720 | 40000 | 0.0809          | 10.4684 |
| 0.0342        | 2.0213 | 41000 | 0.0833          | 10.4853 |
| 0.0249        | 2.0706 | 42000 | 0.0841          | 10.7783 |
| 0.0237        | 2.1199 | 43000 | 0.0835          | 10.5100 |
| 0.0282        | 2.1692 | 44000 | 0.0835          | 10.5563 |
| 0.0277        | 2.2185 | 45000 | 0.0830          | 10.7151 |
| 0.0328        | 2.2678 | 46000 | 0.0824          | 10.3959 |
| 0.0268        | 2.3171 | 47000 | 0.0822          | 10.4560 |
| 0.0395        | 2.3664 | 48000 | 0.0817          | 10.3311 |
| 0.0298        | 2.4157 | 49000 | 0.0815          | 10.4128 |
| 0.029         | 2.4650 | 50000 | 0.0813          | 10.3712 |
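Reading off the table, 1000 steps correspond to 0.0493 epochs, so one epoch is roughly 1000 / 0.0493 ≈ 20,300 steps; with a train batch size of 16, that implies on the order of 20,300 × 16 ≈ 325,000 training examples, and the full 50,000 steps cover about 2.47 epochs (an inference from the numbers above, not a documented figure).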

Framework versions

  • Transformers 4.48.2
  • PyTorch 2.1.0+cu118
  • Datasets 3.2.0
  • Tokenizers 0.21.0