metadata

library_name: transformers
language:
  - fa
license: apache-2.0
base_model: openai/whisper-base
tags:
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_20_0
metrics:
  - wer
model-index:
  - name: whisper-base-fa - Sadegh Karimi
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 20.0
          type: mozilla-foundation/common_voice_20_0
          args: 'config: fa, split: train, test'
        metrics:
          - name: Wer
            type: wer
            value: 18.008111901053315

whisper-base-fa - Sadegh Karimi

This model is a fine-tuned version of openai/whisper-base for Persian (fa) automatic speech recognition, trained on the Common Voice 20.0 dataset. It achieves the following results on the evaluation set at the final checkpoint (step 40000):

Loss: 0.1400
WER: 18.0081 (percent)
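
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is usually computed: the number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. It assumes the standard definition and that the figure above is expressed in percent. The function name word_error_rate and the sample strings are illustrative only and are not taken from the training script, which may also apply text normalization first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: 100 * (S + D + I) / N over whitespace-split words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words (classic Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative strings only: one substitution and one deletion
    # against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

A corpus-level score is normally obtained by summing the edit counts and reference lengths over all utterances before dividing, rather than by averaging per-utterance rates.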

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

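Per the model-index metadata, training and evaluation used the fa (Persian) configuration of mozilla-foundation/common_voice_20_0, with the train and test splits. More information needed on preprocessing and filtering.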

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 40000
mixed_precision_training: Native AMP
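
As a rough illustration of what the learning-rate settings above imply, the sketch below reproduces a linear schedule with warmup in plain Python: the rate rises linearly from 0 to 1e-05 over the first 500 steps, then decays linearly to 0 at step 40000. It assumes the usual behaviour of the linear scheduler in Transformers. The function name lr_at and the constants are illustrative, not part of the original training code.

```python
BASE_LR = 1e-05        # learning_rate
WARMUP_STEPS = 500     # lr_scheduler_warmup_steps
TOTAL_STEPS = 40_000   # training_steps


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Warmup phase: scale linearly from 0 up to the base rate.
        return BASE_LR * step / WARMUP_STEPS
    # Decay phase: scale linearly from the base rate down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 250, 500, 20_000, 40_000):
        print(step, lr_at(step))
```

Under this schedule the rate is already close to zero during the last few thousand steps, which is consistent with the small changes in validation loss near the end of the results table.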

Training results

Training Loss	Epoch	Step	Validation Loss	WER (%)
0.524	0.0493	1000	0.5244	54.4099
0.4158	0.0986	2000	0.4063	45.3387
0.3568	0.1479	3000	0.3515	39.9380
0.3243	0.1972	4000	0.3176	36.2121
0.2978	0.2465	5000	0.2894	34.1671
0.2703	0.2958	6000	0.2691	32.6126
0.2591	0.3451	7000	0.2522	30.4674
0.2728	0.3944	8000	0.2388	29.0826
0.2299	0.4437	9000	0.2297	27.9737
0.2368	0.4930	10000	0.2186	26.9358
0.1997	0.5423	11000	0.2116	26.3267
0.2082	0.5916	12000	0.2052	25.6820
0.2131	0.6409	13000	0.2000	25.1361
0.1955	0.6902	14000	0.1966	24.4390
0.1945	0.7395	15000	0.1949	24.3110
0.2332	0.7888	16000	0.1985	25.1515
0.2037	0.8381	17000	0.1915	24.7845
0.2151	0.8874	18000	0.1869	24.0242
0.1982	0.9367	19000	0.1822	23.0002
0.1643	0.9860	20000	0.1776	22.7580
0.1388	1.0353	21000	0.1745	22.5051
0.1521	1.0846	22000	0.1715	22.1026
0.1404	1.1339	23000	0.1694	21.8158
0.1561	1.1832	24000	0.1680	21.8574
0.1349	1.2325	25000	0.1671	21.8960
0.1409	1.2818	26000	0.1728	22.0903
0.1587	1.3311	27000	0.1707	22.5329
0.1415	1.3804	28000	0.1658	21.4672
0.1553	1.4297	29000	0.1616	21.4503
0.1313	1.4790	30000	0.1589	20.6576
0.1358	1.5283	31000	0.1559	20.1471
0.1435	1.5776	32000	0.1521	19.7323
0.1341	1.6269	33000	0.1501	19.6027
0.1376	1.6762	34000	0.1481	18.8748
0.1232	1.7255	35000	0.1462	18.8486
0.1137	1.7748	36000	0.1441	18.6250
0.1149	1.8241	37000	0.1425	18.4122
0.1173	1.8734	38000	0.1415	18.2502
0.1253	1.9227	39000	0.1404	17.9233
0.1136	1.9720	40000	0.1400	18.0081
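
Two things can be read off the table with a little arithmetic, sketched below: the approximate number of optimizer steps per epoch, and which of the logged checkpoints has the lowest WER. The training-set size estimate assumes a single device and no gradient accumulation, since the card lists neither; treat it as an approximation. Only the last five rows are copied here, and every earlier row has a WER above 18.6.

```python
FINAL_STEP = 40_000      # last row of the table
FINAL_EPOCH = 1.9720     # epoch value logged at that step
TRAIN_BATCH_SIZE = 16    # train_batch_size from the hyperparameters

# (step, validation_loss, wer) for the last five evaluations in the table.
LAST_ROWS = [
    (36000, 0.1441, 18.6250),
    (37000, 0.1425, 18.4122),
    (38000, 0.1415, 18.2502),
    (39000, 0.1404, 17.9233),
    (40000, 0.1400, 18.0081),
]

# Steps needed to complete one pass over the training data.
steps_per_epoch = FINAL_STEP / FINAL_EPOCH

# Assumption: one device, no gradient accumulation, so each step sees 16 examples.
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Checkpoint with the lowest WER among the copied rows.
best_step, best_loss, best_wer = min(LAST_ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch:.0f}")
    print(f"approx. training examples: {approx_train_examples:.0f}")
    print(f"lowest WER: {best_wer} at step {best_step}")
```

The lowest WER in the table (17.9233 at step 39000) is slightly better than the final-step figure reported at the top of the card, while the validation loss is lowest at the final step.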

Framework versions

Transformers 4.48.2
PyTorch 2.1.0+cu118
Datasets 3.2.0
Tokenizers 0.21.0