ctsoukala committed on
Commit aa6de74 · 1 Parent(s): a092095

Update README.md

Files changed (1): README.md (+31 −1)
README.md CHANGED
To train a Pomak ASR model, we fine-tuned a Slavic model (classla/wav2vec2-large-slavic-parlaspeech-hr).

## Recordings

Four native Pomak speakers (two female and two male) agreed to read Pomak texts at the ILSP audio-visual studio in Xanthi, Greece, resulting in a total of 14h 49m of recordings.
| Speaker | Gender | Total recorded hours |
|---|---|---|
| NK9dIF | F | 4h 44m 45s |
| xoVY9q | M | 4h 36m 12s |
| 9G75fk | F | 1h 44m 03s |
| n5WzHj | M | 3h 44m 04s |

To fine-tune the model, we split the long recordings into smaller segments of a maximum of 25 seconds each. This removed the majority of pauses and resulted in a total dataset duration of 11h 8m.
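The README does not name the segmentation tool, but the grouping it describes can be sketched as follows, assuming pause-separated utterance timestamps are already available (the interval format and function name are illustrative):

```python
# Hypothetical sketch: group pause-separated utterances into segments of
# at most MAX_SEG seconds of speech, discarding the silence between them.
# (The actual segmentation tool is not stated in the README.)

MAX_SEG = 25.0  # maximum segment duration in seconds

def chunk_utterances(intervals, max_seg=MAX_SEG):
    """intervals: ordered list of (start, end) times of voiced regions.
    Returns a list of segments, each a list of utterances whose summed
    speech duration stays within max_seg."""
    segments, current, used = [], [], 0.0
    for start, end in intervals:
        dur = end - start
        if current and used + dur > max_seg:
            segments.append(current)
            current, used = [], 0.0
        current.append((start, end))
        used += dur
    if current:
        segments.append(current)
    return segments

# Example: three utterances separated by long pauses
utts = [(0.0, 12.0), (40.0, 55.0), (90.0, 100.0)]
print(chunk_utterances(utts))
# → [[(0.0, 12.0)], [(40.0, 55.0), (90.0, 100.0)]]
```

Because only the voiced intervals are kept, the inter-utterance pauses drop out of the total, which is how the dataset shrinks from ~14h 49m recorded to 11h 8m of usable audio.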

## Metrics

The test set consists of 10% of the dataset recordings.
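How the 10% split was drawn is not specified; a seeded random split over segment files is one plausible sketch (file names and seed here are purely illustrative):

```python
# Illustrative 90/10 train/test split over audio segments.
# The actual splitting procedure is not described in the README.
import random

def train_test_split(items, test_fraction=0.1, seed=42):
    items = list(items)
    rng = random.Random(seed)   # fixed seed for a reproducible split
    rng.shuffle(items)
    n_test = max(1, round(len(items) * test_fraction))
    return items[n_test:], items[:n_test]

segments = [f"seg_{i:03d}.wav" for i in range(100)]  # hypothetical file names
train, test = train_test_split(segments)
print(len(train), len(test))  # → 90 10
```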

| Model | WER | CER |
|---|---|---|
| pre-trained | 87.31% | 31.47% |
| fine-tuned | 9.06% | 3.12% |
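WER and CER are standard edit-distance metrics; for intuition, here is a minimal sketch of how they are typically computed (the evaluation toolkit actually used is not stated in this README):

```python
# Illustrative WER/CER computation via Levenshtein edit distance.

def edit_distance(ref, hyp):
    """Dynamic-programming Levenshtein distance between two sequences."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,          # deletion
                                   d[j - 1] + 1,      # insertion
                                   prev + (r != h))   # substitution / match
    return d[len(hyp)]

def wer(ref, hyp):
    """Word error rate: word-level edits over reference word count."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)

def cer(ref, hyp):
    """Character error rate: character-level edits over reference length."""
    return edit_distance(ref, hyp) / len(ref)

print(round(wer("the cat sat on the mat", "the cat sat in the mat"), 3))  # → 0.167
```

Since a single wrong word usually means only a character or two is wrong, CER is normally well below WER, as in the fine-tuned row above.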

## Training hyperparameters

To fine-tune the wav2vec2-large-slavic-parlaspeech-hr model, we used the following hyperparameters:

| arg | value |
|---|---|
| `per_device_train_batch_size` | 8 |
| `gradient_accumulation_steps` | 2 |
| `num_train_epochs` | 35 |
| `learning_rate` | 3e-4 |
| `warmup_steps` | 500 |
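With gradient accumulation, the effective batch size is `per_device_train_batch_size` × `gradient_accumulation_steps` × the number of devices; a quick check (single-GPU training is an assumption, as the device count is not stated):

```python
# Hyperparameters from the table above.
hparams = {
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 35,
    "learning_rate": 3e-4,
    "warmup_steps": 500,
}

num_devices = 1  # assumption; not stated in the README
effective_batch_size = (hparams["per_device_train_batch_size"]
                        * hparams["gradient_accumulation_steps"]
                        * num_devices)
print(effective_batch_size)  # → 16
```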