fawzanaramam/Whisper-Small-Finetuned-on-Surah-Fatiha

@@ -4,7 +4,11 @@ language:
 license: apache-2.0
 base_model: openai/whisper-small
 tags:
-- generated_from_trainer
 datasets:
 - fawzanaramam/the-truth-1st-chapter
 metrics:
@@ -20,67 +24,76 @@ model-index:
       type: fawzanaramam/the-truth-1st-chapter
       args: 'config: ar, split: train'
     metrics:
-    - name: Wer
       type: wer
       value: 0.0
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # Whisper Small Finetuned on Surah Fatiha
-This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the The Truth 2.0 - Surah Fatiha dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.0088
-- Wer: 0.0
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 16
-- eval_batch_size: 8
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 10
-- training_steps: 100
-- mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch  | Step | Validation Loss | Wer     |
-|:-------------:|:------:|:----:|:---------------:|:-------:|
-| No log        | 0.5556 | 10   | 1.1057          | 96.2766 |
-| No log        | 1.1111 | 20   | 0.3582          | 29.7872 |
-| 0.6771        | 1.6667 | 30   | 0.1882          | 23.4043 |
-| 0.6771        | 2.2222 | 40   | 0.0928          | 25.0    |
-| 0.0289        | 2.7778 | 50   | 0.0660          | 34.0426 |
-| 0.0289        | 3.3333 | 60   | 0.0484          | 32.9787 |
-| 0.0289        | 3.8889 | 70   | 0.0241          | 25.5319 |
-| 0.0056        | 4.4444 | 80   | 0.0184          | 28.7234 |
-| 0.0056        | 5.0    | 90   | 0.0111          | 0.0     |
-| 0.0019        | 5.5556 | 100  | 0.0088          | 0.0     |
-### Framework versions
-- Transformers 4.41.1
-- Pytorch 2.2.1+cu121
-- Datasets 2.19.1
-- Tokenizers 0.19.1

 license: apache-2.0
 base_model: openai/whisper-small
 tags:
+- fine-tuned
+- Quran
+- automatic-speech-recognition
+- arabic
+- whisper
 datasets:
 - fawzanaramam/the-truth-1st-chapter
 metrics:
       type: fawzanaramam/the-truth-1st-chapter
       args: 'config: ar, split: train'
     metrics:
+    - name: Word Error Rate (WER)
       type: wer
       value: 0.0
 ---
 # Whisper Small Finetuned on Surah Fatiha
+This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) for transcribing Surah Fatiha, the first chapter of the Quran. It was trained on *The Truth 2.0 - Surah Fatiha* and reaches a Word Error Rate (WER) of **0.0** on the evaluation set, meaning the evaluation transcripts contained no word-level errors.
+## Model Description
+Whisper Small is a transformer-based automatic speech recognition (ASR) model developed by OpenAI. Fine-tuning it on *The Truth 2.0 - Surah Fatiha* specializes it for transcribing recitation of Surah Fatiha. It is intended to support religious, educational, and research tasks that require precise transcription of this chapter.
+## Performance Metrics
+On the evaluation set, the model achieved:
+- **Loss**: 0.0088
+- **Word Error Rate (WER)**: 0.0
+These metrics were measured on Surah Fatiha audio from the same dataset used for training; they do not measure performance on other chapters or on recordings that differ from the dataset.
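For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch in plain Python, not the evaluation code used for this model; the function name `word_error_rate` and the example strings are illustrative assumptions. The values in the training table appear to be percentages (for example 96.2766), which would correspond to this ratio multiplied by 100.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative example (not taken from the dataset): the opening verse,
# once transcribed exactly and once with one word dropped.
reference = "بسم الله الرحمن الرحيم"
exact = word_error_rate(reference, reference)
one_missing = word_error_rate(reference, "بسم الله الرحيم")
```

A WER of 0.0 therefore requires every reference word to appear in order with no insertions, deletions, or substitutions, after whatever text normalization the evaluation applied.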
+## Training Results
+The following table reports training loss, validation loss, and WER at every 10 training steps:
+| **Training Loss** | **Epoch** | **Step** | **Validation Loss** | **WER**    |
+|:------------------:|:---------:|:--------:|:-------------------:|:----------:|
+| No log            | 0.5556    | 10       | 1.1057              | 96.2766    |
+| No log            | 1.1111    | 20       | 0.3582              | 29.7872    |
+| 0.6771            | 1.6667    | 30       | 0.1882              | 23.4043    |
+| 0.6771            | 2.2222    | 40       | 0.0928              | 25.0       |
+| 0.0289            | 2.7778    | 50       | 0.0660              | 34.0426    |
+| 0.0289            | 3.3333    | 60       | 0.0484              | 32.9787    |
+| 0.0289            | 3.8889    | 70       | 0.0241              | 25.5319    |
+| 0.0056            | 4.4444    | 80       | 0.0184              | 28.7234    |
+| 0.0056            | 5.0       | 90       | 0.0111              | 0.0        |
+| 0.0019            | 5.5556    | 100      | 0.0088              | 0.0        |
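The Epoch and Step columns make it possible to estimate the size of the training split. The arithmetic below is a rough sketch that assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these are stated in the model card, so the resulting range is an estimate rather than a reported figure.

```python
# Values taken from the table and the hyperparameter list.
epoch_at_step_10 = 0.5556
train_batch_size = 16
total_steps = 100

# 10 steps cover about 0.5556 epochs, so one epoch is about 18 steps.
steps_per_epoch = round(10 / epoch_at_step_10)

# Assumption: one optimizer step consumes exactly one batch of 16 examples,
# and the final batch of an epoch may be partially filled.
max_examples = steps_per_epoch * train_batch_size
min_examples = (steps_per_epoch - 1) * train_batch_size + 1

# Cross-check against the last row of the table (step 100, epoch 5.5556).
epochs_at_last_step = round(total_steps / steps_per_epoch, 4)

print(steps_per_epoch, min_examples, max_examples, epochs_at_last_step)
```

Under these assumptions the training split holds somewhere between 273 and 288 examples, and the 100 training steps amount to roughly five and a half passes over it.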
+## Intended Uses & Limitations
+### Intended Uses
+- **Speech-to-text transcription** of Quranic recitation for Surah Fatiha.
+- Educational tools to assist in learning and practicing Quranic recitation.
+- Research and analysis of Quranic audio transcription methods.
+### Limitations
+- This model is fine-tuned specifically for Surah Fatiha and may not generalize well to other chapters or non-Quranic Arabic audio.
+- Variation in audio quality, accent, or recitation style may degrade performance.
+- The model performs best on clean, high-quality audio.
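A minimal sketch of how the model might be used for transcription with the Transformers `pipeline` API. It assumes the repository `fawzanaramam/Whisper-Small-Finetuned-on-Surah-Fatiha` contains the processor and tokenizer files alongside the weights, and that `transformers` and an audio decoding backend are installed. The helper name `transcribe` and the audio path in the usage note are hypothetical.

```python
MODEL_ID = "fawzanaramam/Whisper-Small-Finetuned-on-Surah-Fatiha"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import pipeline

    # Assumption: the repository can be loaded directly as an ASR pipeline.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("recitation.wav")`, where `recitation.wav` is a placeholder for a local recording of Surah Fatiha, would download the model on first use. For repeated calls it would be more efficient to build the pipeline once and reuse it.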
+## Training and Evaluation Data
+The model was trained and evaluated on *The Truth 2.0 - Surah Fatiha* (`fawzanaramam/the-truth-1st-chapter`), a dataset of high-quality audio recordings of Surah Fatiha paired with their transcripts. The dataset was curated to ensure the accuracy and authenticity of the Quranic content.
+## Training Procedure
+### Training Hyperparameters
 The following hyperparameters were used during training:
+- **Learning Rate**: 1e-05
+- **Training Batch Size**: 16
+- **Evaluation Batch Size**: 8
+- **Seed**: 42
+- **Optimizer**: Adam (betas=(0.9, 0.999), epsilon=1e-08)
+- **Learning Rate Scheduler**: Linear
+- **Warmup Steps**: 10
+- **Training Steps**: 100
+- **Mixed Precision Training**: Native AMP
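To make the scheduler settings concrete, the sketch below computes the learning rate that a linear schedule with warmup would produce for the values listed above (peak rate 1e-05, 10 warmup steps, 100 training steps). It mirrors the usual shape of such a schedule and is not the Trainer's own code; the function name `linear_lr` is an assumption.

```python
def linear_lr(step: int,
              base_lr: float = 1e-5,
              warmup_steps: int = 10,
              total_steps: int = 100) -> float:
    """Learning rate at a given step for linear warmup then linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at the same 10-step interval used in the results table.
for step in range(0, 101, 10):
    print(step, linear_lr(step))
```

With these settings the rate peaks at step 10, which is the first row of the results table, and falls to zero at step 100, so most of the run takes place during the decay phase.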
+### Framework Versions
+- **Transformers**: 4.41.1
+- **PyTorch**: 2.2.1+cu121
+- **Datasets**: 2.19.1
+- **Tokenizers**: 0.19.1