---
license: cc0-1.0
datasets:
- mah92/Khadijah-FA_EN-Public-Phone-Audio-Dataset
language:
- fa
- en
pipeline_tag: text-to-speech
---
# In the name of God, the Most Gracious, the Most Merciful: the key to the door of the Sage's treasure
# Model Card for Khadijah(SA)

This is the first Persian/English text-to-speech model built on the recent Matcha-TTS model.

It is much faster than VITS and produces better results.

It works best with the UNIVERSAL_V1_22050Hz HiFi-GAN vocoder.

You can try this model [here](https://huggingface.co/spaces/k2-fsa/text-to-speech) under the Persian+English section.

Enjoy!

## Usage with the Sherpa-onnx repo

Remember to add the required metadata to the ONNX file, as done in:
https://github.com/k2-fsa/icefall/blob/master/egs/ljspeech/TTS/matcha/export_onnx.py#L174
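
As a rough illustration of what "adding metadata" means, the sketch below writes key/value pairs into an ONNX file's `metadata_props` using the `onnx` Python package. The helper names and the example keys and values are assumptions made for illustration; the linked export_onnx.py remains the authoritative source for the key set sherpa-onnx expects.

```python
from typing import Any, Dict


def stringify_metadata(meta_data: Dict[str, Any]) -> Dict[str, str]:
    """ONNX metadata entries are string pairs, so convert every key and value."""
    return {str(key): str(value) for key, value in meta_data.items()}


def add_meta_data(filename: str, meta_data: Dict[str, Any]) -> None:
    """Replace the metadata_props of the ONNX model stored at `filename`."""
    import onnx  # imported lazily so stringify_metadata works without onnx

    model = onnx.load(filename)
    # Drop existing entries so stale keys do not linger.
    while len(model.metadata_props):
        model.metadata_props.pop()
    for key, value in stringify_metadata(meta_data).items():
        entry = model.metadata_props.add()
        entry.key = key
        entry.value = value
    onnx.save(model, filename)


# Illustrative values only (assumptions); copy the real keys from export_onnx.py.
example_meta = {
    "model_type": "matcha-tts",
    "voice": "fa",
    "sample_rate": 22050,
    "n_speakers": 1,
}

if __name__ == "__main__":
    print(stringify_metadata(example_meta))
    # add_meta_data("model.onnx", example_meta)  # hypothetical file name
```

The write is kept separate from the string conversion so the metadata can be inspected before the model file is touched.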

## Usage with the Matcha-TTS repo
1) In matcha/text/cleaners.py, set the language in the phonemizer.backend.EspeakBackend call:
```
    language="fa",
```

2) pip install piper-phonemize

3) In matcha/text/cleaners.py, add the persian_cleaners_piper function below:
```
import piper_phonemize


def persian_cleaners_piper(text):
    """Pipeline for Persian text: lowercasing, abbreviation expansion, and phonemization that keeps punctuation and stress."""
    # text = convert_to_ascii(text)  # left disabled
    text = lowercase(text)
    text = expand_abbreviations(text)
    # phonemize_espeak returns one phoneme list per sentence; only the first sentence is used here.
    phonemes = "".join(piper_phonemize.phonemize_espeak(text=text, voice="fa")[0])
    phonemes = collapse_whitespace(phonemes)

    # Remove unwanted symbols (e.g., '1')
    unwanted_symbols = {"1", "-"}  # Add any other unwanted symbols here
    filtered_phonemes = "".join(char for char in phonemes if char not in unwanted_symbols)

    return filtered_phonemes
```
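
The final filtering stage of this cleaner can be tried on its own, without installing piper-phonemize. The sketch below is a minimal stand-alone version of that stage; `filter_phonemes` is a hypothetical helper name, `collapse_whitespace` is re-implemented here on the assumption that it replaces runs of whitespace with a single space, and the sample string is made up rather than real eSpeak output.

```python
import re

UNWANTED_SYMBOLS = {"1", "-"}
_whitespace_re = re.compile(r"\s+")


def collapse_whitespace(text: str) -> str:
    """Replace every run of whitespace with a single space."""
    return _whitespace_re.sub(" ", text)


def filter_phonemes(phonemes: str) -> str:
    """Collapse whitespace, then drop every character listed in UNWANTED_SYMBOLS."""
    phonemes = collapse_whitespace(phonemes)
    return "".join(ch for ch in phonemes if ch not in UNWANTED_SYMBOLS)


if __name__ == "__main__":
    sample = "s-al1ˈɒːm   donjˈɒː"  # made-up phoneme string
    print(filter_phonemes(sample))
```

Any symbol added to `UNWANTED_SYMBOLS` should also be absent from the tokens file, since filtered characters never reach the model.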

4) Where text_to_sequence is called for inference (matcha/cli.py in the upstream Matcha-TTS repo), change the line to:
```
    intersperse(text_to_sequence(text, ["persian_cleaners_piper"])[0], 0),
```
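
To make the role of `intersperse` in that line concrete, here is a small stand-alone sketch of the operation: it places a blank token (ID 0 here) before, between, and after the token IDs. The function is re-implemented for illustration and the token IDs are made up.

```python
from typing import List


def intersperse(sequence: List[int], item: int) -> List[int]:
    """Return `sequence` with `item` placed before, between, and after its elements."""
    result = [item] * (len(sequence) * 2 + 1)
    result[1::2] = sequence
    return result


if __name__ == "__main__":
    token_ids = [12, 7, 31]  # made-up token IDs
    print(intersperse(token_ids, 0))  # [0, 12, 0, 7, 0, 31, 0]
```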

5) Also set the cleaner in configs/data/custom.yaml:
cleaners: [persian_cleaners_piper]

6) Replace the contents of matcha/text/symbols.py with:
```
# Adjust this path to wherever the tokens file lives on your machine.
TOKENS_FILE = "/home/oem/Basir/TTS/Matcha/Matcha-TTS/configs/tokens/tokens_sherpa_with_fa.txt"


def read_tokens(path=TOKENS_FILE):
    """Read one token per line; each line is '<token> <id>' and only the token is kept."""
    tokens = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            # Remove the newline character at the end
            line = line.rstrip("\n")
            if " " in line:
                token = line[: line.index(" ")]  # Everything before the first space
                if len(token) == 0:  # The line starts with a space, so the token is the space itself
                    token = " "
            else:
                token = line  # No space on the line: the entire line is the token
            tokens.append(token)
    return tokens


symbols = read_tokens()
```
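
The parsing rule above can be checked without the real tokens file. The sketch below applies the same rule to a few in-memory lines and builds a symbol-to-ID lookup by position. `parse_tokens` is a hypothetical helper and the sample lines are made up; the assumed file format is one `<token> <id>` pair per line. Because the ID column is discarded, a token's position in the file determines its ID, so the file should already be ordered by ID.

```python
from typing import Dict, Iterable, List


def parse_tokens(lines: Iterable[str]) -> List[str]:
    """Keep the text before the first space on each line; an empty token means the space character."""
    tokens = []
    for line in lines:
        line = line.rstrip("\n")
        token, separator, _token_id = line.partition(" ")
        if separator and token == "":
            token = " "  # the line started with a space, so the token is the space itself
        tokens.append(token)
    return tokens


# Made-up sample in the assumed "<token> <id>" format; the fourth line is the space token.
sample_lines = ["_ 0\n", "^ 1\n", "$ 2\n", "  3\n", "a 4\n"]

symbols = parse_tokens(sample_lines)
symbol_to_id: Dict[str, int] = {symbol: index for index, symbol in enumerate(symbols)}

if __name__ == "__main__":
    print(symbols)
    print(symbol_to_id)
```
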
7) If save_figure_to_numpy (in matcha/utils/utils.py) raises errors during training, change it to:
```
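import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import io


def save_figure_to_numpy(fig):
    """Render a Matplotlib figure to an in-memory PNG and return it as a NumPy array."""
    buf = io.BytesIO()
    fig.savefig(buf, format="png", bbox_inches="tight", pad_inches=0)
    buf.seek(0)
    img = Image.open(buf)
    data = np.array(img)  # Forces the image to be decoded before the buffer is closed
    buf.close()

    return data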
```

8) After exporting to ONNX, add the sherpa-onnx metadata if you want to use the model with sherpa-onnx:
```
python3 ./add_sherpa_metadata_to_matcha.py
```

## Training results
![Training Results](khadijah-22050.png)

## Credits

Trained by Ali Mahmoudi (@mah92)

Special thanks to Masoud Azizi (@Mablue), Amirreza Ramezani (@brightening-eyes), and Dr. Hamid Jafari (Khaneh Noor Iranian Basir).

Special thanks to the people of the @ttsfarsi channel.

I would also like to thank @csukuangfj from Xiaomi Corporation for the help and care given in the icefall and sherpa-onnx repos.

And we are nothing except through the mercy of our Lord.