tools/speech_data_explorer/README.md · camenduru/NeMo at main

Speech Data Explorer

Dash-based tool for interactive exploration of ASR/TTS datasets.

Features:

dataset's statistics (alphabet, vocabulary, duration-based histograms)
navigation across dataset (sorting, filtering)
inspection of individual utterances (waveform, spectrogram, audio player)
errors' analysis (Word Error Rate, Character Error Rate, Word Match Rate, Mean Word Accuracy, diff)
comparison of two ASR models using interactive word-level accuracy plot

Please make sure that requirements are installed. Then run:

python data_explorer.py path_to_manifest.json

To compare word-level accuracy of two ASR models:

python data_explorer.py path_to_manifest.json -nc pred_text_{model_1_name} pred_text_{model_2_name}

JSON manifest file should contain the following fields:

Errors' analysis requires "pred_text" (ASR transcript) for all utterances. "Visual comparison requires two or more "pred_text_{model_name}" fields."

Any additional field will be parsed and displayed in 'Samples' tab.