Spaces: argmaxinc/whisperkit-android-benchmarks

ardaatahan committed 25 days ago

Commit 5753a9f (1 parent: 79fc12a)

update performance text

Files changed (1)

constants.py CHANGED

@@ -77,8 +77,8 @@ PERFORMANCE_TEXT = dedent(
     - **Speed factor** (⬆️): Computed as the ratio of input audio length to end-to-end WhisperKit Android latency for transcribing that audio. A speed factor of N means N seconds of input audio was transcribed in 1 second.
     - **Tok/s (Tokens per second)** (⬆️): Total number of text decoder forward passes divided by the end-to-end processing time.
     ## Data
-   - **Short-form**: 5 hours of English audiobook clips with 30s/clip comprising the [librispeech test set](https://huggingface.co/datasets/argmaxinc/librispeech).
-    - **Long-form**: 12 hours of earnings call recordings with ~1hr/clip in English with various accents. Built by randomly selecting 10% of the [earnings22 test set](https://huggingface.co/datasets/argmaxinc/earnings22-12hours).
+   - **Short-form**: 10 minutes of English audiobook clips with 30s/clip comprising the [librispeech test set](https://huggingface.co/datasets/argmaxinc/librispeech).
+    - **Long-form**: 10 minutes of earnings call recordings in English with various accents. Built from the [earnings22 test set](https://huggingface.co/datasets/argmaxinc/earnings22-12hours).
 """
 )
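
The two metric definitions in the diff context (speed factor and Tok/s) are both plain ratios. Below is a minimal Python sketch of how they could be computed. The function names, parameter names, and sample numbers are hypothetical and are not taken from the repository.

```python
def speed_factor(audio_seconds: float, latency_seconds: float) -> float:
    """Seconds of input audio transcribed per second of end-to-end latency."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return audio_seconds / latency_seconds


def tokens_per_second(decoder_forward_passes: int, processing_seconds: float) -> float:
    """Text decoder forward passes per second of end-to-end processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return decoder_forward_passes / processing_seconds


# Hypothetical run: 10 minutes of audio transcribed in 20 seconds,
# with 1500 decoder forward passes (illustrative values only).
print(speed_factor(600.0, 20.0))       # 30.0
print(tokens_per_second(1500, 20.0))   # 75.0
```

Both metrics are "higher is better", matching the ⬆️ markers in the performance text.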