Note: These are only the weights for the classifier trained on the whisper-small embeddings.

Results of the classifier on Rob's human-annotated dataset (data/voicemail_human_eval.csv):

Results for chunk size 1 second:

  • Accuracy: 0.7480
  • Precision: 0.8681
  • Recall: 0.7396
  • F1 Score: 0.7987

Results for chunk size 2 seconds:

  • Accuracy: 0.7880
  • Precision: 0.9085
  • Recall: 0.7633
  • F1 Score: 0.8296

Results for chunk size 5 seconds:

  • Accuracy: 0.8480
  • Precision: 0.9456
  • Recall: 0.8225
  • F1 Score: 0.8797

Results for chunk size 10 seconds:

  • Accuracy: 0.8720
  • Precision: 0.9790
  • Recall: 0.8284
  • F1 Score: 0.8974

Results for full audio samples:

  • Accuracy: 0.8760
  • Precision: 0.9929
  • Recall: 0.8225
  • F1 Score: 0.8997
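
As a quick sanity check on the numbers above, the F1 score can be recomputed from the reported precision and recall as their harmonic mean, F1 = 2PR / (P + R). The sketch below is illustrative only and is not the evaluation script used for this model: the names `REPORTED`, `f1_score`, and `check_reported` are hypothetical, and the values are copied by hand from the lists above.

```python
# Reported (precision, recall, F1) per chunk size, copied from the results above.
REPORTED = {
    "1 s": (0.8681, 0.7396, 0.7987),
    "2 s": (0.9085, 0.7633, 0.8296),
    "5 s": (0.9456, 0.8225, 0.8797),
    "10 s": (0.9790, 0.8284, 0.8974),
    "full audio": (0.9929, 0.8225, 0.8997),
}


def f1_score(precision, recall):
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def check_reported(reported, tol=1e-3):
    """Map each setting to True if its reported F1 matches the recomputed one within tol."""
    return {
        name: abs(f1_score(p, r) - f1) <= tol
        for name, (p, r, f1) in reported.items()
    }


if __name__ == "__main__":
    for name, (p, r, f1) in REPORTED.items():
        print(f"{name}: recomputed F1 = {f1_score(p, r):.4f}, reported F1 = {f1:.4f}")
```

Because the published precision and recall are already rounded to four decimals, the recomputed F1 can differ from the reported value in the last digit, which is why the check uses a tolerance rather than exact equality.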