---
{}
---
Note: These are only the weights for the classifier trained on the `whisper-small` embeddings.

Results of the classifier on Rob's human-annotated dataset (`data/voicemail_human_eval.csv`):

Results for chunk size 1 second:
 - Accuracy: 0.7480
 - Precision: 0.8681
 - Recall: 0.7396
 - F1 Score: 0.7987 
 
Results for chunk size 2 seconds:
 - Accuracy: 0.7880
 - Precision: 0.9085
 - Recall: 0.7633
 - F1 Score: 0.8296 
 
Results for chunk size 5 seconds:
 - Accuracy: 0.8480
 - Precision: 0.9456
 - Recall: 0.8225
 - F1 Score: 0.8797 
 
Results for chunk size 10 seconds:
 - Accuracy: 0.8720
 - Precision: 0.9790
 - Recall: 0.8284
 - F1 Score: 0.8974 
 
Results for full audio samples:
 - Accuracy: 0.8760
 - Precision: 0.9929
 - Recall: 0.8225
 - F1 Score: 0.8997
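
As a quick consistency check, the F1 scores above can be recomputed from the reported precision and recall using the standard harmonic-mean formula. The sketch below is not the evaluation script used for this model; the `reported` dictionary and the `f1_score` helper are illustrative names, and the numbers are copied from the lists above.

```python
# Reported (precision, recall, F1) per setting, copied from the results above.
# The dictionary layout and key names are illustrative, not part of the repo.
reported = {
    "1 s chunks": (0.8681, 0.7396, 0.7987),
    "2 s chunks": (0.9085, 0.7633, 0.8296),
    "5 s chunks": (0.9456, 0.8225, 0.8797),
    "10 s chunks": (0.9790, 0.8284, 0.8974),
    "full audio": (0.9929, 0.8225, 0.8997),
}


def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for every setting from its precision and recall.
recomputed = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}

for name, (_, _, f1) in reported.items():
    print(f"{name}: reported F1 = {f1:.4f}, recomputed F1 = {recomputed[name]:.4f}")
```

Because the precision and recall values are already rounded to four decimals, the recomputed F1 can differ from the reported one in the last digit.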