Silero Waveform Analyzer
An interactive visualization tool for inspecting Silero VAD probability logs. Use it to fine-tune VAD thresholds and diagnose speech detection behavior.
How to Use
- Enable Probability Logging in Project Settings:
  Project Settings > Plugins > Aboleth Speech-to-Text > Debug > Enable Probability Logging
- Run your game and speak into the microphone. The plugin writes per-frame VAD data to:
- Open the analyzer using the button above (or the embedded view below) and drag your CSV file onto it.
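If you want to sanity-check a log before dropping it into the analyzer, a short script can do it. The sketch below is only an illustration: the column names `timestamp`, `probability`, and `is_speech` are assumptions (this page does not specify the CSV schema), and `load_vad_log` is a hypothetical helper, not part of the plugin. Check the header row of your own file and adjust accordingly.

```python
import csv
import io

# Hypothetical sample; the real log's column names and layout may differ.
SAMPLE_LOG = """timestamp,probability,is_speech
0.000,0.02,false
0.032,0.10,false
0.064,0.81,true
0.096,0.93,true
0.128,0.12,false
"""


def load_vad_log(text):
    """Parse CSV text into (timestamp, probability, is_speech) tuples."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append((
            float(record["timestamp"]),
            float(record["probability"]),
            record["is_speech"].strip().lower() == "true",
        ))
    return rows


frames = load_vad_log(SAMPLE_LOG)
# Find the frame with the highest speech probability.
peak = max(frames, key=lambda frame: frame[1])
print(f"{len(frames)} frames, peak probability {peak[1]:.2f} at {peak[0]:.3f}s")
```

To read a real file instead of the inline sample, pass the file's contents, for example `load_vad_log(open(path, newline="").read())`.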
Embedded Analyzer
What You Can See
| Element | Description |
|---|---|
| Green waveform | Raw Silero speech probability (0.0 to 1.0) over time |
| Green shaded regions | Periods where `is_speech = true` |
| Orange dashed line | Speech threshold: a probability above this value triggers speech detection |
| Pink vertical lines | Speech onset and offset events |
| Blue dotted lines | Streaming inference passes |
| Teal pills (WORDS track) | Committed words positioned at their audio timestamps |
| Purple pills (FINAL track) | Final transcription results |
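The relationship between the probability curve, the threshold line, and the onset and offset markers can be sketched in a few lines. This is a simplified model that marks an event wherever the probability crosses a single threshold; it is an assumption made for illustration, and the plugin's actual detector may apply extra smoothing or minimum-duration rules that this page does not describe. The function name `speech_events` is hypothetical.

```python
def speech_events(timestamps, probabilities, threshold):
    """Return (kind, timestamp) tuples at each threshold crossing.

    Simplified model: a frame counts as speech when its probability is
    strictly above the threshold. Not the plugin's actual logic.
    """
    events = []
    in_speech = False
    for t, p in zip(timestamps, probabilities):
        above = p > threshold
        if above and not in_speech:
            events.append(("onset", t))
        elif not above and in_speech:
            events.append(("offset", t))
        in_speech = above
    return events


# Made-up data for illustration only.
timestamps = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
probabilities = [0.05, 0.40, 0.85, 0.90, 0.45, 0.10]

events = speech_events(timestamps, probabilities, threshold=0.5)
print(events)
```

With this made-up data, the probability first rises above 0.5 at 0.2 s and first drops back below it at 0.4 s, so the sketch reports one onset and one offset.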
Interactive Controls
- Threshold: drag to visualize how different threshold values would affect speech detection.
- Zoom: zoom into dense regions and scroll horizontally when zoomed.
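The effect of dragging the threshold can also be approximated offline by sweeping a few candidate values over the logged probabilities. The sketch below assumes a fixed frame spacing of 32 ms, which is a placeholder; derive the real spacing from the timestamps in your own log. `speech_seconds` is a hypothetical helper, and the probabilities are made-up sample data.

```python
FRAME_SECONDS = 0.032  # assumed frame spacing; measure it from your own log


def speech_seconds(probabilities, threshold, frame_seconds=FRAME_SECONDS):
    """Total time covered by frames whose probability exceeds the threshold."""
    speech_frames = sum(1 for p in probabilities if p > threshold)
    return speech_frames * frame_seconds


# Made-up data for illustration only.
probabilities = [0.05, 0.20, 0.55, 0.80, 0.95, 0.70, 0.35, 0.10]

# Compare how much audio each candidate threshold would classify as speech.
sweep = {t: speech_seconds(probabilities, t) for t in (0.3, 0.5, 0.7)}
for threshold, seconds in sweep.items():
    print(f"threshold {threshold:.1f}: {seconds:.3f}s classified as speech")
```

A higher threshold classifies less audio as speech, which reduces false triggers on noise but risks clipping quiet word onsets; the sweep makes that trade-off visible as numbers before you change the project setting.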
Hover for Details
Hover over any point on the waveform to see its exact timestamp, probability, and speech state. Hover over a committed word in the WORDS track to see its audio position, duration, and commit latency.