Silero Waveform Analyzer

An interactive visualization tool for inspecting Silero VAD probability logs. Use it to fine-tune VAD thresholds and diagnose speech detection behavior.


How to Use

  1. Enable Probability Logging in Project Settings:

    Project Settings > Plugins > Aboleth Speech-to-Text > Debug > Enable Probability Logging

  2. Run your game and speak into the microphone. The plugin writes per-frame VAD data to:

    Plugins/AbolethSTT/Tools/silero_probabilities.csv
    
  3. Open the analyzer using the button above (or the embedded view below) and drag your CSV file onto it.
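
Before dragging the file onto the analyzer, it can help to confirm that the log was actually written and to see which columns it contains. The following is a minimal sketch, not part of the plugin. It assumes the CSV starts with a header row, and it deliberately hard-codes no column names, since the exact schema is not listed here. The helper name `read_probability_log` is hypothetical.

```python
import csv
from pathlib import Path

# Path taken from step 2 above; adjust it relative to your project root.
LOG_PATH = Path("Plugins/AbolethSTT/Tools/silero_probabilities.csv")


def read_probability_log(path):
    """Return (column_names, rows) for a VAD probability log.

    Assumption: the first line of the file is a header row. No specific
    column names are assumed; whatever the header contains is returned.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        return list(reader.fieldnames or []), rows


if __name__ == "__main__":
    if LOG_PATH.exists():
        columns, rows = read_probability_log(LOG_PATH)
        print(f"{len(rows)} rows, columns: {columns}")
    else:
        print(f"No log found at {LOG_PATH}; is Probability Logging enabled?")
```

If the script reports that no log was found, re-check the setting from step 1 and make sure the game ran long enough to capture audio.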


Embedded Analyzer


What You Can See

Element                     Description
Green waveform              Raw Silero speech probability (0.0 to 1.0) over time
Green shaded regions        Periods where is_speech = true
Orange dashed line          Speech threshold -- probability above this triggers speech detection
Pink vertical lines         Speech onset and offset events
Blue dotted lines           Streaming inference passes
Teal pills (WORDS track)    Committed words positioned at their audio timestamps
Purple pills (FINAL track)  Final transcription results

Interactive Controls

  • Threshold -- Drag to visualize how different threshold values would affect speech detection
  • Zoom -- Zoom in on dense regions. Scroll horizontally when zoomed in.
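
To build intuition for what the Threshold control shows, the sketch below applies a plain per-frame comparison to a list of probabilities and reports the resulting speech regions. This is an illustration only: the function name `speech_regions` and the sample values are hypothetical, and the plugin's real detection may also apply smoothing, hysteresis, or minimum-duration rules that this naive version omits.

```python
def speech_regions(probabilities, threshold):
    """Return (start, end) frame-index pairs where probability > threshold.

    The end index is exclusive. This is a naive per-frame comparison,
    not the plugin's actual detection logic.
    """
    regions = []
    start = None
    for index, probability in enumerate(probabilities):
        if probability > threshold:
            if start is None:
                start = index  # a new speech region begins here
        elif start is not None:
            regions.append((start, index))  # the region ended on the previous frame
            start = None
    if start is not None:
        # Speech was still active when the log ended.
        regions.append((start, len(probabilities)))
    return regions


# Hypothetical per-frame probabilities, invented for illustration.
sample = [0.05, 0.20, 0.65, 0.90, 0.40, 0.10, 0.55, 0.80, 0.30]
for threshold in (0.3, 0.5, 0.7):
    print(threshold, speech_regions(sample, threshold))
```

Raising the threshold shrinks and splits the detected regions, while lowering it merges and extends them. This is the same trade-off that dragging the threshold line makes visible in the analyzer.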

Hover for Details

Hover over any point on the waveform to see its exact timestamp, probability, and speech state. Hover over a committed word in the WORDS track to see its audio position, duration, and commit latency.