STT-CLI

Mantej Singh Dhanjal · Mantej-Singh.STT-CLI

Hybrid speech-to-text CLI tool with offline Whisper support for Windows

STT-CLI v2.0 is a Windows-only speech-to-text tool with hybrid engine support (Whisper offline + Google online). Designed for users on corporate laptops where Win+H is disabled by IT policies. Runs as a background system tray application with no visible window, using a global hotkey (double-tap Left Alt) to toggle recording. Perfect for use with Windows Terminal, PowerShell, Claude Code, and Gemini CLI.

Key Features:

  • Hybrid Speech-to-Text: Whisper (offline) + Google Web Speech API (online) with switchable engines
  • Offline Mode: Works without internet using OpenAI Whisper (MIT licensed, privacy-focused)
  • Engine Toggle: System tray menu to switch between Auto-Detect, Whisper, and Google modes
  • About Menu: Version info and engine status at your fingertips
  • Auto-start on Windows Boot with persistent settings
  • Global hotkey activation (double-tap Left Alt)
  • Background operation with system tray icon
  • Balloon notifications for recording status and engine changes
  • Automatic CLI window detection (security feature)
  • No admin rights required
  • Portable executable, no installation needed
  • Works on corporate laptops with restricted permissions and offline environments
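
The double-tap hotkey can be pictured as a small timing check: two Left Alt presses count as a double tap only if they arrive close together. The sketch below is illustrative and not taken from the STT-CLI source. The class name DoubleTapDetector, the 0.4 second window, and the injected clock are all assumptions made for the example. Wiring it to a real keyboard hook is left out.

```python
import time
from typing import Callable, Optional


class DoubleTapDetector:
    """Reports a double tap when two presses land within `window` seconds.

    Hypothetical helper: the name and the default window are assumptions,
    not values documented by STT-CLI.
    """

    def __init__(self, window: float = 0.4,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window = window
        self.clock = clock
        self._last_press: Optional[float] = None

    def on_press(self) -> bool:
        """Call once per key-down event; returns True on the second tap."""
        now = self.clock()
        is_double = (self._last_press is not None
                     and (now - self._last_press) <= self.window)
        # Reset after a double tap so a third quick press starts a new pair.
        self._last_press = None if is_double else now
        return is_double
```

A keyboard listener would call on_press() for each Left Alt key-down and toggle recording whenever it returns True. Injecting the clock keeps the timing logic testable without real key events.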

winget install --id Mantej-Singh.STT-CLI --exact --source winget

Latest 2.0.0

Release Notes

v2.0.0 - Hybrid Speech-to-Text with Whisper Integration (November 26, 2025)

🚀 Major New Features:

  • Offline Speech Recognition: OpenAI Whisper "tiny" model (MIT licensed) for fully offline operation
  • Hybrid Engine Architecture: Auto-switch between Google (online) and Whisper (offline) based on connectivity
  • Engine Selection Menu: System tray submenu to manually choose Auto-Detect, Whisper, or Google engines
  • About Menu: New "About" section showing version, author, current engine, and Whisper status
  • Batch Launchers: run.bat (normal mode) and run-debug.bat (troubleshooting mode) for easy startup
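
The Auto-Detect behavior described above (Google when online, Whisper when offline, with a manual override from the tray menu) could be sketched roughly as follows. This is an assumption-laden illustration rather than the project's actual logic: the function names is_online and pick_engine, the preference strings, and the probe target (a public DNS server at 8.8.8.8 on port 53) are all hypothetical, and a restricted corporate network may block that probe.

```python
import socket
from typing import Callable


def is_online(host: str = "8.8.8.8", port: int = 53,
              timeout: float = 1.5) -> bool:
    """Best-effort connectivity probe: try to open a TCP connection.

    The host and port are assumptions chosen for the example.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_engine(preference: str,
                probe: Callable[[], bool] = is_online) -> str:
    """Resolve a tray-menu preference to a concrete engine name."""
    if preference in ("whisper", "google"):
        return preference  # a manual choice always wins
    if preference != "auto":
        raise ValueError(f"unknown engine preference: {preference!r}")
    # Auto-Detect: prefer the online engine, fall back to offline Whisper.
    return "google" if probe() else "whisper"
```

For example, pick_engine("auto") returns "whisper" on a machine with no network access, while pick_engine("google") returns "google" regardless of connectivity.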

🔧 Technical Implementation:

  • faster-whisper: Up to 4x faster than the reference OpenAI Whisper implementation, using the CTranslate2 backend
  • INT8 Quantization: CPU-optimized for speed (greedy decoding, beam_size=1)
  • Lazy Model Loading: Whisper loads on first use (~75MB one-time download)
  • Model Cache: Stored in %APPDATA%\stt-cli\models for persistence
  • Settings Persistence: Engine preference saved across sessions
  • Hidden Imports: PyInstaller bundles av, faster_whisper, numpy, ctranslate2
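
The combination of lazy loading, INT8 quantization, greedy decoding, and a persistent model cache might look something like the sketch below. It uses the public faster-whisper API (WhisperModel with compute_type="int8" and download_root, and transcribe with beam_size=1), but the surrounding structure is an assumption: LazyWhisper, load_tiny_int8, and default_cache_dir are hypothetical names, and the real application may organize this differently.

```python
import os
import threading
from pathlib import Path
from typing import Any, Callable, Optional


def default_cache_dir() -> Path:
    """Model cache under %APPDATA%\\stt-cli\\models (home dir as fallback)."""
    appdata = os.environ.get("APPDATA", str(Path.home()))
    return Path(appdata) / "stt-cli" / "models"


def load_tiny_int8(cache_dir: Path) -> Any:
    """Load the Whisper "tiny" model with INT8 quantization on CPU."""
    # Imported here so the heavy dependency is only needed on first use.
    from faster_whisper import WhisperModel
    cache_dir.mkdir(parents=True, exist_ok=True)
    return WhisperModel("tiny", device="cpu", compute_type="int8",
                        download_root=str(cache_dir))


class LazyWhisper:
    """Defers model loading until the first transcription request."""

    def __init__(self, loader: Callable[[Path], Any] = load_tiny_int8,
                 cache_dir: Optional[Path] = None) -> None:
        self._loader = loader
        self._cache_dir = cache_dir or default_cache_dir()
        self._lock = threading.Lock()
        self._model: Any = None

    def model(self) -> Any:
        # The lock prevents two threads from loading the model twice.
        with self._lock:
            if self._model is None:
                self._model = self._loader(self._cache_dir)
            return self._model

    def transcribe(self, audio: Any) -> str:
        # beam_size=1 selects greedy decoding, trading accuracy for speed.
        segments, _info = self.model().transcribe(audio, beam_size=1)
        return "".join(segment.text for segment in segments).strip()
```

With this shape, startup stays fast because nothing is downloaded or loaded until the first call to transcribe(), and later calls reuse the same in-memory model.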

⚡ Performance:

  • Whisper Latency: ~1-2s transcription time (tiny model)
  • Google Latency: ~0.5s transcription time (faster but requires internet)
  • Model Size: 75MB (tiny model, one-time download)
  • Executable Size: ~150MB (includes all Whisper dependencies)

🎯 Use Cases:

  • Corporate Environments: Bypass Win+H restrictions with offline capability
  • Air-Gapped Networks: Work completely offline with Whisper mode
  • Privacy-Focused: Audio never leaves your machine in Whisper mode
  • Variable Connectivity: Auto-switch between Google (accurate) and Whisper (offline)

📚 Documentation:

  • Comprehensive architecture documentation in /docs
  • Whisper model download and caching explained
  • Threading model and flow diagrams
  • Batch launcher usage guide

All v1.4.0 functionality remains intact:

  • Auto-start on Windows Boot
  • Double-tap Left Alt hotkey
  • System tray operation
  • Balloon notifications
  • CLI window detection
  • No admin rights required

Installer type: portable

Architecture  Scope  Download  SHA256
x64           -      Download  c3f2051b6cb2c8d844b81a6cb9bcf618dfc3dd148d96aebc4f34b4db7035c005
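
Because the executable is portable and may be downloaded manually, it can be worth checking the file against the SHA256 value listed above. The following is a minimal sketch using only the Python standard library. The function names sha256_of and matches are made up for this example, and the path to the downloaded file is whatever location you saved it to.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Union


def sha256_of(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Union[str, Path], expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

Calling matches(downloaded_path, "c3f2051b6cb2c8d844b81a6cb9bcf618dfc3dd148d96aebc4f34b4db7035c005") should return True for an unmodified v2.0.0 x64 download, where downloaded_path is a placeholder for your local file.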

Details

Homepage: https://github.com/Mantej-Singh/stt-cli
License: MIT
Publisher: Mantej Singh Dhanjal
Support: https://github.com/Mantej-Singh/stt-cli/issues
Copyright: Copyright (c) 2025 Mantej Singh Dhanjal
Moniker: stt-cli

Tags

speech-to-text, voice-recognition, cli, command-line, windows, accessibility, productivity, voice-control, speech-recognition, terminal, system-tray, startup, whisper, offline, openai, privacy

Older versions (2)

1.4.0
Architecture  Scope  Download  SHA256
x64           -      Download  ea1a35e0694297750cba749a487f1b844b9bfc66149ef61d66d3ed3882b4d03d

1.3.1
Architecture  Scope  Download  SHA256
x64           -      Download  aa32784a05f164e816b7ff4cbb3caf4da0f478b121ac7f75d3e10c384b593204