Hi everyone,
I’ve been exploring the ESP32-S3 for offline voice control, and I wanted to share some info and see if anyone has insights or is already working on it.
Background
- The ESP32-S3 chip has been around since 2021 and comes with AI/DSP acceleration, which makes it well suited to audio tasks like microphone input, keyword spotting, and simple speech processing.
- ESPHome added experimental audio pipeline support starting mid-2023, and by ESPHome 2023.8 the voice_assistant: component was introduced.
What voice_assistant: Does
- Streams microphone input in real time from ESP32-S3 devices.
- Uses ESPHome’s native API (protobuf) for efficient, low-latency audio streaming.
- Currently designed only for Home Assistant, where the audio can be sent to any STT engine (Whisper, Google STT, etc.). Piper, often mentioned alongside these, is a TTS engine and handles the spoken response rather than recognition.
Limitation for OpenHAB
- OpenHAB has no native support for ESPHome audio streaming.
- That means the efficient streaming path from the ESP32-S3 is not accessible natively.
- Workarounds today would require one of the following:
  - Sending audio via HTTP uploads (WAV/RAW clips) or MQTT (Base64), which is less efficient.
  - Implementing a bridge/gateway that understands the ESPHome audio protocol and forwards audio to OpenHAB/STT.
  - Developing a dedicated OpenHAB binding to support ESPHome’s protobuf audio stream.
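To put a rough number on "less efficient" for the HTTP/MQTT workaround, here is a minimal Python sketch that wraps a raw PCM clip in a WAV container and then Base64-encodes it, as an MQTT payload would require. The audio format (16 kHz, 16-bit, mono) and the helper names pcm_to_wav and clip_sizes are my own assumptions for illustration; they are not taken from ESPHome or OpenHAB.

```python
import base64
import io
import wave

# Assumed audio format (not from ESPHome docs): 16 kHz, 16-bit, mono PCM.
SAMPLE_RATE = 16_000
SAMPLE_WIDTH = 2   # bytes per sample
CHANNELS = 1


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap raw PCM samples in an in-memory WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()


def clip_sizes(seconds: float) -> dict:
    """Return payload sizes in bytes for a silent clip of the given length."""
    pcm = bytes(int(SAMPLE_RATE * seconds) * SAMPLE_WIDTH * CHANNELS)
    wav = pcm_to_wav(pcm)
    b64 = base64.b64encode(wav)  # what an MQTT text payload would carry
    return {"pcm": len(pcm), "wav": len(wav), "base64": len(b64)}


if __name__ == "__main__":
    for name, size in clip_sizes(3.0).items():
        print(f"{name:>6}: {size} bytes")
```

The WAV header itself is negligible, but Base64 inflates the payload by about one third, and a clip-based upload can only start once the whole utterance has been recorded, which is where the extra latency compared to streaming comes from.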
Why This Is Exciting
- ESP32-S3 + I²S microphone = a low-cost, distributed, offline voice control solution.
- With wake-word detection running locally (e.g., microWakeWord), the device can wake up locally and stream audio only when needed.
- OpenHAB could greatly benefit from the same efficiency and low-latency pipeline if someone implements support.
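The "stream audio only when needed" idea can be sketched as a simple gate. This is only an illustration of the logic in Python, not how microWakeWord or ESPHome implement it on the device; gate_on_wake_word, the is_wake_word callback, and max_frames are hypothetical names I made up for this example.

```python
from typing import Callable, Iterable, Iterator


def gate_on_wake_word(
    frames: Iterable[bytes],
    is_wake_word: Callable[[bytes], bool],  # hypothetical detector callback
    max_frames: int,
) -> Iterator[bytes]:
    """Forward only the audio frames that follow a wake-word detection.

    Frames before a detection are dropped. After each detection, up to
    max_frames frames are forwarded, then the gate closes again.
    """
    remaining = 0
    for frame in frames:
        if remaining > 0:
            remaining -= 1
            yield frame          # gate open: this frame would be streamed
        elif is_wake_word(frame):
            remaining = max_frames  # open the gate for the next frames


if __name__ == "__main__":
    # Toy input: byte strings stand in for audio frames.
    audio = [b"noise", b"WAKE", b"turn", b"on", b"light", b"noise"]
    sent = list(gate_on_wake_word(audio, lambda f: f == b"WAKE", max_frames=2))
    print(sent)
```

A real device would more likely end the stream on detected silence rather than a fixed frame count; the fixed limit just keeps the sketch short.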
So my question to the community:
- Has anyone started working on an OpenHAB binding or middleware for ESPHome audio?
- Any ideas for a lightweight approach to integrate the ESP32-S3 voice_assistant: pipeline into OpenHAB?
Would love to hear thoughts, tips, or if someone has already experimented with this.
Thanks!
PS: I read the thread "Creation an voice audio satellite with the help of an Esp32" (post #11 by moe), but did not see any results there.
PPS: Example hardware on Amazon.de: Waveshare ESP32-S3 1.46 Inch Round Display Development Board (412 x 412, Wi-Fi and Bluetooth, accelerometer and gyroscope, onboard speaker and microphone, protective cover glass).
Hardware is getting really cheap (on AliExpress even 50% of that price).