What native ESPHome unlocks
Tater now owns the full ESPHome voice experience directly: discovery, room-aware voice sessions, live device state, and on-device playback all run inside the main app instead of a downloadable core.
Built-in ESPHome device runtime inside Tater for VoicePE, Sat1, and future native devices, with satellites, live entities, logs, stats, and the full voice pipeline on the main app port.
46 current Verbas advertise direct support for this runtime.
ESPHome is configured through Settings -> ESPHome, which is split into Satellites, Settings, and Stats tabs, while shared STT/TTS model choices live in Settings -> Models. The runtime is built into the main Tater app rather than shipped as a separate downloadable core.
These notes focus on the built-in ESPHome runtime, the live voice pipeline, shared speech backends, and the operator tools now living directly inside Tater.
ESPHome is no longer a shop core: it is part of the main Tater app and starts alongside it.
The live voice loop uses shared STT/TTS choices from the Models tab while keeping ESPHome-specific controls in one native screen.
The ESPHome screen now separates devices, settings, and stats so tuning is based on real behavior instead of guesswork.
Optional experimental toggles let operators test more aggressive voice behavior on hardware that can support it.
Returns the current Satellites, Settings, and Stats payload so the WebUI can render discovery state, device cards, voice metrics, and runtime controls.
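As a rough sketch of how the WebUI might consume that payload, the snippet below collapses it into the counts a device-card view could render. All field names here (`satellites`, `connected`, `stats`, `voice_sessions`) are illustrative assumptions, not Tater's actual schema:

```python
# Hypothetical shape of the Satellites/Settings/Stats payload (field names
# are illustrative assumptions, not Tater's actual schema).
def summarize_payload(payload: dict) -> dict:
    """Collapse a payload into the counts a device-card view might render."""
    satellites = payload.get("satellites", [])
    return {
        "discovered": len(satellites),
        "connected": sum(1 for s in satellites if s.get("connected")),
        "voice_sessions": payload.get("stats", {}).get("voice_sessions", 0),
    }

example = {
    "satellites": [
        {"name": "voicepe-kitchen", "connected": True},
        {"name": "sat1-office", "connected": False},
    ],
    "stats": {"voice_sessions": 12},
}
print(summarize_payload(example))
# {'discovered': 2, 'connected': 1, 'voice_sessions': 12}
```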
Handles refresh, connect/disconnect, save/forget satellite actions, live log lifecycle, and direct entity-control actions from the ESPHome settings screen.
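One way to picture that handler is as a dispatcher keyed on action name. The action names and parameter shapes below are assumptions modeled on the behaviors listed above, not the actual Tater API:

```python
# Hypothetical dispatcher for the actions the ESPHome settings screen sends.
# Action names and parameters are assumptions, not Tater's real API.
from typing import Callable

def make_dispatcher() -> Callable[[str, dict], str]:
    handlers = {
        "refresh": lambda p: "rescanning for satellites",
        "connect": lambda p: f"connecting to {p['device']}",
        "disconnect": lambda p: f"disconnecting {p['device']}",
        "save": lambda p: f"saving {p['device']}",
        "forget": lambda p: f"forgetting {p['device']}",
    }
    def dispatch(action: str, params: dict) -> str:
        handler = handlers.get(action)
        if handler is None:
            raise ValueError(f"unknown action: {action}")
        return handler(params)
    return dispatch

dispatch = make_dispatcher()
print(dispatch("connect", {"device": "voicepe-kitchen"}))
# connecting to voicepe-kitchen
```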
Returns selected speech backends, effective fallback state, model roots, discovery state, selector sessions, and availability of local STT/TTS backends.
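The "effective fallback state" can be read as: prefer the selected local backend when it is actually available, otherwise fall back. A minimal sketch under that assumption, with all backend names invented for illustration:

```python
# Hypothetical resolution of an effective speech backend: use the selected
# backend when available, else the fallback. Backend names are illustrative.
def effective_backend(selected: str, available: set, fallback: str) -> str:
    return selected if selected in available else fallback

print(effective_backend("whisper-local", {"whisper-local"}, "cloud-stt"))
# whisper-local
print(effective_backend("whisper-local", set(), "cloud-stt"))
# cloud-stt
```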
Returns the live entity snapshot so verbas and operators can inspect sensors, buttons, numbers, switches, lights, and other exposed device entities.
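An operator view over such a snapshot would typically group entities by type. The snapshot shape below (a list of `type`/`id` records) is an assumption for illustration, not Tater's actual entity format:

```python
from collections import defaultdict

# Hypothetical entity-snapshot inspection: group exposed entities by type so
# sensors, switches, lights, etc. can be listed together. Shape is assumed.
def group_entities(snapshot: list) -> dict:
    groups = defaultdict(list)
    for entity in snapshot:
        groups[entity["type"]].append(entity["id"])
    return dict(groups)

snapshot = [
    {"type": "sensor", "id": "temp_kitchen"},
    {"type": "switch", "id": "mute"},
    {"type": "light", "id": "ring_led"},
    {"type": "sensor", "id": "noise_floor"},
]
print(group_entities(snapshot))
# {'sensor': ['temp_kitchen', 'noise_floor'], 'switch': ['mute'], 'light': ['ring_led']}
```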
Supports button, number, switch, select, text, and light-control actions so device-local flows can act directly on the speaking device.
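A device-local flow would want to reject actions for unsupported entity types before sending them. The sketch below checks against exactly the types listed above; the validation rules themselves are assumptions:

```python
# Hypothetical validation that an entity-control action targets one of the
# supported entity types listed above. The rules are illustrative assumptions.
SUPPORTED = {"button", "number", "switch", "select", "text", "light"}

def validate_action(entity_type: str, value) -> None:
    if entity_type not in SUPPORTED:
        raise ValueError(f"unsupported entity type: {entity_type}")
    if entity_type == "number" and not isinstance(value, (int, float)):
        raise TypeError("number entities require a numeric value")

validate_action("switch", True)   # ok: switch accepts a boolean
validate_action("number", 0.5)    # ok: number accepts a numeric value
```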
Used for device-local playback flows such as announcements, generated audio, and other responses that should play on the speaking satellite itself.
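The core idea of device-local playback is that the response targets the satellite that initiated the session rather than a default output. A minimal sketch of that routing decision, with all field names and the URL invented for illustration:

```python
# Hypothetical device-local playback routing: audio plays on the satellite
# that started the voice session. All names and fields are illustrative.
def route_playback(session: dict, audio_url: str) -> dict:
    """Target the speaking satellite rather than a default output device."""
    return {
        "target": session["satellite"],  # the device that heard the wake word
        "room": session.get("room"),
        "audio_url": audio_url,
    }

session = {"satellite": "voicepe-kitchen", "room": "kitchen"}
print(route_playback(session, "http://tater.local/tts/reply.wav"))
```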