Mastering Local Text-to-Speech in Home Assistant: Integrating Piper and Wyoming

NGC 224

#Home_Assistant
#TTS
#Wyoming
#Piper
#Local_Control
#Privacy
#Automation

6 days ago


Cloud-based text-to-speech (TTS) services such as Google Cloud TTS or Amazon Polly are easy to use and offer high-quality voices, but they come with inherent downsides: they depend on an internet connection, add network latency, raise privacy concerns (your text is sent to a third party), and can incur costs depending on usage.

For many Home Assistant users, the desire for a fully local, private, and reliable smart home extends to voice output. This is where local TTS engines shine. One solution that has gained traction recently is Piper, a fast and lightweight neural text-to-speech system that connects to Home Assistant through the Wyoming protocol.

Why Choose Local TTS with Piper and Wyoming?

Privacy: Your text never leaves your local network.
Speed: Announcements are often faster as there's no round trip to the cloud.
Reliability: Works even if your internet connection is down.
Cost-Effective: No usage fees.
Customization: Access to various voices (though selection is more limited than commercial cloud services).

The Wyoming protocol is a relatively new standard specifically designed for connecting voice assistants and services (like TTS, STT, Wake Word detection) locally. Home Assistant's voice features are increasingly leveraging this protocol.
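To make the idea of a local voice protocol more concrete, here is a rough sketch of how a Wyoming message could be framed. It assumes the framing described in the Wyoming project's own documentation: each event is one line of JSON with a "type" field and optional "data", optionally followed by raw payload bytes whose size is given by "payload_length". The helper names (encode_event, decode_header) are made up for this illustration, and the details should be checked against the protocol documentation before relying on them.

```python
import json


def encode_event(event_type, data=None, payload=b""):
    """Serialize one event: a JSON header line, then optional raw payload bytes."""
    header = {"type": event_type}
    if data:
        header["data"] = data
    if payload:
        # The header announces how many raw bytes follow the newline.
        header["payload_length"] = len(payload)
    return json.dumps(header).encode("utf-8") + b"\n" + payload


def decode_header(line):
    """Parse a single header line (bytes) back into a dict."""
    return json.loads(line.decode("utf-8"))


# A client asking a server what it can do, and a request to speak some text.
describe = encode_event("describe")
synthesize = encode_event("synthesize", {"text": "The front door is open"})
```

In practice Home Assistant handles all of this for you; the sketch is only meant to show that the protocol is simple, line-oriented, and stays entirely on your network.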

Setting Up Piper and Wyoming in Home Assistant

Integrating Piper typically involves two main steps: running the Piper service (which implements the Wyoming protocol) and configuring the Wyoming integration in Home Assistant to connect to it.

Step 1: Run the Piper Service (Wyoming Server)

There are several ways to run the Piper service:

Option A: Home Assistant Add-on (Recommended for Home Assistant OS/Supervised)

This is the easiest method if you're running an installation type that supports add-ons (Home Assistant OS or Home Assistant Supervised):

Navigate to Settings > Add-ons.
Click on the Add-on Store button (bottom right).
Search for 'Piper'.
Install the official 'Piper' add-on.
(Optional but recommended) Go to the Configuration tab of the add-on. You can specify which voices to download. This saves disk space compared to downloading all voices. For example, to download a US English voice, add something like: !$0$! Check the add-on documentation for available voice names.
Go to the Info tab and Start the add-on.
Enable 'Watchdog' and 'Auto update' for reliability.

Option B: Docker Container

If you're running Home Assistant Container or want to run Piper on a separate machine:

Ensure you have Docker installed.
Pull the Piper image (available from repositories like 'rhasspy/wyoming-piper'). The exact command might vary; check the latest documentation for the specific image you're using (e.g., !$1$!).
Run the container. You'll need to expose the Wyoming protocol port (typically 10200) and potentially mount a volume for voices if you don't bake them into a custom image. A basic run command might look like: !$2$! Refer to the container image's documentation for specifying voices during startup or configuration.

Option C: Manual Installation

For advanced users, you can install Piper and its dependencies manually on a compatible system and run the !$3$! or similar script as a service, ensuring it listens on a TCP socket using the Wyoming protocol on port 10200. This requires more technical expertise and service management.
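Whichever option you choose, it can help to confirm that something is actually listening on the Wyoming port before moving on. The following is a minimal sketch using only the Python standard library; the function name wyoming_port_open is invented for this example, and the check only proves that the TCP port accepts connections, not that the service behind it is Piper.

```python
import socket


def wyoming_port_open(host, port=10200, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# Example call (replace the placeholder address with your Piper host):
# wyoming_port_open("192.0.2.10")
```

If this returns False from the machine running Home Assistant, check firewall rules and the container's port mapping before configuring the integration.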

Step 2: Configure the Wyoming Integration in Home Assistant

Once the Piper service is running and accessible on your network (default port 10200), you need to tell Home Assistant where to find it:

In Home Assistant, go to Settings > Devices & Services.
Click the '+ Add Integration' button.
Search for 'Wyoming'.
Select the 'Wyoming' integration.
Enter the IP address and port of the machine running the Piper service (e.g., !$4$! and !$5$! if using the official add-on running on the same Home Assistant instance, or !$6$! if running on another machine).
Click 'Submit'.
If the connection is successful, Home Assistant will discover the capabilities provided by the Piper service (in this case, TTS).
Click 'Finish'.

Home Assistant should now have a new TTS entity available, typically named something like !$7$! (or based on the add-on/service name).

Integration Tips and Usage

Now that Piper is integrated, you can use it throughout Home Assistant.

Using the TTS Service

The primary way to use TTS is via the !$8$! service. You can call this service from Developer Tools > Services, or within automations and scripts.

!$9$!

Replace !$10$! with the actual entity ID of a media player device in your home (e.g., a Google Home speaker integrated via Google Cast, a Sonos speaker, a DLNA renderer, or even Home Assistant's built-in audio output if configured). The !$11$! parameter is used if your Piper instance supports multiple voices and you want to choose a specific one.
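If you want to trigger an announcement from outside Home Assistant, the same kind of call can be made through the REST API. The sketch below assumes the service in question is tts.speak and that you have created a long-lived access token from your user profile. The entity IDs, host name, and voice name in the usage line are hypothetical placeholders, and build_tts_request is a helper invented for this example.

```python
import json
import urllib.request


def build_tts_request(base_url, token, tts_entity, media_player, message, voice=None):
    """Build (but do not send) a POST request that calls the tts.speak service."""
    payload = {
        "entity_id": tts_entity,                 # the TTS engine entity
        "media_player_entity_id": media_player,  # where the audio is played
        "message": message,
    }
    if voice:
        # Only include options when a specific voice was requested.
        payload["options"] = {"voice": voice}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/services/tts/speak",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical usage; every value here is a placeholder:
# req = build_tts_request("http://homeassistant.local:8123", "YOUR_TOKEN",
#                         "tts.piper", "media_player.kitchen", "Laundry is done")
# urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the payload first, which is handy when a call fails because of a mistyped entity ID.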

Setting a Default TTS Entity

You can set your Piper TTS entity as the default TTS provider in Home Assistant's configuration.yaml:

!$12$!

After setting this up and restarting Home Assistant, you can often omit the !$13$! in service calls, as Home Assistant will use the default.

Using Templating in Messages

As shown in the example above, you can use Home Assistant's powerful templating engine within your TTS messages to announce dynamic information like sensor readings, states of devices, or greetings based on presence.

Multiple Voices

If your Piper installation provides multiple voices (either via the add-on configuration or how you set up the Docker/manual install), you can specify the desired voice in the !$14$! part of the !$15$! service call.

Best Practices for a Reliable Local TTS System

Resource Management: Piper is generally efficient, but generating speech requires CPU resources, especially for longer messages or multiple simultaneous requests. Monitor the resource usage of the machine or add-on running Piper. If you experience delays or stutters, consider running Piper on more powerful hardware or optimizing your setup.
Network Reliability: Ensure the network connection between your Home Assistant instance and the machine running Piper is stable. Wired connections are preferable for reliability, especially if running on a separate device.
Voice Management: Only download and keep the voices you actually use. Voice files can be large and consume significant disk space.
Caching: Use the !$16$! option in your !$17$! service calls. Home Assistant will store the generated audio file and reuse it for identical messages, significantly reducing the load on the Piper service and speeding up repeated announcements (e.g., 'The front door is open').
Fallback (Advanced): For critical announcements, you might consider having a cloud TTS service configured as a secondary option and use automation logic to fall back to it if the local Piper service is unavailable (e.g., ping the Piper host before calling the service).
Alias the Integration: If you set up the Wyoming integration manually via YAML, you can give it a friendly alias using the !$18$! key. This can make your !$19$! service calls more readable (e.g., !$20$!).
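The fallback idea from the list above can be sketched in a few lines. This is only an illustration of the decision logic for an external script, not how Home Assistant itself implements anything; inside Home Assistant you would express the same choice with automation conditions. The function name and both entity IDs are hypothetical.

```python
def choose_tts_entity(local_is_up, local_entity, cloud_entity):
    """Pick the local engine when its health check passes, else the cloud one.

    local_is_up is any zero-argument callable returning True or False,
    for example a TCP connect test against the Piper host.
    """
    try:
        healthy = bool(local_is_up())
    except Exception:
        # A health check that crashes is treated the same as "service down".
        healthy = False
    return local_entity if healthy else cloud_entity


# Hypothetical usage with a stubbed health check:
entity = choose_tts_entity(lambda: False, "tts.piper", "tts.cloud_fallback")
```

Keeping the health check as a plain callable means you can swap in whatever test suits your setup without touching the selection logic.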

Conclusion

Integrating Piper via the Wyoming protocol gives Home Assistant users a robust, private, and fast local text-to-speech capability. Whether you're announcing that the laundry cycle has finished, being notified when a door is left open, or adding voice feedback to complex automations, local TTS lets your smart home communicate reliably without depending on external services. The steps and best practices above should give you a solid foundation for local speech generation in Home Assistant.
