Piper: Stop Paying for Text-to-Speech (It’s Free Now)

Here’s the thing: if you’re still sending your homelab’s voice requests to Google, Amazon, or Azure, you’re leaving money on the table and compromising privacy for zero reason. Piper exists. It’s free, runs on CPU alone, and produces speech so natural you’ll stop noticing it’s synthetic. I’ve had it running for six months and I haven’t looked back.

Piper is a neural text-to-speech engine from the Rhasspy project that generates convincing voice output entirely offline. It powers Home Assistant’s local voice responses. The killer feature? It’s fast enough for real-time interactions, supports 20+ languages with multiple voice models per language, and you control it completely. No cloud dependency, no API quotas, no surprise bills.

Why Piper Beats Every Cloud Alternative (and It Costs Nothing)

Let’s be honest: Google Cloud TTS costs money. Azure TTS costs money. Even the “free” tiers have limitations that bite you the moment your homelab actually does something useful. Amazon Polly? $4 per million characters for standard voices, and more for neural ones. That adds up fast when you’re automating voice announcements.

Piper? Zero cost. Zero per-request charges. Zero cloud vendor lock-in. You download it once, run it on hardware you already own, and that’s it. The speech quality is legitimately good: not robotic, not uncanny. I compared it side-by-side with Google Cloud TTS and honestly couldn’t justify the $50/month bill afterward.

It’s stupid fast too. We’re talking sub-second response times on modest CPU hardware. That matters if you’re building interactive voice interfaces, alert announcements, or voice confirmations in Home Assistant. Cloud APIs add network latency on top of processing time. Piper doesn’t.

The real win: Your voice data never leaves your network. That matters if you care about privacy, compliance, or just not handing speech logs to Alphabet.

The Install (It’s Stupidly Easy)

You’ve got three paths: Docker (obviously the right choice), bare metal Python, or just let Home Assistant handle it. I’ll show you the Docker way because it’s clean and portable.

Grab this docker-compose.yml:

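version: '3.8'
services:
  piper:
    image: rhasspy/piper:latest
    container_name: piper-tts
    ports:
      - "10200:10200"
    volumes:
      # Persist downloaded voice models across container restarts
      - ./piper-cache:/home/piper/.cache
    environment:
      - PIPER_MODEL=en_US-lessac-medium
    # List form passes each argument to the container exactly as written
    command:
      - piper-server
      - --voice
      - en_US-lessac-medium
      - --cuda  # GPU only; delete this line on CPU-only hosts
    restart: unless-stopped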

Drop that in a directory, run docker-compose up -d (docker compose up -d on Compose v2), and you’re done. Piper will download the voice model on first run (takes maybe 30 seconds depending on your connection). With this config it listens on port 10200.

The en_US-lessac-medium model is solid for English. Want a different voice? Piper has options. Check the GitHub repo for the full voice list—there’s everything from British English to Spanish, French, German, Japanese, and more.

Note: The --cuda flag in the compose file assumes an NVIDIA GPU that the container can actually see. If you’re running on a regular CPU-only setup (which is totally fine), just remove that line. Piper handles CPU gracefully.

Testing Your Setup (Two Minutes)

Once the container is running, test it with curl:

curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"text": "Your homelab is now speaking"}' \
  http://localhost:10200/api/tts \
  --output test.wav

Play test.wav and verify it sounds good. If you hear speech, you’re golden. If not, check container logs with docker logs piper-tts.
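If you’d rather script that check than eyeball it, here is a minimal Python sketch of the same request using only the standard library. It assumes the endpoint and JSON body from the curl command above (http://localhost:10200/api/tts with a "text" field); the function names are mine, not part of Piper, so adjust them to whatever your instance actually exposes.

```python
import json
import urllib.request

# Assumption: endpoint and payload mirror the curl test above.
PIPER_URL = "http://localhost:10200/api/tts"


def build_request(text: str, url: str = PIPER_URL) -> urllib.request.Request:
    """Build the same JSON POST request that the curl command sends."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def looks_like_wav(data: bytes) -> bool:
    """Check the RIFF/WAVE magic bytes found at the start of every WAV file."""
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def synthesize(text: str, out_path: str = "test.wav", url: str = PIPER_URL) -> int:
    """Request speech, save it to out_path, and return the number of bytes written."""
    with urllib.request.urlopen(build_request(text, url), timeout=30) as response:
        audio = response.read()
    if not looks_like_wav(audio):
        # An error page or JSON error body lands here instead of in a broken file.
        raise ValueError("Server reply is not WAV audio; check the container logs")
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)
```

Calling synthesize("Your homelab is now speaking") writes test.wav just like the curl command, but fails loudly if the server sends back anything other than audio.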

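Pro tip: If you want to control speech rate, Piper exposes a length scale setting (--length_scale on the command line; values above 1 slow the speech down), plus noise settings that change how varied the voice sounds. Don’t count on full SSML markup; check what your Piper version accepts before building on it. Not required for basic use, but handy once you get comfortable.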

Integrating With Home Assistant (Where Piper Shines)

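Home Assistant has native Piper support through the Wyoming Protocol integration, which is set up from the UI rather than configuration.yaml. Go to Settings → Devices & Services → Add Integration, pick Wyoming Protocol, and enter your Piper host and port 10200.

Once it’s added you have a local TTS entity (tts.piper by default). Use it in automations:

service: tts.speak
target:
  entity_id: tts.piper
data:
  media_player_entity_id: media_player.living_room
  message: "Motion detected at the front door"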

That’s it. Your security automation just got a voice. No cloud, no latency excuses, no API limits. This is where Piper becomes genuinely useful in a real homelab.

You can also pipe Piper output through other services. Pair it with Home Assistant’s automations and suddenly you’ve got a smart home that actually talks to you about what’s happening. Combine with Node-RED for more complex flows.

Advanced Moves (Optional But Cool)

Want to get fancy? Piper supports multiple voices simultaneously if you run multiple instances on different ports. Useful if you want different voices for different alerts.
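One way to wire that up on the client side is a small lookup from alert type to instance. This is only a sketch: the second port (10201), the alert categories, and the choice of en_GB-alan-medium for security alerts are illustrative assumptions, and the /api/tts path is carried over from the curl test earlier.

```python
# Hypothetical layout: one Piper instance per voice, each on its own port.
VOICE_INSTANCES = {
    "info": {"voice": "en_US-lessac-medium", "port": 10200},
    "security": {"voice": "en_GB-alan-medium", "port": 10201},
}
DEFAULT_ALERT_TYPE = "info"


def endpoint_for(alert_type: str, host: str = "localhost") -> str:
    """Return the TTS URL for an alert type, falling back to the default voice."""
    instance = VOICE_INSTANCES.get(alert_type, VOICE_INSTANCES[DEFAULT_ALERT_TYPE])
    return f"http://{host}:{instance['port']}/api/tts"
```

Unknown alert types fall back to the default instance, so a typo in an automation still produces speech instead of an error.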

You can also run Piper behind Traefik for reverse proxy access if you’re already using it in your homelab. Just add the labels and route it through your local DNS. Means you can call Piper from any device on your network without exposing ports directly.

The real flex? Combine Piper with a local LLM (via Ollama or LM Studio) for a fully autonomous voice assistant. Generate responses locally, synthesize speech locally, zero cloud dependency. That’s a homelab power move.
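Here is roughly what that pipeline looks like in Python, as a sketch rather than a finished assistant. The Ollama call uses its /api/generate endpoint on the default port 11434 with streaming turned off. The Piper endpoint is the one assumed in the curl test above, "llama3" is a placeholder for whatever model you have pulled, and the function names are mine.

```python
import json
import urllib.request
from typing import Callable, Tuple

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port
PIPER_URL = "http://localhost:10200/api/tts"        # assumed from the curl test


def post_json(url: str, payload: dict, timeout: float = 60.0) -> bytes:
    """POST a JSON payload and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def ask_llm(prompt: str, model: str = "llama3",
            post: Callable[[str, dict], bytes] = post_json) -> str:
    """Get one complete (non-streamed) answer from Ollama."""
    raw = post(OLLAMA_URL, {"model": model, "prompt": prompt, "stream": False})
    return json.loads(raw)["response"].strip()


def speak(text: str, post: Callable[[str, dict], bytes] = post_json) -> bytes:
    """Send text to Piper and return the audio bytes it replies with."""
    return post(PIPER_URL, {"text": text})


def answer_aloud(prompt: str,
                 post: Callable[[str, dict], bytes] = post_json) -> Tuple[str, bytes]:
    """Generate a reply locally, synthesize it locally, and return both."""
    reply = ask_llm(prompt, post=post)
    return reply, speak(reply, post=post)
```

The post parameter exists so you can swap in a fake transport for testing; in normal use you call answer_aloud("What's the weather station reporting?") and write the returned bytes to a .wav file or hand them to a media player.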

Performance note: On a modern CPU (even a Raspberry Pi 4), Piper generates audio faster than real-time. We’re talking 10+ seconds of speech generated in under a second. CPU usage is negligible. Run this on your existing homelab hardware without worry.
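Don’t take my numbers on faith; measure your own box. The usual metric is the real-time factor: synthesis time divided by the length of the audio produced. Ten seconds of speech generated in under a second works out to a factor below 0.1. The sketch below assumes you pass in any function that turns text into WAV bytes (the synthesize argument is a stand-in, not a Piper API).

```python
import io
import time
import wave
from typing import Callable


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the playback length of in-memory WAV data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(synthesize: Callable[[str], bytes], text: str) -> float:
    """Synthesis time divided by audio length; below 1.0 is faster than real time."""
    start = time.perf_counter()
    wav_bytes = synthesize(text)
    elapsed = time.perf_counter() - start
    return elapsed / wav_duration_seconds(wav_bytes)
```

Run it a few times with a sentence of realistic length and ignore the first result, since the initial request may include model loading.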

The Real Talk

Piper isn’t perfect. The voice models aren’t as sophisticated as Google’s neural voices (which cost money because they’re trained on massive datasets). If you absolutely need hyper-realistic speech, cloud services still win. But for 99% of homelab use cases (alerts, announcements, voice confirmations), Piper is more than good enough and costs nothing.

It’s also worth noting that Piper is maintained by the Rhasspy community, not a major corporation with venture funding pressure. The project is stable, well-documented, and actually solves the problem instead of trying to upsell you on enterprise features.

If you’re running Home Assistant, you should absolutely be using Piper. If you’re building any kind of voice interface in your homelab, Piper belongs in your stack. Stop paying cloud vendors for something your own hardware can handle. Grab it, run it, integrate it, and never think about TTS again.
