Building a private voice assistant on a Raspberry Pi 5

I wanted a voice assistant that didn't phone home for everything. The compromise: keep wake-word, STT, and TTS local; let an LLM handle the actual reasoning over a cheap API.

The pipeline

Stage	Tool	Latency
Wake word	OpenWakeWord	~100 ms
STT	faster-whisper (small.en)	600–800 ms
LLM	Groq (Llama 3.3 70B)	300–500 ms
TTS	Piper (lessac voice)	200–300 ms
Total		~1.2–1.8 s
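
To make the latency budget concrete, here is a minimal sketch of how the three post-wake stages might be chained and timed. The stage functions (transcribe, ask_llm, synthesize) are hypothetical placeholders passed in as plain callables; they stand in for faster-whisper, the Groq call, and Piper, and are not those libraries' real APIs.

```python
import time
from typing import Callable, Dict, Tuple


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],   # hypothetical STT wrapper
    ask_llm: Callable[[str], str],        # hypothetical LLM wrapper
    synthesize: Callable[[str], bytes],   # hypothetical TTS wrapper
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[bytes, Dict[str, float]]:
    """Run STT -> LLM -> TTS once and record per-stage latency in milliseconds."""
    timings: Dict[str, float] = {}

    def timed(name, fn, arg):
        # Time a single stage and store the elapsed milliseconds under its name.
        start = clock()
        result = fn(arg)
        timings[name] = (clock() - start) * 1000.0
        return result

    text = timed("stt", transcribe, audio)
    reply = timed("llm", ask_llm, text)
    speech = timed("tts", synthesize, reply)
    timings["total"] = timings["stt"] + timings["llm"] + timings["tts"]
    return speech, timings
```

Logging the returned timings on every turn is one way to check whether a given setup actually lands in the ranges from the table.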

Wake word, in 30 lines

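from openwakeword.model import Model

# Load the custom wake-word model; scores are keyed by the model's file name.
model = Model(wakeword_models=["hey_eddy.tflite"])
THRESHOLD = 0.55  # detection cutoff, tuned for a noisy room

def trigger_listen():
    """Placeholder: hand control to the STT stage."""

def on_audio_chunk(chunk):
    # chunk: 16 kHz, 16-bit PCM samples as a NumPy array
    pred = model.predict(chunk)
    if pred["hey_eddy"] > THRESHOLD:
        trigger_listen()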

The threshold matters. Too high and it misses you. Too low and your dishwasher wakes it up. 0.55 has been the sweet spot for me in a noisy room.
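
One way to pick the threshold less by feel is to record detector scores for clips where you said the wake word and clips with only background noise, then measure both failure modes at each candidate value. This is a rough sketch under that assumption; the function name and the two score lists are hypothetical, and it uses the same "score > threshold" rule as the snippet above.

```python
from typing import Iterable, Tuple


def threshold_tradeoff(
    wake_scores: Iterable[float],
    noise_scores: Iterable[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (miss_rate, false_trigger_rate) for one threshold.

    wake_scores: detector scores on clips where the wake word was spoken.
    noise_scores: detector scores on clips with background noise only.
    """
    wake = list(wake_scores)
    noise = list(noise_scores)
    if not wake or not noise:
        raise ValueError("need at least one score in each set")
    # A wake clip is missed when its score does not exceed the threshold.
    misses = sum(1 for s in wake if s <= threshold)
    # A noise clip is a false trigger when its score does exceed it.
    false_triggers = sum(1 for s in noise if s > threshold)
    return misses / len(wake), false_triggers / len(noise)
```

Calling it for a handful of thresholds between, say, 0.3 and 0.8 shows where misses start to climb and where the dishwasher stops triggering it.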

Barge-in is the killer feature

Letting the user interrupt the TTS mid-sentence is what makes it feel natural. The trick: keep the mic open during playback and watch the wake-word detector at a lower threshold. When it fires, kill audio output and start listening immediately.
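
The barge-in logic boils down to a tiny state machine. The sketch below is one possible shape for it, not the exact code I run: the class name, the callbacks, and the 0.40 playback threshold are all assumptions for illustration.

```python
from typing import Callable


class BargeInController:
    """Tracks whether TTS is playing and picks the wake-word threshold to match."""

    def __init__(
        self,
        stop_playback: Callable[[], None],    # hypothetical: kills audio output
        start_listening: Callable[[], None],  # hypothetical: starts the STT stage
        idle_threshold: float = 0.55,
        playback_threshold: float = 0.40,     # assumed value, tune for your setup
    ) -> None:
        self._stop_playback = stop_playback
        self._start_listening = start_listening
        self._idle_threshold = idle_threshold
        self._playback_threshold = playback_threshold
        self.playing = False

    def playback_started(self) -> None:
        self.playing = True

    def playback_finished(self) -> None:
        self.playing = False

    def on_wake_score(self, score: float) -> bool:
        """Feed one detector score; return True if it triggered listening."""
        threshold = self._playback_threshold if self.playing else self._idle_threshold
        if score <= threshold:
            return False
        if self.playing:
            self._stop_playback()  # cut the TTS audio before listening
            self.playing = False
        self._start_listening()
        return True
```

The TTS player would call playback_started() and playback_finished() around each utterance, and the audio callback would feed every wake-word score into on_wake_score().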

What I'd do differently

Skip OpenWakeWord and use Picovoice Porcupine. The accuracy difference is real, even if it costs.
Cache common LLM responses locally with a 24h TTL, and answer "What time is it" straight from the system clock. Neither should hit the API.
Add a hardware mute switch. When friends are over, I want a one-touch off.
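
The response cache from the list above could be as simple as an in-memory dictionary with timestamps. This is a minimal sketch, assuming exact-match lookups on a normalized prompt; the class and method names are hypothetical. Answers that change by the minute, like the current time, are better computed on the device than cached.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory prompt -> response cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float = 24 * 60 * 60,  # 24h, as suggested above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def _key(prompt: str) -> str:
        # Normalize case and whitespace so trivially different prompts match.
        return " ".join(prompt.lower().split())

    def put(self, prompt: str, response: str) -> None:
        self._entries[self._key(prompt)] = (self._clock(), response)

    def get(self, prompt: str) -> Optional[str]:
        key = self._key(prompt)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return response
```

On each turn, call get() with the transcribed text first and only fall through to the API (followed by put()) on a miss.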

The 8 GB Pi 5 has enough headroom that you don't really feel the load. CPU usage hovers around 25% during a conversation.
