Welcome to Sudo

Sudo is a voice-first home assistant for families. A small device (a Raspberry Pi) sits in someone's home; the people there — including kids — talk to it, message it on WhatsApp, or chat with it in a browser. Behind all three of those doorways is the same AI agent, with memory, that can take real actions like controlling the home.

This wiki is for anyone who just joined and wants to understand how the whole thing fits together — not just one corner of it. You do not need to have seen the code yet.

If you read nothing else, read this page.

Everything below is the 60-second mental model. The rest of the wiki zooms in on each box. Come back here whenever you feel lost.

The 60-second mental model

There are exactly three ways to reach the assistant ("surfaces"), and they all funnel into the same brain (a per-user AI agent called hermes), which talks to a large language model to think.

Three surfaces converge on one per-user hermes agent

The clever part: all three surfaces share one agent and one memory per family account. Ask something by voice at home, then later ask a follow-up on WhatsApp from the office — it remembers, because it is literally the same agent.
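One way to picture "one agent and one memory per family account" is a lookup that always returns the same object for the same account, whichever surface the message came from. The sketch below is illustrative only: Agent, AgentRegistry, agent_for, and the "family-42" account id are hypothetical names made up for this example, not the real hermes code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for one family's hermes agent."""

    account_id: str
    memory: list[str] = field(default_factory=list)

    def handle(self, surface: str, text: str) -> str:
        # Every turn is appended to the same memory, whichever surface sent it.
        self.memory.append(f"[{surface}] {text}")
        return f"(turn {len(self.memory)} for {self.account_id})"


class AgentRegistry:
    """Maps a family account to its single agent (assumed design, for illustration)."""

    def __init__(self) -> None:
        self._agents: dict[str, Agent] = {}

    def agent_for(self, account_id: str) -> Agent:
        # Create the agent on first contact, then always return the same one.
        if account_id not in self._agents:
            self._agents[account_id] = Agent(account_id)
        return self._agents[account_id]


registry = AgentRegistry()
registry.agent_for("family-42").handle("voice", "Turn on the porch light")
registry.agent_for("family-42").handle("whatsapp", "Is it still on?")

shared = registry.agent_for("family-42")
print(shared.memory)
```

Because both turns resolve to the same Agent object, the WhatsApp follow-up sees what was said by voice. A different account id would get its own agent and its own memory.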

Who runs where

Two physical places matter:

  • The home — a Raspberry Pi runs a small program we wrote called sudoedge. It captures the microphone, plays audio back, and handles first-time setup. That's it; the thinking does not happen here.
  • The cloud — a single VPS at sudohomes.com runs everything else: the web API, the voice plumbing, and a separate AI-agent container per family.

sudoedge runs in the home; everything else runs on the VPS

How the work actually flows

For a voice turn: the Pi streams your microphone audio to the cloud. There, voice-bridge (part of the voice plumbing on the VPS) turns the speech into text, hands the text to hermes (your family's agent), gets a reply, turns that reply back into speech, and streams it to the Pi's speaker. Chat and WhatsApp follow the same path without the speech steps.
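The flow above can be sketched as a text turn wrapped in two speech steps. This is a minimal sketch under stated assumptions: speech_to_text, text_to_speech, hermes_reply, text_turn, and voice_turn are hypothetical stubs invented for illustration, and the "audio" here is just UTF-8 bytes standing in for a real audio stream.

```python
from typing import Callable


def speech_to_text(audio: bytes) -> str:
    # Stub (assumption): pretend the audio bytes already contain the transcript.
    return audio.decode("utf-8")


def text_to_speech(text: str) -> bytes:
    # Stub (assumption): pretend encoding the text produces speaker audio.
    return text.encode("utf-8")


def hermes_reply(text: str) -> str:
    # Stub (assumption): the real agent would consult memory and an LLM.
    return f"You said: {text}"


def text_turn(text: str, agent: Callable[[str], str] = hermes_reply) -> str:
    """Chat and WhatsApp: text in, text out."""
    return agent(text)


def voice_turn(mic_audio: bytes, agent: Callable[[str], str] = hermes_reply) -> bytes:
    """Voice: the same text turn, with a speech step on each side."""
    transcript = speech_to_text(mic_audio)
    reply = text_turn(transcript, agent)
    return text_to_speech(reply)


speaker_audio = voice_turn(b"what time is it")
print(speaker_audio)
```

The point of the shape is that voice_turn calls text_turn rather than duplicating it, which mirrors the idea that all surfaces reach the same agent and differ only in how input arrives and output leaves.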

The agent doesn't just reply — it can remember, schedule things for later (cron), reach back out to you proactively (e.g. ping you on WhatsApp), and call tools (like controlling your smart home).
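As a rough illustration of what "calling a tool" can mean, the sketch below keeps a table of named functions and dispatches to one by name. Everything in it is an assumption made for the example: set_light, TOOLS, and call_tool are hypothetical, and a real tool would talk to an actual smart-home system rather than return a string.

```python
from __future__ import annotations

from typing import Callable


def set_light(room: str, on: bool) -> str:
    # Hypothetical tool: a real one would call a smart-home API here.
    return f"{room} light {'on' if on else 'off'}"


# Table of tools the agent is allowed to call, keyed by name (assumed design).
TOOLS: dict[str, Callable[..., str]] = {"set_light": set_light}


def call_tool(name: str, **kwargs) -> str:
    # Refuse names that are not registered instead of guessing.
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)


result = call_tool("set_light", room="kitchen", on=True)
print(result)
```

Keeping tools in an explicit table means the agent can only trigger actions that were deliberately registered, which matters when the actions affect a real home.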