Architecture overview
This is the real topology: the 60-second model with the actual service
names, ports, and arrows filled in. The canonical version lives in ARCHITECTURE.md in
the repo root; this page is the annotated tour.
The one big idea
Sudo terminates everything at one VPS, and spawns a separate agent container for each
family account on demand. There is no shared agent. sudo-api is both the web server
and the thing that runs docker run to create those per-user containers; it talks to
the host's docker socket directly. (Older docs mention a separate sudo-provisioner
service; that was inlined into sudo-api during the pivot.)
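As a rough illustration of what "sudo-api runs docker run" could look like, here is a minimal sketch that only builds the command line for one family's container. The function names, the `/data` and `/plugins` mount targets, and the volume naming are assumptions made for the example; only the `hermes-user-<uuid-hex>` name pattern, the bind-mounted plugins, and the `gateway run` command come from this page.

```python
import uuid


def container_name(user_id: uuid.UUID) -> str:
    """Per-user container name, following the hermes-user-<uuid-hex> pattern."""
    return f"hermes-user-{user_id.hex}"


def build_run_argv(user_id: uuid.UUID, image: str, plugin_dirs: list[str]) -> list[str]:
    """Build a `docker run` argv for one family's agent container.

    The /data and /plugins targets are hypothetical, not the project's real mounts.
    """
    name = container_name(user_id)
    argv = [
        "docker", "run", "--detach",
        "--name", name,
        # One named volume per family keeps memory, sessions, and skills isolated.
        "--volume", f"{name}-data:/data",
    ]
    for host_dir in plugin_dirs:
        plugin = host_dir.rstrip("/").split("/")[-1]
        # Plugins are bind-mounted read-only so the upstream image stays unmodified.
        argv += ["--volume", f"{host_dir}:/plugins/{plugin}:ro"]
    # The image reference goes before the container command.
    argv += [image, "gateway", "run"]
    return argv
```

On a host where the docker CLI can reach the socket, the resulting list could be handed to `subprocess.run(argv, check=True)`; the real provisioner may well use a Docker SDK instead.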
Full topology
Legend: 🟦 long-lived container · 🟧 per-user container · 🟩 native process · 🟪 external service.
The services, one line each
Long-lived (declared in compose.prod.yaml):
| Service | Image | What it does |
|---|---|---|
| sudo-api | build: cloud/api/Dockerfile | The website, the /v1/* API, and the inline provisioner that spawns per-user hermes via the docker socket. |
| livekit-server | livekit/livekit-server | WebRTC media server. One room_<user_id> per user; Pi + voice-bridge join the same room. |
| voice-bridge | build: cloud/voice_bridge/Dockerfile | livekit-agents worker: STT → hermes → TTS. |
| grafana / loki / promtail | off-the-shelf | Dashboards, log storage, log shipping. See Observability. |
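Because the Pi and voice-bridge must land in the same LiveKit room, both sides need to derive the room name from the user ID in the same way. A minimal sketch of such a shared helper, assuming the user ID is available as a plain string (the function names are hypothetical; only the `room_<user_id>` pattern comes from the table above):

```python
ROOM_PREFIX = "room_"


def room_name(user_id: str) -> str:
    """Room name for one user, following the room_<user_id> pattern."""
    return f"{ROOM_PREFIX}{user_id}"


def user_id_from_room(room: str) -> str:
    """Inverse of room_name; raises ValueError for rooms that do not match."""
    if not room.startswith(ROOM_PREFIX) or len(room) == len(ROOM_PREFIX):
        raise ValueError(f"not a per-user room: {room!r}")
    return room[len(ROOM_PREFIX):]
```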
Per-user (spawned at runtime):
| Service | Image | What it does |
|---|---|---|
| hermes-user-<uuid-hex> | nousresearch/hermes-agent:v2026.5.7 (retagged) | The agent for one family. Runs gateway run, exposing four platform ports. Spawned on first use, kept warm by the Pi's heartbeat, reaped when idle. |
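The "kept warm by the Pi's heartbeat, reaped when idle" lifecycle can be pictured as a small registry of last-seen timestamps. The sketch below is an illustration only: the class name, its methods, and the 600-second timeout are assumptions, and the real reaper would also have to stop and remove the container.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WarmRegistry:
    """Tracks the last heartbeat per container and selects idle ones to reap."""

    idle_ttl_s: float = 600.0  # hypothetical idle timeout, not the real value
    last_seen: dict[str, float] = field(default_factory=dict)

    def heartbeat(self, name: str, now: Optional[float] = None) -> None:
        """Record that a heartbeat arrived for this container."""
        self.last_seen[name] = time.monotonic() if now is None else now

    def idle(self, now: Optional[float] = None) -> list[str]:
        """Names whose last heartbeat is older than the idle timeout."""
        now = time.monotonic() if now is None else now
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.idle_ttl_s
        )

    def reap(self, now: Optional[float] = None) -> list[str]:
        """Forget idle containers and return their names for teardown."""
        names = self.idle(now)
        for name in names:
            del self.last_seen[name]
        return names
```

Passing `now` explicitly keeps the logic testable without sleeping; in production the default monotonic clock would be used.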
Native (not containers):
| Process | Where | Why native |
|---|---|---|
| caddy | VPS host | TLS + host-network reverse proxy; simpler as a system service. |
| sudoedge | the Pi | Needs direct mic/speaker access. |
Why it's shaped this way
A few deliberate choices that surprise people:
- One container per family, not per request. Gives each family isolated memory, sessions, and skills on its own volume, and lets the agent be stateful and proactive.
- We don't fork hermes. The image is upstream, unmodified. All our customization is (a) three bind-mounted plugins and (b) a boot script that writes config. This keeps us on the upstream upgrade path. See Per-user provisioning.
- sudo-api is the provisioner. No separate orchestration service; it just has the docker socket mounted. Simpler to operate, one fewer moving part.
- Caddy is on the host, not in compose. It owns :443 and TLS certs and is managed by systemd. It's deployed by a separate, tightly-scoped path. See Deploy.
Next: the three surfaces explains how voice, chat, and WhatsApp each reach the agent.