Per-user provisioning

Each family gets its own hermes container. This page explains how that container is born, configured, kept alive, and reaped. The logic lives inside sudo-api at cloud/api/provisioner.py — there is no separate provisioner service.

Why per-user containers at all

The agent needs to be stateful: persistent memory, ongoing sessions, scheduled crons, and per-family skills. Giving each family its own container with its own data volume keeps all of that isolated and durable, and lets the agent be proactive (it has a long-lived home to run crons in). The cost is orchestration — which is exactly what the provisioner handles.

The lifecycle

Figure: Per-user hermes lifecycle, from trigger to healthy.

1. Trigger

Anything that needs the agent calls ensure_runtime(user_id) first: a voice turn (from voice-bridge), a chat send, a WhatsApp inbound, or even just opening /integrations.

2. Spawn

If the container isn't running, _spawn() issues a docker run with:

  • A named volume hermes-user-<uuid-hex>-data mounted at /opt/data — this is where sessions, memories, and skills survive restarts.
  • A read-only bind-mount of our plugins directory at /opt/data/plugins.
  • An entrypoint override to /bin/sh -c <_SEED_AND_EXEC> — our boot script.
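As a rough illustration of the three flags above, the argument list for docker run could be assembled like this. This is a sketch, not the code in provisioner.py: the image tag, the host plugins path, and the script body are placeholders, and build_docker_run_argv is a hypothetical helper name.

```python
import uuid

# Placeholder values, assumed for illustration only.
IMAGE = "hermes:latest"
PLUGINS_DIR = "/srv/sudo/plugins"
SEED_AND_EXEC = "echo seeding; exec true"  # stand-in for the real boot script


def build_docker_run_argv(user_id: str) -> list[str]:
    """Build a `docker run` argv with the data volume, plugins mount, and entrypoint override."""
    name = f"hermes-user-{uuid.UUID(user_id).hex}"
    return [
        "docker", "run", "-d",
        "--name", name,
        # Named volume: sessions, memories, and skills survive restarts.
        "-v", f"{name}-data:/opt/data",
        # Read-only bind-mount of the plugins directory.
        "-v", f"{PLUGINS_DIR}:/opt/data/plugins:ro",
        # Override the entrypoint at run time only, never in the image.
        "--entrypoint", "/bin/sh",
        IMAGE,
        # Everything after the image is passed to /bin/sh.
        "-c", SEED_AND_EXEC,
    ]
```

Passing the result to subprocess.run(argv, check=True) would start the container. Building the list separately keeps the flags easy to inspect and test.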

Container naming

Containers are named hermes-user-<32-char uuid hex> (uuid.UUID(user_id).hex, no dashes). Any hermes-user-usr_* containers on the VPS are orphans from an old deploy and are safe to docker rm -f.
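A minimal sketch of the naming rule and of telling current containers from legacy orphans. The helper names are hypothetical; only uuid.UUID(user_id).hex and the two name shapes come from the text above.

```python
import re
import uuid

# Current scheme: "hermes-user-" followed by exactly 32 lowercase hex characters.
_CURRENT_NAME = re.compile(r"^hermes-user-[0-9a-f]{32}$")


def container_name(user_id: str) -> str:
    """Derive the container name from a user UUID (dashes removed)."""
    return f"hermes-user-{uuid.UUID(user_id).hex}"


def is_current_name(name: str) -> bool:
    """True if the name matches the current naming scheme."""
    return _CURRENT_NAME.match(name) is not None


def is_legacy_orphan(name: str) -> bool:
    """True for hermes-user-usr_* names left over from an old deploy."""
    return name.startswith("hermes-user-usr_")
```

For example, container_name("12345678-1234-5678-1234-567812345678") yields hermes-user-12345678123456781234567812345678.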

3. Seed config on every boot

_SEED_AND_EXEC is the heart of "we don't fork hermes." On every start it:

  1. Writes a Sudo-managed /opt/data/config.yaml (overwriting user edits; we are intentionally opinionated). This enables our platforms (api_server, twilio_whatsapp, sudo_chat, sudo_voice), the toolsets, and memory settings, with the LLM provider pulled from global_settings.llm_provider.
  2. Seeds /opt/data/memories/MEMORY.md with the account identity block.
  3. Heals root-owned .env files from older builds; purges the legacy Baileys WhatsApp dir.
  4. Execs upstream's tini and entrypoint with gateway run.

Don't override ENTRYPOINT in the Dockerfile

Upstream's entrypoint activates the venv that puts hermes on $PATH. We override only at docker run time (--entrypoint /bin/sh) and then call upstream's entrypoint from the wrapper. Overriding it in the image breaks the agent.

4. Wait for healthy

ensure_runtime() blocks while polling GET /health, for up to 120s by default, so callers only get the URL once the agent can actually serve.
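The wait step can be sketched as a poll loop with a deadline. This is an illustration under assumptions: the 1s poll interval, the 2s per-request timeout, treating only HTTP 200 as healthy, and the function names are not taken from provisioner.py.

```python
import http.client
import time
import urllib.request


def http_probe(base_url: str) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=2) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a non-2xx response: not healthy yet.
        return False


def wait_healthy(base_url, timeout_s=120.0, interval_s=1.0,
                 probe=http_probe, sleep=time.sleep) -> bool:
    """Poll the probe until it succeeds or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(base_url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

The probe and sleep parameters are injectable so the loop can be exercised without a running container.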

Keeping it warm, and reaping it

Heartbeat keeps the agent warm

  • A paired, powered Pi sends a heartbeat roughly every 30s. Each heartbeat touches last_active_at and makes a fire-and-forget call to ensure_runtime, so in practice the agent stays warm whenever the home device is on.
  • The idle reaper stops containers idle for RUNTIME_IDLE_SEC (default 1800s). In practice this only bites when the box is off or unpaired.
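The reaper's selection rule can be sketched as a pure function over last_active_at timestamps. The function name, the dict shape, and the inclusive comparison at exactly RUNTIME_IDLE_SEC are assumptions made for illustration.

```python
RUNTIME_IDLE_SEC = 1800  # default from the text above


def idle_user_ids(last_active_at: dict[str, float], now: float,
                  idle_sec: float = RUNTIME_IDLE_SEC) -> list[str]:
    """Return user ids whose last activity is at least idle_sec seconds old."""
    return sorted(
        uid for uid, ts in last_active_at.items()
        if now - ts >= idle_sec
    )
```

With a heartbeat every 30s or so, a user's age never approaches 1800s, which is why the reaper only matters once heartbeats stop.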

Config changes are fleet-wide

When an admin saves new LLM credentials at /admin/settings, restart_all_runtimes() does a docker rm -f + respawn + warm-up probe for every active container, so the new key takes effect immediately across all families. LLM config is global and admin-only — there is no per-user key. See Auth model and Secrets.
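The remove, respawn, and probe sequence might look like the following sketch. The three callables are hypothetical stand-ins for the docker and health-check operations, and running users sequentially is an assumption; the real function may parallelize.

```python
from typing import Callable, Iterable


def restart_all(active_user_ids: Iterable[str],
                remove: Callable[[str], None],
                spawn: Callable[[str], None],
                probe: Callable[[str], bool]) -> dict[str, bool]:
    """Recreate every active runtime and report which ones came back healthy."""
    results: dict[str, bool] = {}
    for uid in active_user_ids:
        remove(uid)                # e.g. docker rm -f <container>
        spawn(uid)                 # fresh container picks up the new config on boot
        results[uid] = probe(uid)  # warm-up probe
    return results
```

Because the config is rewritten on every boot, recreating the container is enough for the new credentials to take effect.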

Where to look in the code

Function                 Role
ensure_runtime()         The entry point everything calls. Spawn-if-needed + wait-healthy.
_spawn()                 The actual docker run with volume, plugins, entrypoint override.
_SEED_AND_EXEC           The boot script (a shell string) injected as the entrypoint.
api_key_for()            Derives the per-user API_SERVER_KEY (HMAC, no storage).
evict_idle()             The idle reaper.
restart_all_runtimes()   Fleet-wide recreate on admin config save.

All in cloud/api/provisioner.py.
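To illustrate the "HMAC, no storage" idea behind api_key_for(), here is one way such a key could be derived. The hash algorithm (SHA-256), the hex encoding, the message format, and the function and parameter names are all assumptions; the source only states that the key is HMAC-derived and not stored.

```python
import hashlib
import hmac


def derive_api_key(server_secret: bytes, user_id: str) -> str:
    """Derive a stable per-user key from a server-side secret.

    The same inputs always give the same key, so nothing needs to be stored.
    """
    return hmac.new(server_secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

Both sides can recompute the key on demand: the provisioner when it injects API_SERVER_KEY into the container, and the API when it authenticates calls to that container.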