The three surfaces
Everything a user can do passes through one of three doorways. They look different on the outside but converge on the same per-user hermes agent. This page is the map; each surface has its own deep-dive under The surfaces.
Reactive vs proactive: the key distinction
Every surface works in two directions, and it's worth getting this straight early:
- Reactive: a user sends something; the agent replies. (You talk, it answers.)
- Proactive: the agent starts the conversation. A scheduled cronjob(deliver=…) fires, or the agent decides to send_message(target=…). (It pings you.)
Because hermes treats each surface as a first-class platform, the agent can choose which surface to speak out of. It might answer a WhatsApp question by WhatsApp, but deliver a morning reminder by voice on the Pi.
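The two directions can be modelled in a few lines. This is only a sketch: `Surface`, `OutboundMessage`, `reply_to`, and `deliver_proactively` are hypothetical names invented for illustration and are not part of hermes. It also assumes a reactive reply goes back out on the surface it arrived on, which is the common case but not something the agent is forced to do.

```python
from dataclasses import dataclass
from enum import Enum


class Surface(Enum):
    VOICE = "voice"
    CHAT = "chat"
    WHATSAPP = "whatsapp"


@dataclass(frozen=True)
class OutboundMessage:
    surface: Surface
    text: str
    proactive: bool


def reply_to(inbound_surface: Surface, text: str) -> OutboundMessage:
    """Reactive: the user spoke first, so answer on the surface they used."""
    return OutboundMessage(surface=inbound_surface, text=text, proactive=False)


def deliver_proactively(target: Surface, text: str) -> OutboundMessage:
    """Proactive: nobody asked, so the caller has to name the target surface."""
    return OutboundMessage(surface=target, text=text, proactive=True)


# A WhatsApp question answered by WhatsApp, and a morning reminder spoken on the Pi.
answer = reply_to(Surface.WHATSAPP, "Your train leaves at 08:12.")
reminder = deliver_proactively(Surface.VOICE, "Good morning, the bins go out today.")
```

The asymmetry is the point: a reactive turn carries its return address with it, while a proactive one needs an explicit target, which is the role deliver=… and target=… play in the real tools.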
At a glance
| | Voice | Chat | WhatsApp |
|---|---|---|---|
| User device | Pi in the home | Web browser | Phone (WhatsApp app) |
| Transport in | WebRTC (LiveKit) → voice-bridge | POST /v1/me/chat/turn | Twilio webhook → sudo-api |
| Transport out | TTS audio over LiveKit | SSE stream to browser | Twilio Messages REST |
| Reactive path hits | api_server :8642 (via voice-bridge) | sudo_chat plugin :8652 | twilio_whatsapp plugin :8651 |
| Proactive path | sudo_voice plugin → voice-bridge :18087 | sudo_chat → SSE fan-out | twilio_whatsapp → Twilio REST |
| Who authenticates the user | LiveKit token + device JWT | Supabase JWT | X-Twilio-Signature + phone lookup |
| Deep dive | Voice | Chat | WhatsApp |
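To make the WhatsApp authentication row concrete, here is a minimal standard-library sketch of the X-Twilio-Signature check, following Twilio's documented scheme for form-encoded webhooks (HMAC-SHA1 over the full URL plus the sorted POST parameters, base64-encoded). The function names, URL, and token below are placeholders; the actual sudo-api handler and the phone lookup that follows it are not shown on this page.

```python
import base64
import hashlib
import hmac


def twilio_signature(auth_token: str, url: str, params: dict[str, str]) -> str:
    """Compute the value Twilio would send in X-Twilio-Signature for a form POST."""
    # Twilio signs the full request URL followed by every POST parameter's
    # name and value, concatenated with no delimiters, in parameter-name order.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(auth_token: str, url: str, params: dict[str, str], header_value: str) -> bool:
    """Compare the expected signature with the header in constant time."""
    expected = twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))


# Placeholder values for illustration only.
token = "test-auth-token"
webhook_url = "https://example.invalid/webhooks/whatsapp"
form = {"From": "whatsapp:+15550001111", "Body": "hello"}

header = twilio_signature(token, webhook_url, form)
accepted = is_valid_twilio_request(token, webhook_url, form, header)
rejected = is_valid_twilio_request(token, webhook_url, {**form, "Body": "tampered"}, header)
```

In a real service, Twilio's helper library provides the same check as `RequestValidator`, which is usually preferable to a hand-rolled version. Only once the signature passes does it make sense to map the From number to a user.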
How each one reaches the agent (reactive)
The middle three steps are identical in spirit across surfaces; only the first and last "shapes" differ (audio vs SSE vs WhatsApp message). That's the whole trick — one brain, three skins.
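The "one brain, three skins" idea can be sketched as three thin adapters around a single agent call. Everything here is illustrative: `toy_agent` stands in for the per-user hermes agent, the handler names are hypothetical, and the chat request field `message` is an assumption (the real body of POST /v1/me/chat/turn is not shown on this page). `From` and `Body` are the standard Twilio webhook form fields.

```python
from typing import Callable

# (user_id, text) -> reply text. A stand-in for the per-user agent.
Agent = Callable[[str, str], str]


def toy_agent(user_id: str, text: str) -> str:
    return f"[{user_id}] you said: {text}"


def handle_voice(agent: Agent, user_id: str, transcript: str) -> dict:
    """Voice skin: a transcript goes in, text to be synthesised comes out."""
    return {"kind": "tts", "text": agent(user_id, transcript)}


def handle_chat(agent: Agent, user_id: str, body: dict) -> list[str]:
    """Chat skin: a JSON body goes in, SSE frames come out."""
    reply = agent(user_id, body["message"])
    # One single-line reply becomes one SSE event; multi-line text would
    # need a separate "data:" line per line of text.
    return [f"data: {reply}\n\n"]


def handle_whatsapp(agent: Agent, user_id: str, form: dict) -> dict:
    """WhatsApp skin: a Twilio webhook form goes in, a message to send comes out."""
    return {"To": form["From"], "Body": agent(user_id, form["Body"])}


voice_out = handle_voice(toy_agent, "u1", "hi")
chat_out = handle_chat(toy_agent, "u1", {"message": "hi"})
whatsapp_out = handle_whatsapp(toy_agent, "u1", {"From": "whatsapp:+15550001111", "Body": "hi"})
```

Each handler differs only in how it unwraps the inbound shape and wraps the reply; the call to the agent in the middle is the same line every time.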
Why plugins instead of forking hermes
hermes has a documented plugin path: drop code in /opt/data/plugins/ and the loader
discovers it. We bind-mount three plugins into every per-user container:
- sudo_chat: the browser chat adapter.
- sudo_voice: the proactive-voice adapter (lets the agent speak unprompted).
- twilio_whatsapp: the WhatsApp adapter.
Because each is a real hermes platform, the agent gets native send_message(target=…)
and cronjob(deliver=…) for it, with no special-casing in our code. See
Adding a platform plugin for how this
extension point works.
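For readers unfamiliar with directory-based discovery, the general pattern looks roughly like the sketch below. This is not hermes's loader, whose internals are not covered on this page; `discover_plugins` and the `PLATFORM` attribute are hypothetical, and a temporary directory stands in for /opt/data/plugins/.

```python
import importlib.util
import tempfile
from pathlib import Path


def discover_plugins(plugins_dir: Path) -> dict:
    """Import every subdirectory of plugins_dir that contains an __init__.py."""
    loaded = {}
    for entry in sorted(plugins_dir.iterdir()):
        init_file = entry / "__init__.py"
        if not init_file.is_file():
            continue  # skip stray files and non-package directories
        spec = importlib.util.spec_from_file_location(entry.name, init_file)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        loaded[entry.name] = module
    return loaded


# Demo: a throwaway directory playing the role of the bind-mounted plugin folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("sudo_chat", "sudo_voice", "twilio_whatsapp"):
        (root / name).mkdir()
        (root / name / "__init__.py").write_text(f"PLATFORM = {name!r}\n")
    plugins = discover_plugins(root)

print(sorted(plugins))  # ['sudo_chat', 'sudo_voice', 'twilio_whatsapp']
```

The appeal of this shape is that adding a surface means adding a directory, not editing the host program, which is why bind-mounting plugins is enough and no fork is needed.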
Next: A request, end to end traces one voice turn through every hop with the real endpoints.