Channel substrate
What it does
Customers reach Ratiba through five surfaces — WhatsApp messages, voice calls, the embedded web widget, Instagram DMs, and Messenger DMs — plus one outbound-only SMS notification sink for reminder fallback. The substrate's job is to make those five surfaces look the same to the conversation FSM.
The thesis ADR-0009 commits to: channels are I/O adapters with capability flags and identity-resolution semantics, not first-class branches in the FSM. There is exactly one booking graph, one cancel graph, one reschedule graph; the channel an inbound message arrived on is metadata on the state, never a branch in the LangGraph compile.
Each adapter declares five capability flags (tier, identity_anchor, session_window_seconds, supports_payment_inline, supports_rich_media), and the substrate routes per-channel runtime behaviour off those flags. Tier-1 (WhatsApp + voice) anchors identity on phone metadata that the channel itself supplies: the customer's phone is known the instant the webhook lands or the SIP leg picks up. Tier-2 (web + Instagram + Messenger) anchors identity on a session token (cookie or PSID) and captures phone progressively via a COLLECT_PHONE FSM entry-state, since none of those surfaces hand you a verified phone for free. SMS is not a channel; it is a one-way NotificationSink (Africa's Talking) used only for reminder fallback when a Tier-2 customer is outside their 24h session window. Same dispatcher, five channels plus one sink, one FSM.
How it fits in the system
The dotted arrows are external-provider edges. The Meta Cloud API serves three of the five channels — WhatsApp, Instagram, and Messenger share a single inbound webhook envelope shape and a single Graph API outbound surface (per ADR-0008 for WhatsApp and ADR-0009 D9 for the IG/Messenger extension). LiveKit handles voice; Africa's Talking is the SMS sink only — never an inbound channel.
How it flows
The seam is the dispatch_inbound_message call: above it, every adapter is doing channel-specific I/O (HMAC verification, session cookies, SIP audio frames); below it, the FSM is uniform. The reply travels back the same way — the FSM emits plain text, and the adapter renders it per its channel's idiom (WhatsApp gets asterisk-bold, voice gets TTS audio, the web widget gets markdown).
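The seam can be pictured with a minimal sketch. Everything here is illustrative: the InboundMessage envelope, its field names, the ChannelKind members other than WHATSAPP and WEB, and the stub dispatcher body are assumptions, not the real signature in dispatcher.py.

```python
from dataclasses import dataclass
from enum import Enum


class ChannelKind(Enum):
    # WHATSAPP and WEB appear in the log lines; the other members are assumed.
    WHATSAPP = "whatsapp"
    VOICE = "voice"
    WEB = "web"
    INSTAGRAM = "instagram"
    MESSENGER = "messenger"


@dataclass(frozen=True)
class InboundMessage:
    """Hypothetical uniform envelope an adapter hands to the dispatcher."""
    channel: ChannelKind  # metadata only; the FSM never branches on it
    sender_key: str       # phone, session cookie, or PSID
    text: str


def dispatch_inbound_message(msg: InboundMessage) -> str:
    """Stand-in for the real dispatcher: returns a plain-text reply.

    The reply depends on the text alone, which is the point of the seam.
    """
    return f"You said: {msg.text}"
```

Two adapters that normalise to the same text get the same reply, whatever surface the message arrived on.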
Capability flags in detail
Every concrete adapter class sets five class-level flags. The substrate reads these at runtime — they control routing, identity lookup, and session-window enforcement without a single if channel == X branch in the FSM.
| Flag | Type | Meaning |
|---|---|---|
| tier | Tier enum | TIER_1 (phone known from inbound metadata) or TIER_2 (phone captured later) |
| identity_anchor | str | The credential the resolver uses for identity lookup: "phone" (Tier-1) or "session_cookie" / "psid" (Tier-2) |
| session_window_seconds | int or None | If set, the adapter enforces a session window; None means no window (WhatsApp, voice) |
| supports_payment_inline | bool | Whether an M-Pesa STK push can fire while the customer is on this channel (True for WhatsApp + voice; False for Tier-2 channels that lack verified phone) |
| supports_rich_media | bool | Whether the channel can render images, buttons, or card payloads beyond plain text |
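As a rough illustration of the table, two adapters might declare their flags like this. The class names VoiceChannel and InstagramChannel come from this page, but the bodies are a sketch: the supports_rich_media values and the needs_collect_phone helper are assumptions made for the example.

```python
from enum import Enum
from typing import ClassVar, Optional


class Tier(Enum):
    TIER_1 = 1  # phone known from inbound metadata
    TIER_2 = 2  # phone captured later via COLLECT_PHONE


class VoiceChannel:
    tier: ClassVar[Tier] = Tier.TIER_1
    identity_anchor: ClassVar[str] = "phone"
    session_window_seconds: ClassVar[Optional[int]] = None  # no window
    supports_payment_inline: ClassVar[bool] = True
    supports_rich_media: ClassVar[bool] = False  # assumed: audio only


class InstagramChannel:
    tier: ClassVar[Tier] = Tier.TIER_2
    identity_anchor: ClassVar[str] = "psid"
    session_window_seconds: ClassVar[Optional[int]] = 24 * 60 * 60
    supports_payment_inline: ClassVar[bool] = False
    supports_rich_media: ClassVar[bool] = True  # assumed for the example


def needs_collect_phone(adapter: type) -> bool:
    """Hypothetical helper: route on the flag, never on the channel name."""
    return adapter.tier is Tier.TIER_2
```

The helper reads only the tier flag, so a new Tier-2 adapter picks up the COLLECT_PHONE behaviour without any change to the caller.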
The dispatcher reads supports_payment_inline before initiating an STK push. If a Tier-2 customer has not yet completed COLLECT_PHONE, customer.phone_e164 is NULL and the payment flag is False regardless — payment is deferred to after phone verification. See Payments for the full payment gate logic.
Tier-1 channels
WhatsApp and voice are Tier-1: the customer's E.164 phone number arrives in inbound provider metadata before the first FSM turn starts. For WhatsApp, messages[0].from in the Meta Cloud API payload carries the sender's phone. For voice, the LiveKit SIP leg's caller-ID is resolved via public.tenants.did_phone_number before the TenantContext ContextVar is set. No COLLECT_PHONE gate is needed on either surface.
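For the WhatsApp case, the sender lookup can be sketched as below. In the Meta Cloud API webhook envelope the messages array sits under entry[0].changes[0].value, and the from field carries the number without a leading plus sign. The function name extract_sender_e164 is hypothetical, and the real adapter may validate more strictly.

```python
from typing import Optional


def extract_sender_e164(payload: dict) -> Optional[str]:
    """Return the sender's phone in E.164 form, or None if absent.

    Status callbacks (delivered/read receipts) carry no messages array,
    so a missing key is treated as "no sender" rather than an error.
    """
    try:
        value = payload["entry"][0]["changes"][0]["value"]
        raw = value["messages"][0]["from"]
    except (KeyError, IndexError, TypeError):
        return None
    # Meta sends digits only; E.164 wants a leading "+".
    return raw if raw.startswith("+") else f"+{raw}"
```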
For voice specifically — full-duplex barge-in, end-of-turn detection, WPM-adaptive speed — see Voice conversation. The voice adapter integrates via the same dispatch_inbound_message seam; the FSM has no awareness of audio.
WhatsApp uses Meta's first-party Cloud API directly, with no BSP intermediary (ADR-0008). This saves €49/month versus a BSP like 360dialog. Per-tenant credentials (whatsapp_phone_number_id, whatsapp_access_token, whatsapp_business_account_id) live on public.tenants; a single project-level WHATSAPP_APP_SECRET handles HMAC verification across all tenants so webhook forgery protection is centralised. Instagram and Messenger share the same Meta Cloud API webhook envelope shape but use separate app secrets (INSTAGRAM_APP_SECRET, MESSENGER_APP_SECRET) — three surfaces, one inbound parsing path.
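The HMAC check itself follows Meta's documented scheme: the X-Hub-Signature-256 header is "sha256=" followed by the hex HMAC-SHA256 of the raw request body, keyed with the app secret. The sketch below shows that scheme with hypothetical function names; it is not the code in app/api/webhooks/whatsapp.py.

```python
import hashlib
import hmac

_PREFIX = "sha256="


def compute_signature_header(raw_body: bytes, app_secret: str) -> str:
    """Build the header value Meta would send for this body."""
    digest = hmac.new(app_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return _PREFIX + digest


def verify_signature(raw_body: bytes, header_value: str, app_secret: str) -> bool:
    """Check the header against the raw (unparsed) request body."""
    if not header_value.startswith(_PREFIX):
        return False
    expected = compute_signature_header(raw_body, app_secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), header_value.encode())
```

The signature must be computed over the exact bytes received; re-serialising parsed JSON usually changes whitespace or key order and breaks the match. compute_signature_header is also one way to produce a header value for a local test request.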
Tier-2 channels and the session window
Web widget, Instagram DM, and Messenger DM are Tier-2: the customer arrives as an anonymous session. On first contact the resolver creates a customer_sessions row with customer_id = NULL. The FSM routes through COLLECT_PHONE to bind the phone and merge the session into a real customer record. Full phone-binding mechanics are owned by Identity and tenancy.
Instagram DM and Messenger DM carry a 24-hour session window enforced by Meta's platform: outside that window, only approved template messages can be sent outbound. The adapter's is_within_session_window() method checks session_window_seconds against last_inbound_at; the dispatcher consults this before sending a non-template reply. When a Tier-2 customer is outside the window and a reminder needs to go out, the NotificationSink (SMS via Africa's Talking) is the fallback path.
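The window check described above reduces to a timestamp comparison. This is a standalone sketch, assuming timezone-aware datetimes and an inclusive boundary; the real method lives on the adapter and its signature is not shown on this page.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_within_session_window(
    session_window_seconds: Optional[int],
    last_inbound_at: datetime,
    now: Optional[datetime] = None,
) -> bool:
    """True if a non-template reply may still be sent."""
    if session_window_seconds is None:
        return True  # no window configured on this channel
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_inbound_at <= timedelta(seconds=session_window_seconds)
```

Passing now explicitly keeps the check deterministic in tests.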
The web widget has no Meta-imposed session window (session_window_seconds = None), but Tier-2 phone capture still applies to it.
For operators: if a customer on Instagram DM stops getting responses after a day of inactivity, the session window is the cause. The channel_session_expired structured log event records every window expiry — tail it from Observability to diagnose. Re-engagement requires the customer to send a new inbound message first (Meta's rule, not Ratiba's).
NotificationSink — SMS reminder fallback
SMS (Africa's Talking) is explicitly not a channel — it carries no inbound path, no ChannelKind enum value, and no adapter class. It is a NotificationSink: a one-way outbound surface the FSM can call after a booking is confirmed to send a reminder to a Tier-2 customer who is outside their 24h session window.
The cost per reminder is approximately $0.003 via Africa's Talking, compared to ~$0.05 for an out-of-window WhatsApp utility template in Kenya, which makes SMS the cheaper of the two fallbacks. The routing rule is: try a free WhatsApp session message first; if the window is closed, fall back to SMS. The sink fires from app/channels/_base/notification_sink.py; it shares no code path with any channel adapter.
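The routing rule and the cost gap can be put into a few lines. The per-message prices are the approximate figures quoted above; the function names and route labels are hypothetical.

```python
from decimal import Decimal

# Approximate per-message prices quoted in the text (USD).
COST_USD = {
    "session_message": Decimal("0"),
    "sms": Decimal("0.003"),
    "template": Decimal("0.05"),
}


def pick_reminder_route(window_open: bool) -> str:
    """Free session message while the window is open, SMS otherwise."""
    return "session_message" if window_open else "sms"


def reminder_cost(route: str, reminders: int) -> Decimal:
    """Total cost of sending `reminders` messages over one route."""
    return COST_USD[route] * reminders
```

At 1,000 out-of-window reminders, that works out to roughly $3 by SMS against roughly $50 by utility template.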
Channel-switch tokens
A channel-switch token lets a customer who started on the web widget resume the same conversation on WhatsApp, without losing booking state. The token is a single-use random string with a 24-hour TTL stored in Redis under channel_switch:<token>. When the customer opens the WhatsApp link, the adapter redeems the token, looks up the original thread_id, and resumes the LangGraph checkpoint seamlessly from the FSM's perspective.
Channel-switch is opt-in per tenant — not enabled by default in M13. Tenants that offer it typically present a "Continue on WhatsApp" button in the web widget UI. The token minting and redemption paths live in app/channels/_base/switch.py.
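The single-use and TTL semantics can be sketched with an in-memory stand-in for Redis. The function names mint_switch_token and redeem_switch_token and the channel_switch:<token> key shape come from this page; the class, the token length, and the injectable clock are assumptions made so the example runs without a Redis server.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple

TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL


class InMemorySwitchStore:
    """Dict-backed stand-in for the Redis keyspace (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def mint_switch_token(self, thread_id: str) -> str:
        token = secrets.token_urlsafe(32)  # unguessable random string
        key = f"channel_switch:{token}"
        self._data[key] = (thread_id, self._clock() + TTL_SECONDS)
        return token

    def redeem_switch_token(self, token: str) -> Optional[str]:
        # pop() removes the entry, so a second redemption finds nothing.
        entry = self._data.pop(f"channel_switch:{token}", None)
        if entry is None:
            return None
        thread_id, expires_at = entry
        return thread_id if self._clock() < expires_at else None
```

Against a real Redis, the same behaviour would typically map to a SET with an expiry on mint and an atomic get-and-delete on redeem, so two concurrent redemptions cannot both succeed.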
Where it lives in code
| Concern | File | Key entry point |
|---|---|---|
| Channel protocol + ChannelKind + Tier | backend/app/channels/_base/protocol.py:60 | Channel.inbound_handler, Channel.outbound_send, Channel.is_within_session_window |
| Adapter registry | backend/app/channels/_base/registry.py:22 | register_channel, get_channel |
| NotificationSink | backend/app/channels/_base/notification_sink.py | SmsSink.send_reminder |
| Channel-switch tokens | backend/app/channels/_base/switch.py | mint_switch_token, redeem_switch_token |
| WhatsApp adapter | backend/app/channels/whatsapp/__init__.py | webhook at app/api/webhooks/whatsapp.py:191 (receive_webhook) |
| Voice adapter | backend/app/channels/voice/__init__.py:23 | VoiceChannel + LiveKit AgentSession in app/voice/agent.py |
| Web widget adapter | backend/app/channels/web/_channel.py:46 | WebChannel + WS route at app/api/channels/web.py |
| Instagram adapter | backend/app/channels/instagram/_channel.py:47 | InstagramChannel + webhook at app/api/webhooks/instagram.py |
| Messenger adapter | backend/app/channels/messenger/_channel.py | MessengerChannel + webhook at app/api/webhooks/messenger.py |
| Top-level dispatcher | backend/app/orchestrator/dispatcher.py:847 | dispatch_inbound_message |
The protocol module imports no concrete channel: adapters depend on the substrate, never the reverse. To add an adapter, write a class with the five capability flags and the three Protocol methods, then call register_channel(...) at module-import time; the registry's idempotency contract makes duplicate imports safe.
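One way an idempotent registry can behave is sketched below. The names register_channel and get_channel come from the table above; the string keys, the module-level dict, and the choice to raise on a conflicting registration are assumptions, not a description of registry.py.

```python
from typing import Dict

_REGISTRY: Dict[str, type] = {}


def register_channel(kind: str, adapter_cls: type) -> None:
    """Register an adapter; registering the same class again is a no-op."""
    existing = _REGISTRY.get(kind)
    if existing is not None and existing is not adapter_cls:
        # Assumed policy: a different class under the same kind is a bug.
        raise ValueError(f"{kind!r} is already registered to {existing.__name__}")
    _REGISTRY[kind] = adapter_cls


def get_channel(kind: str) -> type:
    """Look up the adapter class for a channel kind."""
    try:
        return _REGISTRY[kind]
    except KeyError:
        raise LookupError(f"no adapter registered for {kind!r}") from None
```

Because a repeat call with the same class changes nothing, importing an adapter module twice leaves the registry in the same state.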
Decisions
- ADR-0009 — Channel-agnostic conversation substrate + two-tier channel model.
- ADR-0008 — WhatsApp Cloud API direct (Meta first-party) — supersedes the 360dialog BSP entry; reversibility preserved by schema.
Related
- Voice conversation: full-duplex voice channel details (barge-in, endpointing, WPM-adaptive speed, the VoiceStreamEvent seam)
- Identity and tenancy: phone-binding mechanics, COLLECT_PHONE FSM state, cross-channel identity merge
- Conversation FSM: the FSM the substrate dispatches into; ChannelKind is metadata on BookingState
- Payments: supports_payment_inline flag and the Tier-2 phone-gate before STK push
- Observability: channel_session_expired events, channel=ChannelKind.* log lines
- Configuration: per-tenant channel credentials, WHATSAPP_APP_SECRET, INSTAGRAM_APP_SECRET
Try this on local dev
- POST a sample WhatsApp webhook. With docker compose up -d running, send a Meta-shape inbound envelope at the local backend and watch the channel adapter parse it. The messages[0].text.body field is what the dispatcher receives:

      curl -i -X POST http://localhost:8010/webhooks/whatsapp \
        -H "Content-Type: application/json" \
        -H "X-Hub-Signature-256: sha256=<computed-hmac>" \
        -d @backend/tests/api/fixtures/sample_whatsapp_inbound.json

  The fixture-builder lives in backend/tests/api/test_whatsapp_webhook.py (search for _text_envelope); copy that JSON shape and replace phone_number_id with the dev tenant's value.

- Open the web widget. Visit http://localhost:3010/widget?tenant=<slug> in a browser. The widget mints a customer_sessions row server-side, opens a WebSocket to the backend, and the WebChannel adapter takes over. First-turn identity is anchored on the cookie session; phone gets bound when the FSM enters COLLECT_PHONE.

- Tail the uvicorn log to see ChannelKind dispatch. Open a third terminal and watch the dispatcher's per-turn log line. Every inbound message logs which ChannelKind it came from before the FSM is invoked:

      tail -f backend.uvicorn.log | grep -i "channel="

  You should see channel=ChannelKind.WHATSAPP for the curl above and channel=ChannelKind.WEB for the widget: same dispatcher, two surfaces, one FSM.