ADR-0009: Channel-Agnostic Conversation Substrate + Channel Tier Model
Status: Accepted
Date: 2026-05-05
Supersedes: ROADMAP §"What this roadmap deliberately doesn't include" — the
"no web booking widget" exclusion (lines 70-71 of docs/ROADMAP.md as of
2026-05-05).
Context
By the close of M9+1 (2026-05-05), the Ratiba codebase already contained a de facto channel-agnostic substrate even though the architecture had never been formally codified that way:
- M5 landed the dispatcher (app/orchestrator/dispatcher.py) with a channel parameter propagating end-to-end.
- M7's voice work demonstrated empirically that adding a channel was "mostly I/O plumbing on top of the channel-agnostic substrate" (the M7 close-out memo's exact phrasing).
- AnswerShaper grew a _render_for_voice post-processor that strips markdown bullets, replaces URLs, and caps sentences: a channel-specific rendering rule riding on a single shared LLM prompt.
- app/agents/identity_resolver.py exists as a stub anticipating cross-channel customer identity resolution.
Three independent product asks now force a formal architectural commitment:
- A web chat widget for tenants who want to capture leads on their own websites. Initially a hosted standalone link (widget.ratiba.chat/<tenant-slug>); eventually an embeddable script.
- Instagram DM bot to handle service/price inquiries that today pile up in the tenant's social inbox unanswered.
- Facebook Messenger bot for the same reason on Meta's other page-messaging surface.
The ROADMAP (docs/ROADMAP.md) had previously excluded a web
widget on thesis grounds: "shipping a widget undermines [the AI-is-the-interface
thesis] and dilutes the differentiator." On re-examination, that
reasoning conflated two distinct things:
- A booking form on a website — yes, a thesis violation. Customers click dropdowns instead of conversing.
- A conversational widget that runs the same FSM as WhatsApp/voice — more thesis-aligned than a traditional booking form, because the customer still books by talking, just on a different surface.
The 2026 industry consensus on conversational AI (Salesforce, Zendesk, Intercom, Yellow.ai, Parloa) is that omnichannel ≠ multichannel:
- Multichannel = separate bots per platform (loses context, drift, duplicated logic). Bad.
- Omnichannel = one brain, many I/O surfaces, unified identity. This is what the Ratiba codebase is already doing, just informally.
The cost picture also matters. Per fully-handled conversation:
| Channel | Platform fees |
|---|---|
| Web widget | $0 (our infra only) |
| WhatsApp | $0 in-window / ~$0.01–0.02 utility template out-of-window (Kenya) |
| Instagram DM | Same in-window economics; approved tags only out-of-window |
| Messenger DM | Same as Instagram |
| Voice (LiveKit + Deepgram + ElevenLabs) | ~$0.10–0.20 / minute |
The web widget is the cheapest fully-bidirectional conversational
surface by an order of magnitude vs. voice and meaningfully cheaper
than out-of-window WhatsApp. This isn't just a feature; it's an
operational lever for cost-conscious East African SMB pilots
(per project_cost_discipline.md memory).
Decisions
D1. The substrate thesis (one-sentence commitment)
The conversation substrate is channel-agnostic; channels are I/O adapters with capability flags and identity-resolution semantics, not first-class branches in the FSM.
This formalises what M5/M7 already implemented. Concretely:
- dispatch_inbound_message(channel: ChannelKind, ...) is the single entry point regardless of inbound surface.
- The booking, cancel, and reschedule graphs from M5 run identically across all channels; only one new FSM node is added (D5).
- A new Channel protocol (D2) governs all I/O adapters.
- Per-channel rendering rules (D6) are declarative, not branched.
Future contributors shall not add channel-specific branches to the dispatcher or FSM graphs. New channels add a row to the registry, an adapter class, and a rendering rule — nothing else.
D2. Channel protocol + ChannelKind enum
A new app/channels/_base/protocol.py module defines:
from enum import StrEnum
from typing import Literal, Protocol
from uuid import UUID

# SendResult and ConversationThread are project types not shown here.

class ChannelKind(StrEnum):
    WHATSAPP = "whatsapp"
    VOICE = "voice"
    WEB = "web"
    INSTAGRAM = "instagram"
    MESSENGER = "messenger"

class Tier(StrEnum):
    TIER1 = "tier1"  # phone known from message metadata
    TIER2 = "tier2"  # phone captured progressively (see D3)

class Channel(Protocol):
    kind: ChannelKind
    tier: Tier
    identity_anchor: Literal["phone", "session"]
    session_window_seconds: int | None  # None = no window restriction
    supports_payment_inline: bool
    supports_rich_media: bool

    async def inbound_handler(self, payload: dict, tenant_id: UUID) -> None: ...
    async def outbound_send(self, customer_id: UUID, text: str) -> SendResult: ...
    def is_within_session_window(self, thread: ConversationThread) -> bool: ...
A registry at app/channels/_base/registry.py maps ChannelKind → adapter
instance. Existing app.channels.whatsapp and app.voice are wrapped
in thin adapter classes (see D7) — zero behavioural change.
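A minimal sketch of how the registry and the single dispatch entry point could fit together. The StubChannel adapter, the register and get_channel helpers, and the simplified dispatch_inbound_message signature are illustrative assumptions, not the project's actual code; a real adapter would delegate to the existing channel modules.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Protocol


class ChannelKind(str, Enum):  # str-mixin Enum standing in for the ADR's StrEnum
    WHATSAPP = "whatsapp"
    WEB = "web"


class Channel(Protocol):
    kind: ChannelKind
    supports_rich_media: bool


@dataclass(frozen=True)
class StubChannel:
    """Hypothetical adapter; a real one would wrap the existing channel code."""
    kind: ChannelKind
    supports_rich_media: bool


# Hypothetical registry: exactly one adapter instance per ChannelKind.
_REGISTRY: Dict[ChannelKind, Channel] = {}


def register(channel: Channel) -> None:
    if channel.kind in _REGISTRY:
        raise ValueError(f"{channel.kind.value} is already registered")
    _REGISTRY[channel.kind] = channel


def get_channel(kind: ChannelKind) -> Channel:
    return _REGISTRY[kind]


def dispatch_inbound_message(channel: ChannelKind, text: str) -> str:
    """Single entry point: resolves the adapter and never branches on the kind."""
    adapter = get_channel(channel)
    return f"[{adapter.kind.value}] {text}"


register(StubChannel(ChannelKind.WHATSAPP, supports_rich_media=False))
register(StubChannel(ChannelKind.WEB, supports_rich_media=True))
```

In this shape, adding a channel means one more register call and one more adapter class, with no change to the dispatch function.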
D3. Two-tier channel model
The Channel protocol's tier field is prescriptive, not descriptive:
| Capability | Tier-1 | Tier-2 |
|---|---|---|
| Identity anchor | phone (from message metadata) | session (cookie / page-scoped user ID) |
| Session window | None (always open) | 24h hard window (IG/Messenger Meta policy) or cookie-bounded (web) |
| Payment STK push | Direct — phone known turn 1 | Gated on phone capture via D5's COLLECT_PHONE |
| Outbound posture | Reactive + proactive | Reactive-only outside an active session window |
| Members | WHATSAPP, VOICE | WEB, INSTAGRAM, MESSENGER |
There is no Tier-3. A channel that doesn't fit one of these tiers does not ship. Tiering is per-channel-class, not per-tenant — a tenant cannot promote Instagram to Tier-1 by configuration because Meta's API constraints don't change with willingness to pay.
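As an illustration of "per-channel-class, not per-tenant", the tier can be a constant lookup from which the table's capabilities are derived. The TIER_BY_KIND mapping and both helper functions are hypothetical names for this sketch.

```python
from enum import Enum


class ChannelKind(str, Enum):  # str-mixin Enum standing in for the ADR's StrEnum
    WHATSAPP = "whatsapp"
    VOICE = "voice"
    WEB = "web"
    INSTAGRAM = "instagram"
    MESSENGER = "messenger"


class Tier(str, Enum):
    TIER1 = "tier1"
    TIER2 = "tier2"


# Fixed per channel class; deliberately not read from tenant configuration.
TIER_BY_KIND = {
    ChannelKind.WHATSAPP: Tier.TIER1,
    ChannelKind.VOICE: Tier.TIER1,
    ChannelKind.WEB: Tier.TIER2,
    ChannelKind.INSTAGRAM: Tier.TIER2,
    ChannelKind.MESSENGER: Tier.TIER2,
}


def may_send_proactively(kind: ChannelKind, within_session_window: bool) -> bool:
    """Tier-1 may always initiate; Tier-2 only inside an active session window."""
    if TIER_BY_KIND[kind] is Tier.TIER1:
        return True
    return within_session_window


def stk_push_needs_phone_capture(kind: ChannelKind, phone_on_file: bool) -> bool:
    """Tier-2 payments are gated on phone capture unless a phone is already known."""
    return TIER_BY_KIND[kind] is Tier.TIER2 and not phone_on_file
```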
D4. Cross-channel identity model
Phone number is the only deterministic cross-channel join key.
Every other identifier (web cookie, IG PSID, Messenger PSID) is a tenant-scoped alias that binds to a customer once a phone is captured. Probabilistic matching (IP, fingerprint, behavioural) is explicitly prohibited: it is a GDPR landmine, its accuracy is marginal, and M-Pesa already requires a phone number for paying flows, so phone-only matching is sufficient.
Schema (lives in tenant schema per ADR-0002, not public):
<tenant>.customers            -- customer_id (UUID PK), phone_e164 NULLABLE,
                                 name, dob NULLABLE, language_pref
<tenant>.customer_identities  -- (customer_id, provider, external_id); many-to-one
                                 provider ∈ {'whatsapp','voice','web_cookie',
                                 'instagram','messenger'}
<tenant>.customer_sessions    -- anonymous landing sessions; promoted into
                                 customers row at phone-capture time
The identity_resolver module at app/agents/identity_resolver.py
is extended with bind_phone(session_id, phone_e164), which:
- Looks up customers WHERE phone_e164 = :phone.
- Hit: merges the session into the existing customer (preserving cookie/PSID linkage in customer_identities); prior conversation history becomes available to the agent as context.
- Miss: upserts a new customer, copies session metadata, and marks the session as promoted.
A returning Tier-2 visitor (same cookie) skips identity capture entirely
because the cookie already maps to a customer record via
customer_identities WHERE provider='web_cookie' AND external_id=cookie_id.
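The hit/miss behaviour of bind_phone and the returning-visitor lookup can be sketched against an in-memory stand-in for the three tenant tables. InMemoryTenantStore, Session, resolve_alias, and the extra store argument are assumptions made for illustration; the real module would issue SQL against the tenant schema.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Session:
    session_id: str
    provider: str       # e.g. "web_cookie" or "instagram"
    external_id: str    # cookie ID or PSID
    promoted: bool = False


@dataclass
class InMemoryTenantStore:
    """Hypothetical stand-in for the tenant-schema tables."""
    customers_by_phone: Dict[str, uuid.UUID] = field(default_factory=dict)
    identities: Dict[Tuple[str, str], uuid.UUID] = field(default_factory=dict)
    sessions: Dict[str, Session] = field(default_factory=dict)


def bind_phone(
    store: InMemoryTenantStore, session_id: str, phone_e164: str
) -> Tuple[uuid.UUID, bool]:
    """Return (customer_id, merged); merged is True when the phone already existed."""
    session = store.sessions[session_id]
    customer_id = store.customers_by_phone.get(phone_e164)
    merged = customer_id is not None
    if customer_id is None:
        # Miss: create the customer row keyed by phone.
        customer_id = uuid.uuid4()
        store.customers_by_phone[phone_e164] = customer_id
    # Hit or miss, the alias (cookie/PSID) now resolves to this customer.
    store.identities[(session.provider, session.external_id)] = customer_id
    session.promoted = True
    return customer_id, merged


def resolve_alias(
    store: InMemoryTenantStore, provider: str, external_id: str
) -> Optional[uuid.UUID]:
    """Returning-visitor path: look the customer up by alias, no phone prompt."""
    return store.identities.get((provider, external_id))
```

Binding the same phone from a web session and later from an Instagram session yields one customer with two aliases, which is the cross-channel join the section describes.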
D5. COLLECT_PHONE FSM entry-state for Tier-2
A new node in booking_graph.py runs before COLLECT_SLOT, gated by:
needs_phone = channel.tier == TIER2 and customer.phone_e164 is None
Tier-1 channels skip it (phone known from metadata). Returning Tier-2 customers skip it (phone on file via cookie/PSID identity).
Failure modes:
- Invalid format (E.164 reject) → re-prompt up to 3 times in customer's language.
- Three-strike refusal or persistent invalid → graceful fallback to
the lead-capture branch: agent says "No problem — we'll continue
this on WhatsApp. Reach us at [tenant.whatsapp_display_number]" and
ends the FSM thread. The session is recorded in the tenant dashboard
with an unverified=true flag for manual follow-up.
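The gate and the three-strike fallback could look roughly like this. The regular expression is only a structural stand-in for the real E.164 check (D8 names the phonenumbers library for that), the reading of "up to 3 times" as three attempts in total is an assumption, and collect_phone is a hypothetical helper rather than the actual FSM node.

```python
import re
from typing import Iterable, Optional

# Structural E.164 shape only: "+", a non-zero leading digit, at most 15 digits.
E164_SHAPE = re.compile(r"\+[1-9]\d{1,14}")
MAX_ATTEMPTS = 3  # assumed reading of the three-strike rule


def needs_phone(tier: str, phone_e164: Optional[str]) -> bool:
    """COLLECT_PHONE runs only for Tier-2 customers with no phone on file."""
    return tier == "tier2" and phone_e164 is None


def collect_phone(replies: Iterable[str]) -> Optional[str]:
    """Return the first well-formed phone within MAX_ATTEMPTS replies.

    None signals the lead-capture fallback (session flagged unverified).
    """
    for attempt, reply in enumerate(replies, start=1):
        candidate = reply.strip().replace(" ", "")
        if E164_SHAPE.fullmatch(candidate):
            return candidate
        if attempt >= MAX_ATTEMPTS:
            break
    return None
```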
LangGraph saver compatibility: adding COLLECT_PHONE is a graph-shape
change → bumps the FSM's __version__ constant → existing in-flight
threads with old state shapes get a one-time migration shim
(if state.version < 4 and channel.tier == TIER2: insert COLLECT_PHONE).
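A sketch of the one-time migration shim, using a plain dict rather than LangGraph's actual checkpoint format. The pending_nodes key and the migrate_thread_state name are hypothetical; only the version threshold of 4 and the Tier-2 condition come from the text above.

```python
from copy import deepcopy
from typing import Any, Dict

CURRENT_FSM_VERSION = 4  # the threshold the shim compares against


def migrate_thread_state(state: Dict[str, Any], tier: str) -> Dict[str, Any]:
    """Upgrade an old in-flight thread state; current states pass through untouched."""
    if state.get("version", 0) >= CURRENT_FSM_VERSION:
        return state
    migrated = deepcopy(state)  # never mutate the persisted checkpoint in place
    if tier == "tier2":
        pending = migrated.setdefault("pending_nodes", [])
        if "COLLECT_PHONE" not in pending:
            pending.insert(0, "COLLECT_PHONE")  # must run before COLLECT_SLOT
    migrated["version"] = CURRENT_FSM_VERSION
    return migrated
```

The shim is idempotent: once a state carries the current version, a second call returns it unchanged.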
D6. AnswerShaper declarative rendering rules
M7's _render_for_voice post-processor generalises to
_render_for_channel(channel, raw_text). Rules are a declarative table
keyed on ChannelKind:
| Channel | Rules |
|---|---|
| whatsapp | No-op (raw markdown; WhatsApp renders bold + italic natively) |
| voice | Existing M7 rules: bullet-strip + URL-replace + 2-sentence-cap |
| web | Pass-through markdown; React widget renders via react-markdown |
| instagram | 1000-char hard cap (Meta limit); strip URL preview metadata; bullet → • newline |
| messenger | Same as Instagram |
New channels add a row to the table. They do not add a function or branch in code.
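One way to keep the rules declarative is to store them as data and have a single generic renderer interpret them. The RenderRule fields and the render_for_channel name are assumptions for this sketch, and the URL handling for voice and Instagram is left out for brevity.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RenderRule:
    bullets: str = "keep"                 # "keep", "strip", or "dot"
    max_sentences: Optional[int] = None
    max_chars: Optional[int] = None


# The table: a new channel adds a row here and nothing else.
RENDER_RULES: Dict[str, RenderRule] = {
    "whatsapp": RenderRule(),
    "voice": RenderRule(bullets="strip", max_sentences=2),
    "web": RenderRule(),
    "instagram": RenderRule(bullets="dot", max_chars=1000),
    "messenger": RenderRule(bullets="dot", max_chars=1000),
}

_BULLET = re.compile(r"^[ \t]*[-*][ \t]+", flags=re.MULTILINE)
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def render_for_channel(channel: str, raw_text: str) -> str:
    """Apply the channel's row; the function itself never names a channel."""
    rule = RENDER_RULES[channel]
    text = raw_text
    if rule.bullets == "strip":
        text = _BULLET.sub("", text)
    elif rule.bullets == "dot":
        text = _BULLET.sub("• ", text)
    if rule.max_sentences is not None:
        text = " ".join(_SENTENCE_END.split(text)[: rule.max_sentences])
    if rule.max_chars is not None:
        text = text[: rule.max_chars]
    return text
```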
D7. Existing channels are not relocated
app.channels.whatsapp and app.voice stay where they are. The Channel
protocol is implemented by thin adapter classes (WhatsAppChannel,
VoiceChannel) that delegate to the existing tested code. Zero
behavioural change to M4/M7 surfaces.
This is a deliberate cost-discipline choice: refactoring well-tested code for cosmetic uniformity is exactly the "premature abstraction" anti-pattern the methodology warns against.
D8. Phone validation: STK push as the oracle, not SMS OTP
Phone validation stack:
- E.164 format check via the phonenumbers library: free, catches typos.
- M6 reservation lock (15-min Redis SETNX on slot): free, anti-squat.
- STK push as authoritative validator: wrong phone → STK fails → ADR-0007's PAYMENT_FAILED state recovers the slot. Free for paying flows.
- Tenant dashboard "unverified" flag for lead-capture-only sessions that never reach payment.
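How the reservation lock and the STK push combine into a validator can be sketched with an in-memory lock that mimics set-if-absent-with-expiry semantics. SlotLocks, book_with_stk, and the stk_push callable are hypothetical; the real flow uses Redis and ADR-0007's payment states.

```python
import time
from typing import Callable, Dict, Tuple

LOCK_TTL_SECONDS = 15 * 60  # the 15-minute reservation window


class SlotLocks:
    """In-memory stand-in for a Redis set-if-absent lock with an expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._locks: Dict[str, Tuple[str, float]] = {}

    def acquire(self, slot: str, phone: str) -> bool:
        holder = self._locks.get(slot)
        if holder is not None and holder[1] > self._clock():
            return False  # an unexpired reservation already exists
        self._locks[slot] = (phone, self._clock() + LOCK_TTL_SECONDS)
        return True

    def release(self, slot: str) -> None:
        self._locks.pop(slot, None)


def book_with_stk(
    locks: SlotLocks, slot: str, phone: str, stk_push: Callable[[str], bool]
) -> str:
    """A failed STK push (for example a mistyped phone) frees the slot again."""
    if not locks.acquire(slot, phone):
        return "SLOT_TAKEN"
    if stk_push(phone):
        return "BOOKED"
    locks.release(slot)
    return "PAYMENT_FAILED"
```

A wrong number therefore costs nothing beyond a briefly held slot, which is why no separate verification message is needed for paying flows.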
SMS OTP is explicitly not in the M10 default flow. Justification:
- Africa's Talking SMS in Kenya costs ~$0.003 per OTP send — small but per-booking, compounds at pilot scale.
- The M-Pesa STK push is a stronger validator than OTP for paying flows (OTP only proves SIM possession; STK proves M-Pesa account ownership and phone validity simultaneously).
- The threat model is typos and spam-bookings, not account takeover.
When SMS OTP becomes worth adding (defer to M11+):
- Tenants with no-show fees (dental clinics charging deposits).
- Lead-capture-only tenants who want pre-validated phone leads.
For both: opt-in per-tenant flag phone_otp_required BOOLEAN. Not in M10.
D9. Channel-switch primitive (bidirectional)
Tenants may opt into channel steering (per-tenant flag
channel_steering_enabled BOOLEAN DEFAULT FALSE).
Tier-2 → Tier-1 (lead-capture pivot): When a web/IG/Messenger
visitor abandons or refuses phone, the FSM emits a channel_switch
directive that renders as a one-tap link
(https://wa.me/{tenant.whatsapp_display_number}?text=Continuing+from+web).
The dispatcher pre-stages a single-use customer_sessions.handoff_token
(24h TTL) so the next inbound WhatsApp message carrying the token in its
first turn resumes the conversation context.
Tier-1 → Tier-2 (cost optimisation): The agent may elect to send a
deep-link reply (widget.ratiba.chat/{slug}?resume=<token>) when a
rich-media response would be awkward in WhatsApp text, OR when the 24h
WhatsApp window is about to close. The customer taps the link, lands on
the widget, conversation history loads server-side via the resume token.
Conversation thread semantics (re-affirms ADR-0003): channel-switch = same customer, fresh thread on the new channel, agent context-loaded with prior conversation as background. Not a single thread spanning channels — that breaks LangGraph's saver invariants.
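The single-use, 24h-TTL handoff token could be sketched as follows. The HandoffTokens class and its in-memory dict are assumptions; the text above places the token on customer_sessions.handoff_token, so the real store is a database column.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple

HANDOFF_TTL_SECONDS = 24 * 3600


class HandoffTokens:
    """Hypothetical in-memory token store for channel-switch handoffs."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def issue(self, session_id: str) -> str:
        token = secrets.token_urlsafe(16)  # unguessable, safe inside a URL
        self._tokens[token] = (session_id, self._clock() + HANDOFF_TTL_SECONDS)
        return token

    def redeem(self, token: str) -> Optional[str]:
        """Return the originating session once; expired or reused tokens yield None."""
        entry = self._tokens.pop(token, None)  # pop makes the token single-use
        if entry is None:
            return None
        session_id, expires_at = entry
        return session_id if self._clock() < expires_at else None
```

On redemption the new channel starts a fresh thread and loads the originating session's history as background, in line with the thread semantics stated above.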
D10. Outbound reminder routing rule
Reminders (day-before appointment, post-booking utility messages) follow this priority order:
def route_outbound_reminder(customer, message):
    # channel is assumed to be the WhatsApp adapter from the registry.
    if customer.has_whatsapp and channel.is_within_session_window(customer.thread):
        send_via_whatsapp(customer, message)  # FREE in-window
    elif customer.has_phone:
        send_via_sms(customer, message)  # ~$0.003/msg via Africa's Talking (Kenya)
    else:
        log.warning("no reachable channel")  # skip: nothing is sent
SMS is a NotificationSink, not a Channel. Outbound-only, doesn't
enter the FSM, lives at app/notifications/sms.py. Customers replying
to an SMS reminder don't hit the dispatcher (replies go nowhere or to
the tenant's manual SMS inbox if they care). This keeps the Channel
abstraction clean (channels = bidirectional conversational surfaces).
The Africa's Talking integration already exists for voice (M7's SIP bridge) — adding outbound SMS is a one-API-key extension, not a new vendor.
D11. Per-tenant credentials + project-level webhook secrets
Mirrors ADR-0008's WhatsApp pattern. New columns on public.tenants:
- instagram_business_account_id TEXT (nullable until provisioned)
- instagram_access_token TEXT (encrypted at rest)
- instagram_enabled BOOLEAN DEFAULT FALSE
- facebook_page_id TEXT
- facebook_page_access_token TEXT
- messenger_enabled BOOLEAN DEFAULT FALSE
- widget_primary_color TEXT DEFAULT '#0F766E'
- widget_logo_url TEXT NULLABLE
- widget_display_name TEXT NULLABLE (falls back to tenants.name)
- channel_steering_enabled BOOLEAN DEFAULT FALSE
- phone_otp_required BOOLEAN DEFAULT FALSE (for D8 future-opt-in)
- reminder_lead_seconds INT DEFAULT 86400 (24h before appointment)
Webhook authentication: two new project-level secrets alongside the
existing WHATSAPP_APP_SECRET:
- INSTAGRAM_APP_SECRET: HMAC key for /api/v1/webhooks/instagram.
- MESSENGER_APP_SECRET: HMAC key for /api/v1/webhooks/messenger.
Separate apps yield cleaner Meta permission scopes during initial provisioning. They can be merged later if Meta App configuration permits; ADR-0009 does not preclude consolidation.
Per-tenant routing happens after signature validation, by reading
the Meta entity ID from the trusted payload (IG Business Account ID for
Instagram; Page ID for Messenger) and looking up the corresponding row
in public.tenants.
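Signature validation ahead of tenant routing might look like this, assuming the X-Hub-Signature-256 header format ("sha256=" followed by a hex HMAC-SHA256 of the raw request body) that Meta documents for its webhooks. The function name is hypothetical, and the tenant lookup is omitted because it depends on the payload shape.

```python
import hashlib
import hmac


def is_valid_signature(raw_body: bytes, signature_header: str, app_secret: str) -> bool:
    """Check the header against an HMAC-SHA256 of the exact bytes received."""
    prefix = "sha256="
    if not signature_header.startswith(prefix):
        return False
    provided = signature_header[len(prefix):]
    expected = hmac.new(app_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), provided.encode())
```

The check must run on the raw body before any JSON parsing; only after it passes is the entity ID in the payload trusted for the tenant lookup.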
D12. Anti-abuse posture for the web widget
The widget is exposed without authentication (anonymous visitors must be able to reach it before they're known). Risk surface + mitigations:
| Risk | Mitigation |
|---|---|
| DoS via WS connection spam | Per-IP connection rate limit: 10/min |
| Cost abuse via message spam (LLM tokens) | Per-session message rate limit: 30 turns/hour while anonymous, unlimited post-phone-capture; ADR-0005 per-tenant cost ceiling already enforced via contextvar |
| Fake-phone bookings | Structurally bounded — STK push goes to entered phone; wrong number = no payment = no booking + slot auto-released by M6 reservation lock |
| Cross-tenant injection | Routes are tenant-scoped from URL onwards (widget.ratiba.chat/{slug} and /api/v1/channels/web/ws/{slug}); cookie name is tenant-prefixed (rtb_widget_session_<slug>) |
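Both rate limits in the table fit a sliding-window counter keyed by IP or session ID. SlidingWindowLimiter is a hypothetical in-process sketch; a multi-worker deployment would need a shared store instead of a local dict.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` events per key within the trailing window."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        while hits and now - hits[0] >= self._window:
            hits.popleft()  # drop events that have left the window
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True


# The two limits from the table above.
ws_connections_per_ip = SlidingWindowLimiter(limit=10, window_seconds=60)
anonymous_turns_per_session = SlidingWindowLimiter(limit=30, window_seconds=3600)
```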
Consequences
Positive
- Adding new channels in the future (TikTok DM if Meta-style API matures, Telegram, RCS) becomes a one-row-plus-adapter exercise rather than a dispatcher refactor.
- Tenants with web traffic get lead capture and full booking on the cheapest channel — no platform fees.
- Channel-switch primitive lets tenants steer customers between WhatsApp and web based on conversation needs (rich media, cost windows).
- The voice channel (most expensive per minute) gains a "continue on web" escape hatch for customers willing to switch surfaces.
- Cross-channel customer history (same person on IG today, WhatsApp tomorrow) gives the agent richer context without breaking thread isolation.
Negative
- New schema migrations are required: three new tenant tables + ~12 new columns on public.tenants. Wave 0 of M10 carries this work.
- Two new Meta App secrets to manage in production deployment configuration.
- The widget's frontend route lives in frontend/app/widget/[tenant_slug]/ alongside the M9 admin dashboard (same Next.js app, same dev server). Frontend dev-server collision risk during M10 dispatch is low (different routes) but worth noting.
- Cookie consent / privacy posture for the web widget needs a clear GDPR baseline (cookies are session-functional, not tracking, and should qualify as "essential" under GDPR, but the determination warrants a privacy ADR before pilot scale).
Reversibility
The substrate is reversible by design:
- Disabling a channel = setting <channel>_enabled = FALSE per tenant or removing the channel from the registry globally. No FSM impact.
- Removing the entire channel-agnostic refactor would require unwinding the Channel protocol and inlining channel-specific branches back into the dispatcher, but this would be reverting a codification, not an irreversible architectural commitment. The substrate already existed informally before this ADR.
Open questions deferred to future ADRs
- ADR-0010 (potential): Privacy and cookie consent posture. Cookies on the web widget are session-functional, not tracking, and should qualify as "essential" under GDPR — but this needs a clear ADR before pilot scale. Includes: cookie banner UX, retention policy, customer data export/deletion flows, cross-tenant data isolation in identity table.
- ADR-0011 (potential): Phone-OTP gating — when SMS OTP becomes worth the per-message cost (no-show-fee tenants, lead-capture-only tenants). Will codify the threat model + cost trade-off.
- ADR-0012 (potential): Widget embeddable script (Phase 2). Once the pilot validates the standalone-link Phase 1, the embeddable <script> tag pattern (iframe injection, postMessage lifecycle, cross-domain cookie strategy) needs its own architectural commitment.
Implementation
Land in M10 — Channel-Agnostic Tier-2 Expansion. Plan at
docs/superpowers/plans/2026-05-05-m10-channel-agnostic-tier2-expansion.md.
12 tasks across 5 waves; Wave 2 = 4-way parallel (web + IG + Messenger
+ SMS sink); an estimated 80–100 new backend tests and 15–25 new frontend Vitest tests post-M10.
Pre-flight: M9+1 must be COMPLETE per STATE.md before Wave 0 dispatch. (Already satisfied as of 2026-05-05.)