Glossary
Ratiba's docs are DRY — each concept is explained in full in exactly one place. This glossary defines every term and links you to that one canonical explanation. If you find a concept explained twice, the second explanation is a bug: fix it with a one-line gloss + a deep link.
A–Z definitions
Admin orchestrator
The shallow 4-state FSM (IDLE ↔ ENGAGED) through
which a tenant owner reclaims a customer conversation when the AI loses confidence;
exposes 11 slash commands and a natural-language fallback.
Explained in full: Admin orchestrator
Admin slash commands
11 commands the tenant owner can send via dashboard or
WhatsApp to view today's bookings, cancel/reschedule, hand back to the customer,
or mutate the catalog (/add-service, /update-price, etc.).
Explained in full: Admin orchestrator
Adjacency window
The ±15-minute window around primary.end_at within which
a cross-sell offer is eligible; only post-primary slots are offered (not pre-primary).
Explained in full: Cross-sell
Alembic (per-tenant)
Each tenant schema is migrated independently via its own
Alembic invocation, so tenant migrations are isolated from the public registry
migrations and from each other.
Explained in full: Identity and tenancy
Answer shaper / answer_shaper
The service (app/services/answer_shaper.py)
that splices per-tenant personality directives and knowledge snippets into the LLM
user template before the LLMRouter fires; the system message stays byte-identical
across tenants to preserve Anthropic prompt-cache eligibility.
Explained in full: Personality dials
Anthropic Vision OCR
The image/PDF extraction path in the catalog-onboarding
pipeline; Anthropic Vision is called via the LLMRouter vision role to read
service menu images and PDFs into structured ExtractedRow objects.
Explained in full: Catalog onboarding
Backchannel
Short acoustic acknowledgements ("mmh", "sawa") the voice agent emits while the customer is still speaking, signalling active listening without interrupting the turn. Explained in full: Voice conversation
Barge-in
The ability for the customer to speak and interrupt TTS playback mid-sentence; the voice agent detects the interruption and yields the floor immediately without waiting for the audio to finish. Explained in full: Voice conversation
BookingState
The Pydantic v2 state shape (extra="forbid") that flows through
every LangGraph node in the booking FSM; versioned via FSM_GRAPH_VERSION and
back-filled by a migration shim on hydration.
Explained in full: Conversation FSM
Capability flags
Five per-channel flags (tier, identity_anchor,
session_window_seconds, supports_payment_inline, supports_rich_media) declared
by each channel adapter; the substrate routes runtime behaviour off these flags
without branching the FSM.
Explained in full: Channel substrate
Catalog onboarding
The pipeline that lets tenants self-onboard their service catalog from an image, PDF, CSV, or text; includes Vision OCR, LLM gap-fill, relation inference, a human review step, and an atomic idempotent commit. Explained in full: Catalog onboarding
Channel adapter
A concrete implementation of the Channel protocol that handles
one surface (WhatsApp, voice, web widget, Instagram DM, or Messenger DM); translates
channel-specific I/O into the uniform dispatch_inbound_message seam.
Explained in full: Channel substrate
Channel substrate
The Channel protocol + ChannelKind enum + Tier enum at
app/channels/_base/; makes the five customer-facing surfaces look identical to the
conversation FSM.
Explained in full: Channel substrate
Channel-switch token
A single-use 24-hour token that lets a customer resume a conversation on a different channel (e.g. web widget → WhatsApp); opt-in per tenant. Explained in full: Channel substrate
Checkpoint
A durable LangGraph state snapshot persisted to
checkpoints_<slug>.checkpoints in Postgres; the warm tier of the two-tier FSM
persistence (Redis = hot, Postgres = warm).
Explained in full: Conversation FSM
COLLECT_PHONE
The FSM entry-state that Tier-2 channel sessions (web, Instagram, Messenger) pass through on first contact; the agent asks for the customer's phone, binds it, and merges the session into any existing customer record. Explained in full: Identity and tenancy
Configuration / env vars
The full set of environment variables and Settings
Pydantic model that control every tunable in Ratiba (ports, credentials, feature
flags, per-tenant thresholds).
Explained in full: Configuration
Cost ceiling
The $0.05 soft / $0.20 hard per-booking LLM spend cap (per-tenant
configurable), tracked via total_token_cost_usd on BookingState; one of three
signals that can trigger a handoff.
Explained in full: Conversation FSM
Cross-channel identity merge
When a customer who previously used WhatsApp later
opens the web widget and binds the same phone, the resolver updates the session to
reference the existing customer_id — no duplicate customer record is created.
Explained in full: Identity and tenancy
Cross-sell
A slot-aware offer of a complementary service immediately after the
primary booking is confirmed; fires only when an adjacent post-primary slot exists
within ±15 minutes and the tenant's cross_sell dial is not never.
Explained in full: Cross-sell
current_tenant ContextVar
The asyncio ContextVar at
app/tenancy/context.py::current_tenant that propagates tenant identity through
async call stacks; get_tenant_session() raises loudly if unset, never silently
falling back to public.
Explained in full: Identity and tenancy
Daily 3 AM EAT reaper
A scheduled worker that sweeps three tables every day:
expired rows in public.payment_routing, rows older than 90 days in
<tenant>.checkpoints_archive, and rows older than 90 days in
<tenant>.handoff_log_archive.
Explained in full: Payments
Daraja / Safaricom Daraja API
Safaricom's REST API for M-Pesa integrations;
Ratiba uses it for STK push (customer-initiated payment prompt) and one-shot
stkpushquery reconciliation at t=60s.
Explained in full: Payments
DeepEval
The primary eval runner for Ratiba's AI quality gate; runs
LLM-as-judge metrics against transcript replays using a 4-tuple cache key
(scenario_id, prompt_version, judge_model_version, metric_version).
Explained in full: Testing
DeepEval 4-tuple cache key
The cache key
(scenario_id, prompt_version, judge_model_version, metric_version) used to avoid
re-running expensive LLM-judge evaluations when neither the prompt nor the judge
model has changed.
Explained in full: Testing
Deepgram Nova-3 (STT)
The streaming speech-to-text provider used by the voice channel; supports Swahili and English language detection; emits interim and final transcripts consumed by the end-of-turn detector and the barge-in / backchannel primitives. Explained in full: Voice conversation
Dial audit
A row written to <tenant>.dial_audits on every personality-dial
change, recording dial, before JSONB, and after JSONB; supports support-tooling
queries and rollback.
Explained in full: Personality dials
DialBundle
A frozen Pydantic dataclass carrying the resolved 6-dial values for a
tenant turn; to_template_vars() renders it into the {personality_directive} slot
variables that the answer shaper splices into the user template.
Explained in full: Personality dials
Dual-channel admin rail
The two surfaces through which a tenant owner operates
the admin orchestrator: a dashboard WebSocket (pushed via Postgres LISTEN/NOTIFY)
and optional WhatsApp inbound.
Explained in full: Admin orchestrator
ElevenLabs Multilingual v2 (TTS)
The text-to-speech provider used by the voice channel; renders the FSM's reply text into audio delivered over the LiveKit SIP session. Explained in full: Voice conversation
fetch_snippets()
The function in app/services/knowledge.py that retrieves
matching knowledge snippets for a given intent; the category→intent routing is:
services→{service, general}, hours→{hours, general}, other→all.
Explained in full: Knowledge answers
Full-duplex turn-taking
The voice-channel turn model where both the customer and the agent can be "in flight" simultaneously; managed by barge-in detection, backchannel filtering, and WPM-adaptive TTS speed so neither side blocks the other. Explained in full: Voice conversation
Handoff
The mechanism by which the AI agent cedes control of a conversation to the human tenant owner; triggered by one of five signals (d1–d5) and delivered via the admin rail (dashboard WS + optional WhatsApp briefing card). Explained in full: Admin orchestrator
Handoff briefing card
The structured JSONB payload (schema locked in ADR-0006) delivered to the admin rail when a handoff fires; contains the verbatim customer transcript, trigger signal, and an on-demand translation button. Explained in full: Admin orchestrator
Idempotent re-import
A re-upload of the same catalog matches existing services by
LOWER(name) and updates rather than duplicating; the catalog_imports.type column
distinguishes 'initial' from 're-import'.
Explained in full: Catalog onboarding
Intent classifier
The single bilingual (EN + SW) prompt that routes each inbound message to the correct LangGraph graph (booking, cancel, reschedule, admin); uses tenant-locale fallback when language confidence is low. Explained in full: Conversation FSM
Keycloak
The open-source identity provider Ratiba uses for tenant realms, admin phone-OTP authentication, and OIDC tokens for the dashboard frontend. Explained in full: Identity and tenancy
Knowledge snippet
A short, hand-seeded text block in the
<tenant>.knowledge_snippets table (category, title, body, language, is_active);
injected into the LLM answer-shaper user template to answer tenant-specific
questions without retrieval (Phase 0 "no-RAG RAG").
Explained in full: Knowledge answers
knowledge_gap_candidate
A structured log event emitted when the agent deflects a question that matched no knowledge snippet; used as the observable trigger for the Phase-0 → Phase-1 graduation decision (too many gaps → add real retrieval). Explained in full: Knowledge answers
LangGraph
The graph-execution framework (from LangChain) that runs Ratiba's booking, cancel, reschedule, and admin state machines; each node is a Python async function, and the checkpointer persists state to Postgres. Explained in full: Conversation FSM
Lean observability
Ratiba's current-reality monitoring posture: docker logs,
file tailing, structured structlog JSON output, and a daily WhatsApp digest — no
Loki, no Datadog, no Sentry until a paying production deployment justifies the cost.
Explained in full: Observability
Listening-ack
A brief audio token ("I'm listening") the voice agent plays immediately after the customer's turn ends and before the LLM response is ready, reducing perceived latency. Explained in full: Voice conversation
LiveKit SIP bridge
The telephony infrastructure that receives inbound phone calls as SIP legs, creates LiveKit rooms, and hands audio frames to the voice channel adapter. Explained in full: Voice conversation
LLMRouter
The abstraction (app/llm/router.py) that routes LLM calls to the
correct provider + model based on the call's role (e.g. answer_shaper → Claude
Haiku; vision → Anthropic Vision; narrow classifier → GPT-4.1 mini), enforcing
the cost ceiling.
Explained in full: Conversation FSM
M-Pesa STK push
The Safaricom "SIM Toolkit push" that sends a payment prompt directly to the customer's phone screen; the customer enters their M-Pesa PIN to approve, and Daraja calls Ratiba's webhook with the result. Explained in full: Payments
NL fallback
The natural-language branch of the admin message router; classifies
free-form admin text via LLMRouter into one of three confidence bands: ≥0.9 execute,
0.7–0.9 ask for confirmation, <0.7 graceful unknown.
Explained in full: Admin orchestrator
"No-RAG RAG"
The Phase 0 knowledge-answer approach: inject hand-seeded snippets into the LLM prompt instead of running vector retrieval; real embeddings + pgvector are YAGNI until a paying tenant's knowledge base outgrows the prompt context window. Explained in full: Knowledge answers
NotificationSink
A one-way outbound SMS surface (Africa's Talking) used only for reminder fallback when a Tier-2 customer is outside their 24-hour Meta session window; not an inbound channel. Explained in full: Channel substrate
Ordered-pair reservation
A Redis EVAL Lua script that atomically acquires SETNX
locks on both a primary and an adjacent secondary slot in a single round-trip; on
failure it rolls back the primary acquisition before returning.
Explained in full: Cross-sell
Payment FSM
The separate state machine (PaymentState) that runs after the
booking FSM reaches BOOKED; has a first-class PAYMENT_CANCELLED_BY_CUSTOMER
state with provider-specific reversal (Daraja: STK timeout; PesaPal: active void).
Explained in full: Payments
payment_callbacks_unrouted
The dead-letter table in public that catches Daraja
or PesaPal callbacks that arrive after their correlation row has expired from
payment_routing; requires manual triage.
Explained in full: Payments
payment_routing
The public.payment_routing correlation table that maps a
(checkout_request_id, tenant_id, payment_id) triple so the webhook handler can
route a Daraja or PesaPal callback into the correct per-tenant payment record.
Explained in full: Payments
Per-tenant configurable thresholds
Settings stored in per-tenant JSONB columns
(handoff_thresholds, cost ceiling overrides, etc.) that let individual tenants
override the platform defaults without a code change.
Explained in full: Configuration
Per-vertical YAML defaults
The 8-vertical × 6-dial block in
app/prompts/personality_defaults.yaml that gives every new tenant sensible
personality defaults for their sector (dental, spa, barbershop, etc.) without any
manual configuration.
Explained in full: Personality dials
PesaPal
The card-payments gateway Ratiba routes to for non-M-Pesa payments; never used for M-Pesa (cost discipline: Daraja-direct only). Explained in full: Payments
Phone-only deterministic identity
Ratiba matches customers across channels
exclusively by phone_e164; no probabilistic name or device-fingerprint matching
is used, which keeps the merge logic auditable and avoids false positives.
Explained in full: Identity and tenancy
Personality dials
Six curated per-tenant tuning knobs (Tone, Greeting, Upsell, Cancellation tone, Honorific, Cross-sell) that control how the AI agent sounds, without exposing raw prompt editing. Explained in full: Personality dials
Price safety floor
The hard rule that the catalog-onboarding LLM is never
permitted to auto-fill a price field; rows with no price stay None and the
admin must type them manually before commit.
Explained in full: Catalog onboarding
Prompt-cache invariant
The architectural rule that the LLM system message must remain byte-identical across all tenants and all turns, so Anthropic's prompt cache can reuse it; per-tenant data flows only via the user template. Explained in full: Personality dials
pytest-xdist
The pytest parallelism plugin used in Ratiba's test suite; run
with -n4 to cut backend CI from ~18 minutes to ~4 minutes; capped to avoid
Postgres connection exhaustion.
Explained in full: Testing
Schema-per-tenant
Ratiba's multi-tenancy isolation model: each tenant gets a
dedicated tenant_<slug> Postgres schema, migrated independently, with no cross-
tenant foreign keys.
Explained in full: Identity and tenancy
Service relation graph
The per-tenant graph of complementary, alternative,
and sequential relations between services, inferred by the LLM at catalog import
time and used by the cross-sell engine to pick candidates.
Explained in full: Cross-sell
Session window
The 24-hour Meta platform window within which Instagram DM and Messenger DM sessions are active; outside this window, only template messages can be sent; SMS is the reminder fallback for customers outside the window. Explained in full: Channel substrate
Single-snapshot rollback
The 7-day rollback window powered by
catalog_imports.snapshot_jsonb, which stores the pre-commit services state so a
bad import can be undone from the dashboard in one click.
Explained in full: Catalog onboarding
SETNX mutex (per-thread)
The Redis SET NX EX lock acquired at the start of
each booking turn to serialise in-flight messages on the same thread; TTL 30s with
exponential backoff to a 10s ceiling; prevents duplicate STK pushes on double-tap.
Explained in full: Conversation FSM
stkpushquery
Daraja's one-shot status-poll endpoint; Ratiba fires it exactly once at t=60s after the STK push to reconcile a payment whose callback hasn't arrived, avoiding long-poll loops. Explained in full: Payments
Structured logs
JSON-formatted log events emitted via Python structlog throughout
the backend; every event carries tenant_id, thread_id, and a dot-namespaced event
key (e.g. fsm.state_transition, admin.fanout.complete).
Explained in full: Observability
Tenant context
The frozen TenantContext snapshot installed by the per-request
middleware into the current_tenant ContextVar; carries tenant_id, schema_name,
and vertical so every layer below can read it without passing it explicitly.
Explained in full: Identity and tenancy
Testcontainers
The Docker-based test-isolation library that spins up a fresh
Postgres + Redis for each test scenario; Ratiba uses the per-scenario fresh-tenant
fixture (test_tenant_<scenario_id>_<run_id>) with full Alembic + PostgresSaver.setup() + DROP CASCADE teardown.
Explained in full: Testing
thread_id
A ULID generated fresh for every booking conversation via the
conversation_threads pointer table; scopes Redis hot state and Postgres checkpoints
to a single booking thread, preventing cross-booking state leaks.
Explained in full: Conversation FSM
Tier-1 channel
A channel (WhatsApp or voice) where the customer's phone number is known from inbound metadata; identity is resolved immediately without asking the customer for their phone. Explained in full: Channel substrate
Tier-2 channel
A channel (web widget, Instagram DM, or Messenger DM) where the
customer lands as an anonymous session; phone is captured progressively via the
COLLECT_PHONE FSM state.
Explained in full: Channel substrate
Two-pool model
The dual-connection-pool design: a shared asyncpg pool for
public.* registry queries (fast, low-cardinality) plus per-tenant psycopg
micro-pools for tenant_<slug>.* operational data.
Explained in full: Identity and tenancy
Two-tier persistence
The FSM storage design: Redis holds hot state for the live turn (sub-millisecond reads/writes), and Postgres holds durable LangGraph checkpoints for replay, audit, and 90-day retention. Explained in full: Conversation FSM
User-template splice
The mechanism by which AnswerShaper inserts per-tenant
personality directives into {personality_directive} slots in the LLM user template,
keeping the system message unchanged (and therefore Anthropic-cache-eligible) across
all tenants.
Explained in full: Personality dials
Voice full-duplex
The turn-taking model of the voice channel where both the customer and the agent can speak concurrently; barge-in, backchannels, and WPM-adaptive speed are part of the full-duplex surface. Explained in full: Voice conversation
VoiceStreamEvent
The typed event bus used by the voice channel to stream intermediate FSM results to the TTS pipeline; an agentic seam for a future streaming backend (the FSM remains authoritative today). Explained in full: Voice conversation
WhatsApp Cloud API (Meta direct)
Ratiba's WhatsApp channel uses Meta's first-party
Cloud API rather than a BSP, saving €49/month; per-tenant credentials are
whatsapp_phone_number_id + whatsapp_access_token, plus a project-level
WHATSAPP_APP_SECRET for HMAC verification.
Explained in full: Channel substrate
WPM-adaptive speed
A voice-channel feature that adjusts the TTS playback rate based on the words-per-minute of the customer's speech; faster talkers get slightly faster responses, improving natural feel. Explained in full: Voice conversation
By area
Use this grouping to browse terms by domain. Each term links to its alphabetical definition above (which in turn deep-links to the canonical page).
Tenancy
Alembic (per-tenant) ·
COLLECT_PHONE ·
Cross-channel identity merge ·
current_tenant ContextVar ·
Keycloak ·
Per-tenant configurable thresholds ·
Phone-only deterministic identity ·
Schema-per-tenant ·
Tenant context ·
Two-pool model
Conversation & FSM
BookingState ·
Checkpoint ·
Cost ceiling ·
Intent classifier ·
LangGraph ·
LLMRouter ·
SETNX mutex (per-thread) ·
thread_id ·
Two-tier persistence
Channels
Capability flags ·
Channel adapter ·
Channel substrate ·
Channel-switch token ·
NotificationSink ·
Session window ·
Tier-1 channel ·
Tier-2 channel ·
WhatsApp Cloud API
Payments
Daily 3 AM EAT reaper ·
Daraja / Safaricom Daraja API ·
M-Pesa STK push ·
Payment FSM ·
payment_callbacks_unrouted ·
payment_routing ·
PesaPal ·
stkpushquery
Personalisation & cross-sell
Adjacency window · Answer shaper · Cross-sell · Dial audit · DialBundle · Ordered-pair reservation · Per-vertical YAML defaults · Personality dials · Price safety floor · Prompt-cache invariant · Service relation graph · User-template splice
Catalog onboarding
Anthropic Vision OCR · Catalog onboarding · Idempotent re-import · Price safety floor · Single-snapshot rollback
Knowledge
fetch_snippets() ·
Knowledge snippet ·
knowledge_gap_candidate ·
"No-RAG RAG"
Voice
Backchannel ·
Barge-in ·
Deepgram Nova-3 (STT) ·
ElevenLabs Multilingual v2 (TTS) ·
Full-duplex turn-taking ·
Listening-ack ·
LiveKit SIP bridge ·
Voice full-duplex ·
VoiceStreamEvent ·
WPM-adaptive speed
Admin & handoff
Admin orchestrator · Admin slash commands · Dual-channel admin rail · Handoff · Handoff briefing card · NL fallback
Operations
Daily 3 AM EAT reaper · Lean observability · Structured logs
Testing
DeepEval ·
DeepEval 4-tuple cache key ·
pytest-xdist ·
Testcontainers
Configuration
Configuration / env vars · Per-tenant configurable thresholds
Missing a term?
If a concept is missing from this glossary, the fix is:
- Add a definition entry (alphabetically) with a
*Explained in full:*link to the canonical page. - Add a row to
docs/_partials/_concept-registry.md. - Run
cd docusaurus/ratiba && npm run buildto verify no broken links.