Glossary

Ratiba's docs are DRY — each concept is explained in full in exactly one place. This glossary defines every term and links you to that one canonical explanation. If you find a concept explained twice, the second explanation is a bug: fix it with a one-line gloss + a deep link.

A–Z definitions

Admin orchestrator

The shallow 4-state FSM (IDLE ↔ ENGAGED) through which a tenant owner reclaims a customer conversation when the AI loses confidence; exposes 11 slash commands and a natural-language fallback. Explained in full: Admin orchestrator

Admin slash commands

11 commands the tenant owner can send via dashboard or WhatsApp to view today's bookings, cancel/reschedule, hand back to the customer, or mutate the catalog (/add-service, /update-price, etc.). Explained in full: Admin orchestrator

Adjacency window

The ±15-minute window around primary.end_at within which a cross-sell offer is eligible; only post-primary slots are offered (not pre-primary). Explained in full: Cross-sell

Alembic (per-tenant)

Each tenant schema is migrated independently via its own Alembic invocation, so tenant migrations are isolated from the public registry migrations and from each other. Explained in full: Identity and tenancy

Answer shaper / `answer_shaper`

The service (app/services/answer_shaper.py) that splices per-tenant personality directives and knowledge snippets into the LLM user template before the LLMRouter fires; the system message stays byte-identical across tenants to preserve Anthropic prompt-cache eligibility. Explained in full: Personality dials

Anthropic Vision OCR

The image/PDF extraction path in the catalog-onboarding pipeline; Anthropic Vision is called via the LLMRouter vision role to read service menu images and PDFs into structured ExtractedRow objects. Explained in full: Catalog onboarding

Backchannel

Short acoustic acknowledgements ("mmh", "sawa") the voice agent emits while the customer is still speaking, signalling active listening without interrupting the turn. Explained in full: Voice conversation

Barge-in

The ability for the customer to speak and interrupt TTS playback mid-sentence; the voice agent detects the interruption and yields the floor immediately without waiting for the audio to finish. Explained in full: Voice conversation

`BookingState`

The Pydantic v2 state shape (extra="forbid") that flows through every LangGraph node in the booking FSM; versioned via FSM_GRAPH_VERSION and back-filled by a migration shim on hydration. Explained in full: Conversation FSM

Capability flags

Five per-channel flags (tier, identity_anchor, session_window_seconds, supports_payment_inline, supports_rich_media) declared by each channel adapter; the substrate routes runtime behaviour off these flags without branching the FSM. Explained in full: Channel substrate

Catalog onboarding

The pipeline that lets tenants self-onboard their service catalog from an image, PDF, CSV, or text; includes Vision OCR, LLM gap-fill, relation inference, a human review step, and an atomic idempotent commit. Explained in full: Catalog onboarding

Channel adapter

A concrete implementation of the Channel protocol that handles one surface (WhatsApp, voice, web widget, Instagram DM, or Messenger DM); translates channel-specific I/O into the uniform dispatch_inbound_message seam. Explained in full: Channel substrate

Channel substrate

The Channel protocol + ChannelKind enum + Tier enum at app/channels/_base/; makes the five customer-facing surfaces look identical to the conversation FSM. Explained in full: Channel substrate

Channel-switch token

A single-use 24-hour token that lets a customer resume a conversation on a different channel (e.g. web widget → WhatsApp); opt-in per tenant. Explained in full: Channel substrate

Checkpoint

A durable LangGraph state snapshot persisted to checkpoints_<slug>.checkpoints in Postgres; the warm tier of the two-tier FSM persistence (Redis = hot, Postgres = warm). Explained in full: Conversation FSM

COLLECT_PHONE

The FSM entry-state that Tier-2 channel sessions (web, Instagram, Messenger) pass through on first contact; the agent asks for the customer's phone, binds it, and merges the session into any existing customer record. Explained in full: Identity and tenancy

Configuration / env vars

The full set of environment variables and Settings Pydantic model that control every tunable in Ratiba (ports, credentials, feature flags, per-tenant thresholds). Explained in full: Configuration

Cost ceiling

The $0.05 soft / $0.20 hard per-booking LLM spend cap (per-tenant configurable), tracked via total_token_cost_usd on BookingState; one of three signals that can trigger a handoff. Explained in full: Conversation FSM

Cross-channel identity merge

When a customer who previously used WhatsApp later opens the web widget and binds the same phone, the resolver updates the session to reference the existing customer_id — no duplicate customer record is created. Explained in full: Identity and tenancy

Cross-sell

A slot-aware offer of a complementary service immediately after the primary booking is confirmed; fires only when an adjacent post-primary slot exists within ±15 minutes and the tenant's cross_sell dial is not never. Explained in full: Cross-sell

`current_tenant` ContextVar

The asyncio ContextVar at app/tenancy/context.py::current_tenant that propagates tenant identity through async call stacks; get_tenant_session() raises loudly if unset, never silently falling back to public. Explained in full: Identity and tenancy

Daily 3 AM EAT reaper

A scheduled worker that sweeps three tables every day: expired rows in public.payment_routing, rows older than 90 days in <tenant>.checkpoints_archive, and rows older than 90 days in <tenant>.handoff_log_archive. Explained in full: Payments

Daraja / Safaricom Daraja API

Safaricom's REST API for M-Pesa integrations; Ratiba uses it for STK push (customer-initiated payment prompt) and one-shot stkpushquery reconciliation at t=60s. Explained in full: Payments

DeepEval

The primary eval runner for Ratiba's AI quality gate; runs LLM-as-judge metrics against transcript replays using a 4-tuple cache key (scenario_id, prompt_version, judge_model_version, metric_version). Explained in full: Testing

DeepEval 4-tuple cache key

The cache key (scenario_id, prompt_version, judge_model_version, metric_version) used to avoid re-running expensive LLM-judge evaluations when neither the prompt nor the judge model has changed. Explained in full: Testing

Deepgram Nova-3 (STT)

The streaming speech-to-text provider used by the voice channel; supports Swahili and English language detection; emits interim and final transcripts consumed by the end-of-turn detector and the barge-in / backchannel primitives. Explained in full: Voice conversation

Dial audit

A row written to <tenant>.dial_audits on every personality-dial change, recording dial, before JSONB, and after JSONB; supports support-tooling queries and rollback. Explained in full: Personality dials

DialBundle

A frozen Pydantic dataclass carrying the resolved 6-dial values for a tenant turn; to_template_vars() renders it into the {personality_directive} slot variables that the answer shaper splices into the user template. Explained in full: Personality dials

Dual-channel admin rail

The two surfaces through which a tenant owner operates the admin orchestrator: a dashboard WebSocket (pushed via Postgres LISTEN/NOTIFY) and optional WhatsApp inbound. Explained in full: Admin orchestrator

ElevenLabs Multilingual v2 (TTS)

The text-to-speech provider used by the voice channel; renders the FSM's reply text into audio delivered over the LiveKit SIP session. Explained in full: Voice conversation

`fetch_snippets()`

The function in app/services/knowledge.py that retrieves matching knowledge snippets for a given intent; the category→intent routing is: services→{service, general}, hours→{hours, general}, other→all. Explained in full: Knowledge answers

Full-duplex turn-taking

The voice-channel turn model where both the customer and the agent can be "in flight" simultaneously; managed by barge-in detection, backchannel filtering, and WPM-adaptive TTS speed so neither side blocks the other. Explained in full: Voice conversation

Handoff

The mechanism by which the AI agent cedes control of a conversation to the human tenant owner; triggered by one of five signals (d1–d5) and delivered via the admin rail (dashboard WS + optional WhatsApp briefing card). Explained in full: Admin orchestrator

Handoff briefing card

The structured JSONB payload (schema locked in ADR-0006) delivered to the admin rail when a handoff fires; contains the verbatim customer transcript, trigger signal, and an on-demand translation button. Explained in full: Admin orchestrator

Idempotent re-import

A re-upload of the same catalog matches existing services by LOWER(name) and updates rather than duplicating; the catalog_imports.type column distinguishes 'initial' from 're-import'. Explained in full: Catalog onboarding

Intent classifier

The single bilingual (EN + SW) prompt that routes each inbound message to the correct LangGraph graph (booking, cancel, reschedule, admin); uses tenant-locale fallback when language confidence is low. Explained in full: Conversation FSM

Keycloak

The open-source identity provider Ratiba uses for tenant realms, admin phone-OTP authentication, and OIDC tokens for the dashboard frontend. Explained in full: Identity and tenancy

Knowledge snippet

A short, hand-seeded text block in the <tenant>.knowledge_snippets table (category, title, body, language, is_active); injected into the LLM answer-shaper user template to answer tenant-specific questions without retrieval (Phase 0 "no-RAG RAG"). Explained in full: Knowledge answers

`knowledge_gap_candidate`

A structured log event emitted when the agent deflects a question that matched no knowledge snippet; used as the observable trigger for the Phase-0 → Phase-1 graduation decision (too many gaps → add real retrieval). Explained in full: Knowledge answers

LangGraph

The graph-execution framework (from LangChain) that runs Ratiba's booking, cancel, reschedule, and admin state machines; each node is a Python async function, and the checkpointer persists state to Postgres. Explained in full: Conversation FSM

Lean observability

Ratiba's current-reality monitoring posture: docker logs, file tailing, structured structlog JSON output, and a daily WhatsApp digest — no Loki, no Datadog, no Sentry until a paying production deployment justifies the cost. Explained in full: Observability

Listening-ack

A brief audio token ("I'm listening") the voice agent plays immediately after the customer's turn ends and before the LLM response is ready, reducing perceived latency. Explained in full: Voice conversation

LiveKit SIP bridge

The telephony infrastructure that receives inbound phone calls as SIP legs, creates LiveKit rooms, and hands audio frames to the voice channel adapter. Explained in full: Voice conversation

LLMRouter

The abstraction (app/llm/router.py) that routes LLM calls to the correct provider + model based on the call's role (e.g. answer_shaper → Claude Haiku; vision → Anthropic Vision; narrow classifier → GPT-4.1 mini), enforcing the cost ceiling. Explained in full: Conversation FSM

M-Pesa STK push

The Safaricom "SIM Toolkit push" that sends a payment prompt directly to the customer's phone screen; the customer enters their M-Pesa PIN to approve, and Daraja calls Ratiba's webhook with the result. Explained in full: Payments

NL fallback

The natural-language branch of the admin message router; classifies free-form admin text via LLMRouter into one of three confidence bands: ≥0.9 execute, 0.7–0.9 ask for confirmation, <0.7 graceful unknown. Explained in full: Admin orchestrator

"No-RAG RAG"

The Phase 0 knowledge-answer approach: inject hand-seeded snippets into the LLM prompt instead of running vector retrieval; real embeddings + pgvector are YAGNI until a paying tenant's knowledge base outgrows the prompt context window. Explained in full: Knowledge answers

`NotificationSink`

A one-way outbound SMS surface (Africa's Talking) used only for reminder fallback when a Tier-2 customer is outside their 24-hour Meta session window; not an inbound channel. Explained in full: Channel substrate

Ordered-pair reservation

A Redis EVAL Lua script that atomically acquires SETNX locks on both a primary and an adjacent secondary slot in a single round-trip; on failure it rolls back the primary acquisition before returning. Explained in full: Cross-sell

Payment FSM

The separate state machine (PaymentState) that runs after the booking FSM reaches BOOKED; has a first-class PAYMENT_CANCELLED_BY_CUSTOMER state with provider-specific reversal (Daraja: STK timeout; PesaPal: active void). Explained in full: Payments

`payment_callbacks_unrouted`

The dead-letter table in public that catches Daraja or PesaPal callbacks that arrive after their correlation row has expired from payment_routing; requires manual triage. Explained in full: Payments

`payment_routing`

The public.payment_routing correlation table that maps a (checkout_request_id, tenant_id, payment_id) triple so the webhook handler can route a Daraja or PesaPal callback into the correct per-tenant payment record. Explained in full: Payments

Per-tenant configurable thresholds

Settings stored in per-tenant JSONB columns (handoff_thresholds, cost ceiling overrides, etc.) that let individual tenants override the platform defaults without a code change. Explained in full: Configuration

Per-vertical YAML defaults

The 8-vertical × 6-dial block in app/prompts/personality_defaults.yaml that gives every new tenant sensible personality defaults for their sector (dental, spa, barbershop, etc.) without any manual configuration. Explained in full: Personality dials

PesaPal

The card-payments gateway Ratiba routes to for non-M-Pesa payments; never used for M-Pesa (cost discipline: Daraja-direct only). Explained in full: Payments

Phone-only deterministic identity

Ratiba matches customers across channels exclusively by phone_e164; no probabilistic name or device-fingerprint matching is used, which keeps the merge logic auditable and avoids false positives. Explained in full: Identity and tenancy

Personality dials

Six curated per-tenant tuning knobs (Tone, Greeting, Upsell, Cancellation tone, Honorific, Cross-sell) that control how the AI agent sounds, without exposing raw prompt editing. Explained in full: Personality dials

Price safety floor

The hard rule that the catalog-onboarding LLM is never permitted to auto-fill a price field; rows with no price stay None and the admin must type them manually before commit. Explained in full: Catalog onboarding

Prompt-cache invariant

The architectural rule that the LLM system message must remain byte-identical across all tenants and all turns, so Anthropic's prompt cache can reuse it; per-tenant data flows only via the user template. Explained in full: Personality dials

`pytest-xdist`

The pytest parallelism plugin used in Ratiba's test suite; run with -n4 to cut backend CI from ~18 minutes to ~4 minutes; capped to avoid Postgres connection exhaustion. Explained in full: Testing

Schema-per-tenant

Ratiba's multi-tenancy isolation model: each tenant gets a dedicated tenant_<slug> Postgres schema, migrated independently, with no cross- tenant foreign keys. Explained in full: Identity and tenancy

Service relation graph

The per-tenant graph of complementary, alternative, and sequential relations between services, inferred by the LLM at catalog import time and used by the cross-sell engine to pick candidates. Explained in full: Cross-sell

Session window

The 24-hour Meta platform window within which Instagram DM and Messenger DM sessions are active; outside this window, only template messages can be sent; SMS is the reminder fallback for customers outside the window. Explained in full: Channel substrate

Single-snapshot rollback

The 7-day rollback window powered by catalog_imports.snapshot_jsonb, which stores the pre-commit services state so a bad import can be undone from the dashboard in one click. Explained in full: Catalog onboarding

SETNX mutex (per-thread)

The Redis SET NX EX lock acquired at the start of each booking turn to serialise in-flight messages on the same thread; TTL 30s with exponential backoff to a 10s ceiling; prevents duplicate STK pushes on double-tap. Explained in full: Conversation FSM

`stkpushquery`

Daraja's one-shot status-poll endpoint; Ratiba fires it exactly once at t=60s after the STK push to reconcile a payment whose callback hasn't arrived, avoiding long-poll loops. Explained in full: Payments

Structured logs

JSON-formatted log events emitted via Python structlog throughout the backend; every event carries tenant_id, thread_id, and a dot-namespaced event key (e.g. fsm.state_transition, admin.fanout.complete). Explained in full: Observability

Tenant context

The frozen TenantContext snapshot installed by the per-request middleware into the current_tenant ContextVar; carries tenant_id, schema_name, and vertical so every layer below can read it without passing it explicitly. Explained in full: Identity and tenancy

Testcontainers

The Docker-based test-isolation library that spins up a fresh Postgres + Redis for each test scenario; Ratiba uses the per-scenario fresh-tenant fixture (test_tenant_<scenario_id>_<run_id>) with full Alembic + PostgresSaver.setup() + DROP CASCADE teardown. Explained in full: Testing

`thread_id`

A ULID generated fresh for every booking conversation via the conversation_threads pointer table; scopes Redis hot state and Postgres checkpoints to a single booking thread, preventing cross-booking state leaks. Explained in full: Conversation FSM

Tier-1 channel

A channel (WhatsApp or voice) where the customer's phone number is known from inbound metadata; identity is resolved immediately without asking the customer for their phone. Explained in full: Channel substrate

Tier-2 channel

A channel (web widget, Instagram DM, or Messenger DM) where the customer lands as an anonymous session; phone is captured progressively via the COLLECT_PHONE FSM state. Explained in full: Channel substrate

Two-pool model

The dual-connection-pool design: a shared asyncpg pool for public.* registry queries (fast, low-cardinality) plus per-tenant psycopg micro-pools for tenant_<slug>.* operational data. Explained in full: Identity and tenancy

Two-tier persistence

The FSM storage design: Redis holds hot state for the live turn (sub-millisecond reads/writes), and Postgres holds durable LangGraph checkpoints for replay, audit, and 90-day retention. Explained in full: Conversation FSM

User-template splice

The mechanism by which AnswerShaper inserts per-tenant personality directives into {personality_directive} slots in the LLM user template, keeping the system message unchanged (and therefore Anthropic-cache-eligible) across all tenants. Explained in full: Personality dials

Voice full-duplex

The turn-taking model of the voice channel where both the customer and the agent can speak concurrently; barge-in, backchannels, and WPM-adaptive speed are part of the full-duplex surface. Explained in full: Voice conversation

`VoiceStreamEvent`

The typed event bus used by the voice channel to stream intermediate FSM results to the TTS pipeline; an agentic seam for a future streaming backend (the FSM remains authoritative today). Explained in full: Voice conversation

WhatsApp Cloud API (Meta direct)

Ratiba's WhatsApp channel uses Meta's first-party Cloud API rather than a BSP, saving €49/month; per-tenant credentials are whatsapp_phone_number_id + whatsapp_access_token, plus a project-level WHATSAPP_APP_SECRET for HMAC verification. Explained in full: Channel substrate

WPM-adaptive speed

A voice-channel feature that adjusts the TTS playback rate based on the words-per-minute of the customer's speech; faster talkers get slightly faster responses, improving natural feel. Explained in full: Voice conversation

By area

Use this grouping to browse terms by domain. Each term links to its alphabetical definition above (which in turn deep-links to the canonical page).

Missing a term?

If a concept is missing from this glossary, the fix is:

Add a definition entry (alphabetically) with a *Explained in full:* link to the canonical page.
Add a row to docs/_partials/_concept-registry.md.
Run cd docusaurus/ratiba && npm run build to verify no broken links.

A–Z definitions​

Admin orchestrator​

Admin slash commands​

Adjacency window​

Alembic (per-tenant)​

Answer shaper / answer_shaper​

Anthropic Vision OCR​

Backchannel​

Barge-in​

BookingState​

Capability flags​

Catalog onboarding​

Channel adapter​

Channel substrate​

Channel-switch token​

Checkpoint​

COLLECT_PHONE​

Configuration / env vars​

Cost ceiling​

Cross-channel identity merge​

Cross-sell​

current_tenant ContextVar​

Daily 3 AM EAT reaper​

Daraja / Safaricom Daraja API​

DeepEval​

DeepEval 4-tuple cache key​

Deepgram Nova-3 (STT)​

Dial audit​

DialBundle​

Dual-channel admin rail​

ElevenLabs Multilingual v2 (TTS)​

fetch_snippets()​

Full-duplex turn-taking​

Handoff​

Handoff briefing card​

Idempotent re-import​

Intent classifier​

Keycloak​

Knowledge snippet​

knowledge_gap_candidate​

LangGraph​

Lean observability​

Listening-ack​

LiveKit SIP bridge​

LLMRouter​

M-Pesa STK push​

NL fallback​

"No-RAG RAG"​

NotificationSink​

Ordered-pair reservation​

Payment FSM​

payment_callbacks_unrouted​

payment_routing​

Per-tenant configurable thresholds​

Per-vertical YAML defaults​

PesaPal​

Phone-only deterministic identity​

Personality dials​

Price safety floor​

Prompt-cache invariant​

pytest-xdist​

Schema-per-tenant​

Service relation graph​

Session window​

Single-snapshot rollback​

SETNX mutex (per-thread)​

stkpushquery​

Structured logs​

Tenant context​

Testcontainers​

thread_id​

Tier-1 channel​

Tier-2 channel​

Two-pool model​

Two-tier persistence​

User-template splice​

Voice full-duplex​

VoiceStreamEvent​

WhatsApp Cloud API (Meta direct)​

WPM-adaptive speed​