Local dev runbook

Day-to-day operations cookbook for an already-bootstrapped Ratiba checkout. If this is your first time, start at Dev setup — that page walks the one-time install (clone, .env, npm install, pip install -e .[dev]). This page is the daily driver: bring the stack up, smoke a booking, take it down, and tail the right log when something looks off.

The shape mirrors what Adrian runs every morning. Taken in order, the steps below get an already-bootstrapped laptop from a cold start to "I can chat with the agent on WhatsApp."

Boot timeline

The diagram below shows the full start-up sequence from docker compose up to a passing smoke test. Use it to orient yourself when something stalls mid-boot.

Pre-flight

./scripts/pilot-preflight.sh

The script is the single source of truth for "is my laptop ready." It walks four gates in order and exits non-zero on the first failure, printing an actionable next step.

Gate	What it checks	If it fails
Environment variables	The 6 required keys (`WHATSAPP_APP_SECRET`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `AFRICAS_TALKING_API_KEY`, `LIVEKIT_API_KEY`, `LIVEKIT_API_SECRET`)	Open `.env`, set the missing key, re-run. Full env-var reference: Configuration.
Docker Compose health	Every service in `docker-compose.yml` is `healthy` (Postgres, Redis, Keycloak) or `running` (LiveKit; `network_mode: host`, no healthcheck)	Run the boot step below. If a container is unhealthy, see the Incidents runbook for per-service diagnosis.
Backend healthz	`GET http://localhost:8010/healthz` returns 200	Backend not yet started — proceed to the Backend bring-up step below.
Database migrations	The `tenants` table exists in `public` schema	`cd backend && /Users/soft4u/Development/ratiba/backend/.venv/bin/python -m alembic upgrade head`.

On a green run, every row is prefixed [OK]. Any [FAIL] row halts the script.
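The fail-fast behaviour described above can be pictured with a small Python sketch. This is not the contents of pilot-preflight.sh: the names `env_gate` and `run_gates`, the gate signature, and the hint wording are all hypothetical, and only the environment-variable gate is filled in.

```python
from typing import Callable, Mapping, Optional, Sequence, Tuple

# The six keys named in the Pre-flight table.
REQUIRED_KEYS = (
    "WHATSAPP_APP_SECRET",
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "AFRICAS_TALKING_API_KEY",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
)

# Hypothetical gate shape: return None on success, or a next-step hint on failure.
Gate = Callable[[], Optional[str]]


def env_gate(env: Mapping[str, str]) -> Gate:
    """Build a gate that fails when any required key is unset or empty."""
    def check() -> Optional[str]:
        missing = [key for key in REQUIRED_KEYS if not env.get(key)]
        if missing:
            return "set in .env: " + ", ".join(missing)
        return None
    return check


def run_gates(gates: Sequence[Tuple[str, Gate]]) -> int:
    """Run gates in order; stop at the first failure and return an exit code."""
    for name, gate in gates:
        hint = gate()
        if hint is not None:
            print(f"[FAIL] {name}: {hint}")
            return 1
        print(f"[OK] {name}")
    return 0
```

Calling `run_gates([("Environment variables", env_gate(os.environ))])` would print one row and return 0 or 1. Later gates never run once an earlier one fails, which is the point of the ordering.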

Boot the dependencies

docker compose up -d

Four containers come up from docker-compose.yml. Wait for all healthchecks to pass before you start the backend:

docker compose ps

Service	Container	Host port	Health probe	Purpose
postgres	`ratiba-postgres`	`5434`	`pg_isready -U ratiba`	Multi-tenant data (ADR-0002).
redis	`ratiba-redis`	`6381`	`redis-cli -a ... ping`	Hot FSM state + idempotency locks (ADR-0003).
keycloak	`ratiba-keycloak`	`8281`	`KC_HEALTH_ENABLED=true`	Tenant realms + admin OIDC (ADR-0001).
livekit	`ratiba-livekit`	`7890` (signal), `52000-52050` (UDP RTC)	none — `network_mode: host`	Voice channel SIP bridge (ADR-0006 voice).

Wait pattern (poll until everything is healthy):

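# -s slurps the output, so this works whether Compose prints one JSON array or one object per line
until docker compose ps --format json | jq -es 'flatten | [.[] | select(.Health != "healthy" and .Health != "")] | length == 0' >/dev/null; do
  sleep 2
done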

LiveKit will read as running (not healthy) because it has no healthcheck — that's expected.
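If you would rather do the same check from Python, here is a minimal sketch. It assumes the output of `docker compose ps --format json` is either a single JSON array or one JSON object per line (this appears to vary by Compose version), and that each entry carries `Service` and `Health` fields. The function names are hypothetical.

```python
import json


def parse_compose_ps(raw: str) -> list:
    """Accept either a JSON array or one JSON object per line."""
    raw = raw.strip()
    if not raw:
        return []
    if raw.startswith("["):
        return json.loads(raw)
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def not_ready(raw: str) -> list:
    """Names of services whose health status is something other than 'healthy'.

    Entries with an empty Health field (no healthcheck, e.g. LiveKit) are
    skipped, mirroring the jq filter in the wait pattern.
    """
    return [
        svc.get("Service", svc.get("Name", "?"))
        for svc in parse_compose_ps(raw)
        if svc.get("Health", "") not in ("", "healthy")
    ]
```

Feed it the captured stdout of the `ps` command and loop until the returned list is empty. The list also tells you which service is holding things up.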

Backend bring-up

The project pins Python 3.13. Always invoke the canonical venv directly: pyenv shims on macOS resolve to 3.12, and a handful of dependencies (keycloak-admin, livekit-agents) only build cleanly under 3.13. This is a pinned methodology lesson; see the project memory note feedback_test_venv_canonical.md.

cd backend
/Users/soft4u/Development/ratiba/backend/.venv/bin/python -m alembic upgrade head
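A quick way to confirm which interpreter you are actually running is a version guard like the sketch below. The helper names are hypothetical and nothing in the repo is assumed to ship this check; the only fact taken from the runbook is the 3.13 pin.

```python
import sys

PINNED = (3, 13)  # the version the runbook says the project pins


def is_pinned(version_info=sys.version_info) -> bool:
    """True when the interpreter's major.minor matches the pinned version."""
    return tuple(version_info[:2]) == PINNED


def describe(version_info=sys.version_info, executable=sys.executable) -> str:
    """One-line summary of the interpreter and whether it matches the pin."""
    major, minor = version_info[:2]
    status = "ok" if is_pinned(version_info) else "wrong interpreter"
    return f"{executable}: Python {major}.{minor} ({status})"
```

Running `print(describe())` through the canonical venv should report 3.13. Running it through a pyenv shim is a fast way to see the mismatch the paragraph above warns about.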

Then start the backend + worker:

./start-server.sh

start-server.sh auto-kills any stale process on :8010 before starting, so re-running it is always safe. It writes uvicorn output to backend/.uvicorn.log. To follow live:

tail -f backend/.uvicorn.log

Verify it's up:

curl http://localhost:8010/healthz
# {"status":"ok"}
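When scripting the bring-up, a bounded retry loop is friendlier than a single curl. The sketch below uses only the standard library; the URL comes from this page, while the function names, attempt count, and delay are assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

HEALTHZ_URL = "http://localhost:8010/healthz"  # from the runbook


def healthz_ok(url: str = HEALTHZ_URL, timeout: float = 2.0) -> bool:
    """Return True when the endpoint answers HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False


def wait_until(probe: Callable[[], bool], attempts: int = 30, delay: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

`wait_until(healthz_ok)` returns True as soon as the backend answers, or False after roughly 30 seconds. The `sleep` parameter is injectable so the loop can be exercised without real waiting.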

Frontend bring-up

cd frontend
npm install
./start-client.sh

The frontend is Next.js 14 + Tailwind v4 + shadcn/ui. First-install pulls ~600 MB; subsequent installs are cache-hot. start-client.sh mirrors start-server.sh — auto-kills :3010, writes to frontend/.next-dev.log.

Visit http://localhost:3010 — you land on the admin login (NextAuth → Keycloak).

Smoke a booking

This is the abbreviated ops version of First booking. Use it as a daily sanity check.

Open WhatsApp on your phone and message the Meta test number (configured per tenant via whatsapp_phone_number_id on public.tenants; see Onboard a tenant if no test tenant exists yet).
Send: Hi I want to book a haircut tomorrow at 3pm

Tail the backend log filtered to your thread:

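# replace <thread_id> with your thread's ID; the quotes stop the shell from reading < and > as redirections
tail -f backend/.uvicorn.log | grep "<thread_id>"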

Within ~2s the agent replies on WhatsApp with a confirmation prompt. Reply yes. The Daraja STK push fires; approve in the sandbox simulator at https://developer.safaricom.co.ke/test_credentials.

Final confirmation message arrives. Verify the row landed:

docker compose exec postgres psql -U ratiba ratiba \
  -c "SELECT id, status FROM tenant_<slug>.appointments ORDER BY created_at DESC LIMIT 1;"
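If you wrap this check in a script, it is worth validating the slug before splicing it into a schema name. The sketch below assumes slugs are lowercase letters, digits, and underscores (as in spa_pilot) and that the schema is named tenant_<slug> as in the query above. Both helper names are hypothetical.

```python
import re

# Assumed slug shape: lowercase letters, digits, underscores (e.g. "spa_pilot").
_SLUG_RE = re.compile(r"[a-z][a-z0-9_]*")


def latest_appointment_sql(slug: str) -> str:
    """Build the verification query for one tenant schema."""
    if not _SLUG_RE.fullmatch(slug):
        raise ValueError(f"unexpected tenant slug: {slug!r}")
    return (
        f"SELECT id, status FROM tenant_{slug}.appointments "
        "ORDER BY created_at DESC LIMIT 1;"
    )


def psql_command(slug: str) -> list:
    """argv for the docker compose exec call shown above."""
    return [
        "docker", "compose", "exec", "postgres",
        "psql", "-U", "ratiba", "ratiba",
        "-c", latest_appointment_sql(slug),
    ]
```

Passing the list to `subprocess.run(psql_command("spa_pilot"), check=True)` avoids shell quoting entirely, since no shell is involved.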

If anything fails, see the Incidents runbook for the failure-mode-to-fix table. The full walkthrough with FSM-state-by-state log expectations lives at First booking.

Live-stack chat QA smoke test

For a broader sanity check than a single booking, scripts/qa/live_chat_qa.py drives a real headless Chromium against the running stack — two browser surfaces (customer widget + admin dashboard), desktop and mobile — and asserts the full conversational chat UX end-to-end. It is a manual operational harness, not part of the pytest suite: it needs the live dev stack up (backend :8010, frontend :3010) and a seeded spa_pilot tenant.

It covers 22 checks across the chat surface:

Area	Checks
Customer widget	Bootstrap + per-tenant branding; service list renders as real bullet markers; tappable quick-reply chips (staff / slot / cross-sell); an under-specified service request (e.g. "book a massage") returns a disambiguation menu + chips instead of a handoff; a full booking through to the confirmation recap; `KES n,nnn` money format
Admin dashboard	Keycloak login; business-name branding (not "Ratiba"); `/help` + numbered `/today` (no raw UUIDs) + `KES` money
Mobile (390 px)	No horizontal overflow on the widget and the dashboard / chat / calendar / team pages

Run it with a Playwright-equipped interpreter — the backend venv does not ship Playwright:

# install once into whatever interpreter you use for QA:
#   pip install playwright && playwright install chromium
python scripts/qa/live_chat_qa.py

Exit code is 0 only when all 22 checks pass; per-check [PASS]/[FAIL] lines print to stdout and screenshots land in $RATIBA_QA_SHOTS (default /tmp/ratiba_qa_full) for visual review. URLs and admin credentials are environment-overridable (WIDGET_URL, ADMIN_URL, ADMIN_USER, ADMIN_PASS, RATIBA_QA_SHOTS) so the script survives port or credential changes without edits.
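If you capture the harness output (for example in a wrapper script), the per-check lines can be tallied as sketched below. The exact line format is an assumption: the sketch expects each check to print one line that starts with [PASS] or [FAIL]. The function names are hypothetical and not part of live_chat_qa.py.

```python
import re

EXPECTED_CHECKS = 22  # the count quoted in this runbook

# Assumed line shape: the marker sits at the very start of the line.
_MARKER_RE = re.compile(r"^\[(PASS|FAIL)\]", re.MULTILINE)


def tally(stdout: str) -> dict:
    """Count [PASS] and [FAIL] lines in captured output."""
    counts = {"PASS": 0, "FAIL": 0}
    for marker in _MARKER_RE.findall(stdout):
        counts[marker] += 1
    return counts


def all_green(stdout: str, expected: int = EXPECTED_CHECKS) -> bool:
    """True only when every expected check passed and none failed."""
    counts = tally(stdout)
    return counts["FAIL"] == 0 and counts["PASS"] == expected
```

The script's exit code remains the authoritative signal. A tally like this is mainly useful for noticing a run that stopped early and printed fewer than 22 lines.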

Daily-driver workflow

Beyond a single smoke booking, here's how the stack fits into a typical development session:

Morning boot — docker compose up -d then ./scripts/pilot-preflight.sh. If anything is red, fix before writing code.
Iterate on backend code — start-server.sh picks up file changes via uvicorn --reload. No restart needed for Python changes.
Iterate on frontend code — Next.js dev server hot-reloads on save. If Tailwind classes go stale after a config change, delete frontend/.next/ and restart start-client.sh.
Run tests for a changed module — see Testing runbook for the exact invocation. Short form: cd backend && /Users/soft4u/Development/ratiba/backend/.venv/bin/python -m pytest tests/path/to/test_file.py -v. Always use the canonical venv, always from backend/.
Add a knowledge snippet for the test tenant — cd backend && /Users/soft4u/Development/ratiba/backend/.venv/bin/python scripts/seed_knowledge.py --tenant spa_pilot. See Seed data for the full options.
Tail logs — see the Tail-log shortcuts below for per-service commands; see Observability for structured log field reference and jq filter patterns.
End of day — docker compose down to keep data, or docker compose down -v for a clean slate. Ctrl-C the two start-*.sh terminals.

Tail-log shortcuts

For full log-reading guidance — structured field reference, jq filter recipes, and the daily digest format — see Observability. Quick reference:

Service	Tail command
Backend (FastAPI + worker)	`tail -f backend/.uvicorn.log`
Frontend (Next.js dev)	`tail -f frontend/.next-dev.log`
Postgres	`docker compose logs -f postgres`
Redis	`docker compose logs -f redis`
Keycloak	`docker compose logs -f keycloak`
LiveKit	`docker compose logs -f livekit`
All compose services interleaved	`docker compose logs -f`

For per-thread FSM tracing: every booking gets a ULID thread_id minted at the first inbound message. Grep for it across the backend log to see the full conversation turn-by-turn:

grep "thread=01HXY..." backend/.uvicorn.log
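The same filter in Python, for when you want to post-process the lines rather than just read them. A ULID is 26 characters of Crockford base32 (the letters I, L, O, and U are excluded), which makes a mistyped ID cheap to catch up front. The `thread=<id>` token form follows the grep above; the function names are hypothetical.

```python
import re

# ULIDs are 26 characters of Crockford base32 (no I, L, O or U).
_ULID_RE = re.compile(r"[0-9A-HJKMNP-TV-Z]{26}")


def is_ulid(value: str) -> bool:
    """True when the value has the shape of a ULID."""
    return _ULID_RE.fullmatch(value) is not None


def lines_for_thread(lines, thread_id: str) -> list:
    """Keep only log lines carrying thread=<thread_id> (token form assumed)."""
    if not is_ulid(thread_id):
        raise ValueError(f"not a ULID: {thread_id!r}")
    token = f"thread={thread_id}"
    return [line for line in lines if token in line]
```

For example, `lines_for_thread(open("backend/.uvicorn.log"), thread_id)` returns that conversation's lines in file order. Unlike the grep with a truncated prefix, this requires the full 26-character ID.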

To watch Redis traffic while a booking is in flight:

docker compose exec redis redis-cli -a ratiba_redis_password MONITOR

Tear-down

End-of-day, two options:

# Keep data — fast restart tomorrow
docker compose down

# Drop volumes — fresh database next boot. WARNING: drops all tenants, appointments, payments.
docker compose down -v

Use -v only when you actively want a clean slate (e.g., before running the full integration test suite, or after a schema migration that left mixed state). For day-to-day work, plain down keeps your test tenants and onboarded catalog intact.

The processes started by start-server.sh and start-client.sh are foreground processes in their terminals — Ctrl-C stops them. No background cleanup needed.

Common slowdowns

A few patterns that will save you minutes the next time:

Keycloak takes 20-30s to become healthy on first boot. The delay comes from realm import plus JIT compilation. pilot-preflight.sh accounts for it, so its gates report correctly. If you script the boot yourself, poll docker compose ps for health rather than using a fixed sleep.
LiveKit logs ICE warnings on macOS. Expected — network_mode: host plus loopback ICE candidates produce noisy STUN logs. The voice channel still works; ignore the warnings unless an actual SIP register fails.
Run alembic upgrade head from backend/, not from the repo root. Alembic resolves alembic.ini relative to its CWD. From the wrong directory you get FAILED: No 'script_location' key found.
First-time npm install looks frozen for 60-90s. It's resolving the shadcn + Tailwind v4 dependency graph; subsequent installs are cache-hot.
knowledge_overflow WARN in backend logs. The per-tenant knowledge snippets table has grown past the ~20-snippet Phase-0 cap (ADR-0013). Deactivate low-priority snippets or promote to Phase-1 pgvector retrieval. See the Incidents runbook for the exact fix command.
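The Alembic working-directory pitfall above is easy to guard against in a wrapper script. A minimal sketch, assuming alembic.ini lives directly under backend/ as this page implies; the helper names are hypothetical.

```python
from pathlib import Path


def has_alembic_ini(directory: Path) -> bool:
    """True when alembic.ini sits directly in the given directory."""
    return (directory / "alembic.ini").is_file()


def alembic_cwd(repo_root: Path) -> Path:
    """Return the directory to run Alembic from, or raise with a clear hint."""
    backend = repo_root / "backend"
    if has_alembic_ini(backend):
        return backend
    raise FileNotFoundError(f"no alembic.ini in {backend}; is this the repo root?")
```

A wrapper can then pass `cwd=alembic_cwd(repo_root)` to `subprocess.run`, so the migration command behaves the same no matter where it is launched from.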

What next

Pilot deployment runbook — the script for the M12 W5 pilot validation pass and M13 real-tenant onboarding.
Incidents runbook — failure modes table with diagnosis + copy-pastable fixes.
Testing runbook — running the full backend pytest + frontend Vitest suites, DeepEval calibration, and CI gate matrix.
Observability — structured log fields, jq filters, and the daily WhatsApp digest.
Quickstart / First booking — full walkthrough with FSM-state-by-state log expectations.
