Personality dials
What it does
Tenants tune how the agent sounds — without rewriting prompts. Per ADR-0010 D2–D4 + D9, six curated dials cover the entire surface area exposed in v1: Tone (warm / professional / playful), Greeting (default-bilingual / custom-string ≤ 140 chars), Upsell (never / suggest-once-after-confirm / suggest-during-slot-collection), Cancellation tone (forgiving / neutral / firm), Honorific (first-name / formal-en / formal-sw), and Cross-sell (never / related-only / full-suggest).
Defaults are vertical-aware. Eight verticals each carry an opinionated
6-dial block in app/prompts/personality_defaults.yaml. A dental clinic
gets professional + formal-sw + never-upsell + firm-cancellation
out of the box; a barbershop gets playful + first-name +
suggest-once-after-confirm. Tenants override per-dial via the singleton
row in <tenant>.tenant_personality_config; NULL columns inherit the
vertical default. Every change writes a <tenant>.dial_audits row with
before / after JSONB.
How it fits in the system
The hardest part of the architecture is the negative edge: there is no
path from DialBundle into the system message. The tenant_personality
kwarg on LLMRouter.complete is observability metadata only — by the time
messages reaches the router, the per-tenant directives are already baked
into the user template, and the system message is the same string for every
tenant on the platform.
How it flows
The merge is per-dial, not whole-row. A tenant who overrides only tone
gets the YAML defaults for the other five dials — so an admin changing one
dial never accidentally clobbers the vertical's other defaults. NULL means
"inherit", and INHERIT_SENTINEL = {"inherit": "vertical_default"} lands in
the audit after column when an admin actively resets a dial back to the
default (so support tooling can distinguish "never set" from "explicitly
reset" without joining the YAML).
Dial reference
All six dials with their locked v1 value sets and the behavioural effect each value produces:
| Dial | Values | Default (dental) | Effect |
|---|---|---|---|
| Tone | warm / professional / playful | professional | Controls the register of every agent utterance; playful enables contractions and emoji; warm adds empathic openers |
| Greeting | default-bilingual / custom-<string> | default-bilingual | Opening message of the first turn; custom string is ≤ 140 chars, tenant-authored, injected verbatim |
| Upsell | never / suggest-once-after-confirm / suggest-during-slot-collection | never | Controls when the agent may suggest a higher-value service; also gates cross-sell (see Cross-sell) |
| Cancellation tone | forgiving / neutral / firm | firm | Shapes the "I'm sorry to hear you want to cancel" branch; firm reminds about the policy upfront |
| Honorific | first-name / formal-en / formal-sw | formal-sw | How the agent addresses the customer; formal-sw uses "Bwana/Bibi" + surname; formal-en uses "Mr/Ms" + surname |
| Cross-sell | never / related-only / full-suggest | never | Whether a complementary-service offer fires post-confirm; related-only = complementary edges only; full-suggest reserved (behaves as related-only in v1) |
Per-vertical YAML defaults
Every new tenant inherits a vertical-appropriate baseline from
app/prompts/personality_defaults.yaml. Representative comparison:
| Vertical | Tone | Upsell | Honorific | Cross-sell | Cancellation |
|---|---|---|---|---|---|
| dental | professional | never | formal-sw | never | firm |
| medical | professional | never | formal-sw | never | firm |
| legal | professional | never | formal-en | never | firm |
| physio | warm | never | formal-en | never | neutral |
| spa | warm | suggest-once-after-confirm | first-name | related-only | forgiving |
| salon | warm | suggest-once-after-confirm | first-name | related-only | neutral |
| barbershop | playful | suggest-once-after-confirm | first-name | related-only | forgiving |
| tutoring | warm | never | formal-en | never | neutral |
Clinical verticals (dental, medical, legal) default to the strictest settings — no upsell, formal address, firm cancellation — because a patient being pushed to add a service feels ethically problematic. Lifestyle verticals (spa, salon, barbershop) can afford the softer, more commercial register. Tenants override any dial that doesn't fit their specific practice.
DialBundle
DialBundle is a frozen Pydantic dataclass carrying the six resolved
dial values for a single turn. Its API surface:
DialBundle.to_template_vars() → dict[str, str]— renders each dial into a natural-language directive sentence. The output dict keys arepersonality_directive,greeting_directive,upsell_directive,cancellation_directive,honorific_directive, andcross_sell_directive.- Each value is a short English sentence (e.g.
"Address the customer warmly as Bwana or Bibi [surname].") that the answer_shaper splices directly into the{personality_directive}slot in the user template. DialBundlefields areLiteral-typed and validated bypydantic.TypeAdapteratfor_tenant()call time. Drift between a manually-edited YAML and the locked ADR-0010 D3 enum raisesInvalidDialValueErrorat the first post-edit turn — not as a malformed prompt several turns later.
Dial audit
Every dial change — whether via the admin API or a direct DB write — writes
a row to <tenant>.dial_audits with:
dial: which of the six dials changedbefore: previous value JSONB (or NULL on first set)after: new value JSONB, or{"inherit": "vertical_default"}if the admin actively reset to default
The sentinel value lets support tooling distinguish "this tenant never configured this dial" from "this tenant explicitly reset it to the default" without joining the YAML.
answer_shaper
AnswerShaper.shape_answer() (app/services/answer_shaper.py) is the
service that assembles the final messages list for every LLM call across
all inquiry and booking reply paths. It holds two responsibilities:
- Splice per-tenant personality directives from
DialBundle.to_template_vars()into the{personality_directive}(and sibling) slots in the user template — so the agent sounds like the tenant configured. - Splice knowledge snippets from
raw_data["knowledge"](Phase 0) into the same user template — so the agent answers tenant-specific questions from hand-seeded KB rows without a retrieval system.
Both splice operations land in the user message only. The system message
is assembled once from answer_shaper.yaml and is never touched by either
operation. This is the mechanism by which the
prompt-cache invariant is preserved.
shape_answer() is the only caller of LLMRouter.complete() for the answer
path; tenant_personality=vars is passed as a kwarg for observability
(traces, cost attribution) but never re-read to construct the messages list.
Prompt-cache invariant
The prompt-cache invariant is the rule that the LLM system message must remain byte-identical across all tenants, all verticals, and all turns — so Anthropic's prompt-cache layer reuses the same cached prefix on every call rather than re-computing it.
Why it matters
Anthropic's prompt caching charges the full input-token price for the first
call that primes the cache, then a discounted rate (~10 %) for every
subsequent cache hit. The system message for answer_shaper is ~2 000 tokens.
With tens of bookings per day across multiple tenants, a byte-identical system
message means most calls hit the cache and the effective per-call cost drops
by ~90 % for that prefix.
If per-tenant data leaked into the system message — even one character of a tenant's business name — each tenant would prime a separate cache entry and the savings would vanish.
How the invariant is enforced
- Green path (system_message): assembled once from the YAML template; no tenant or turn data is ever interpolated; byte-identical for all callers.
- Orange path (user template): assembled per-turn; carries all per-tenant content (personality directives, knowledge snippets, structured booking data, customer utterance). Cache does not cover this part — but it is also much smaller than the system message.
The tenant_personality kwarg on LLMRouter.complete() carries the same
DialBundle values as observability metadata for tracing and cost
attribution. It is never re-read to build the messages list — the directives
are already in the user template by the time the router sees the call.
Pages that depend on this invariant
- Knowledge answers — snippets are injected via
raw_data["knowledge"]into the user template, following the same pattern as personality directives. See Knowledge answers. - Voice conversation — the voice channel calls
shape_answer()on the same path; the system message is cache-eligible there too. See Voice conversation.
Where it lives in code
| Concern | File | Key entry point |
|---|---|---|
| Dial bundle resolver | backend/app/services/personality_config.py:328 | PersonalityConfig.for_tenant() (line 339) |
| DialBundle dataclass + render | backend/app/services/personality_config.py:132 | DialBundle.to_template_vars() (line 153) |
| Override DAO | backend/app/persistence/personality_config.py:71 | fetch_personality_overrides() |
| Audit + upsert DAO | backend/app/persistence/personality_config.py:103 | upsert_personality_override() |
| LLMRouter signature | backend/app/llm/router.py:130 | LLMRouter.complete() (tenant_personality kwarg, line 138) |
| AnswerShaper splice | backend/app/services/answer_shaper.py:345 | shape_answer() (personality_directive render var, line 430) |
| Per-vertical defaults | backend/app/prompts/personality_defaults.yaml | 8 verticals × 6 dials |
The Literal-typed dials on DialBundle are validated via
pydantic.TypeAdapter at resolution time — drift between the YAML and the
locked ADR-0010 D3 enum surfaces as InvalidDialValueError at the first
for_tenant() call after the bad write, not as a malformed prompt several
turns later.
Decisions
- ADR-0010 — Per-tenant personality + voice configuration (D2 dial set + D3 locked v1 enums + D4 per-vertical YAML defaults + D9 user-template splice cache invariant).
Try this on local dev
-
Set a dial via the admin API. With
docker compose up -drunning and a dev tenant onboarded, PATCH the dial endpoint and watch the next agent reply shift tone:curl -i -X PATCH http://localhost:8010/admin/personality \-H "Authorization: Bearer <admin-jwt>" \-H "Content-Type: application/json" \-d '{"dial": "tone", "value": "professional"}'Send a fresh inbound message on the dev WhatsApp number; the next agent reply should drop the casual register.
-
Inspect the audit row. Every dial change writes a row to
<tenant>.dial_auditswithbefore/afterJSONB:docker compose exec postgres psql -U ratiba -d ratiba \-c "SET search_path TO tenant_<slug>; \SELECT dial, before, after, created_at \FROM dial_audits ORDER BY created_at DESC LIMIT 5;"You should see
dial=tone,before="warm"(or NULL on first change),after="professional". -
Inspect the singleton override row. Confirm NULL columns truly inherit the vertical default:
docker compose exec postgres psql -U ratiba -d ratiba \-c "SET search_path TO tenant_<slug>; \SELECT tone, greeting_mode, upsell, cancellation_tone, \honorific, cross_sell \FROM tenant_personality_config WHERE is_singleton = TRUE;"The five dials you didn't touch should be NULL — they're being resolved from
personality_defaults.yamlat every turn. -
Verify cache-invariant in traces. With Langfuse running, send two messages to two different tenants back-to-back. Open the trace for each call and inspect the
systemmessage field. They should be the byte-for-byte identical string. If they differ, a per-tenant value has leaked into the system message — file a bug againstanswer_shaper.py. -
Reset a dial to vertical default. PATCH with
{"dial": "tone", "value": null}(or theINHERIT_SENTINEL). Inspectdial_audits; theaftercolumn should show{"inherit": "vertical_default"}, not NULL. On the next turn,for_tenant()resolves the dial frompersonality_defaults.yaml.
Related
- Cross-sell — the
cross_sellandupselldials gate whether a cross-sell offer fires post-confirm. - Knowledge answers — follows the same user-template splice pattern; system message stays cache-eligible regardless of snippets injected.
- Voice conversation — the voice channel uses the same
shape_answer()path; prompt-cache eligibility carries over. - Catalog onboarding — at import time the relation inferrer populates the
service_relationsgraph that thecross_selldial unlocks. - Configuration — environment variables and per-tenant JSONB overrides for the thresholds that surround dial resolution.
- Observability — how to query
dial_auditsand trace per-tenantDialBundlevalues in structured logs. - Glossary — definitions for DialBundle, personality dials, per-vertical YAML defaults, answer shaper, prompt-cache invariant, user-template splice, dial audit.