Personality dials

What it does

Tenants tune how the agent sounds — without rewriting prompts. Per ADR-0010 D2–D4 + D9, six curated dials cover the entire surface area exposed in v1: Tone (warm / professional / playful), Greeting (default-bilingual / custom-string ≤ 140 chars), Upsell (never / suggest-once-after-confirm / suggest-during-slot-collection), Cancellation tone (forgiving / neutral / firm), Honorific (first-name / formal-en / formal-sw), and Cross-sell (never / related-only / full-suggest).

Defaults are vertical-aware. Eight verticals each carry an opinionated 6-dial block in app/prompts/personality_defaults.yaml. A dental clinic gets professional + formal-sw + never-upsell + firm-cancellation out of the box; a barbershop gets playful + first-name + suggest-once-after-confirm. Tenants override per-dial via the singleton row in <tenant>.tenant_personality_config; NULL columns inherit the vertical default. Every change writes a <tenant>.dial_audits row with before / after JSONB.

How it fits in the system

The hardest part of the architecture is the negative edge: there is no path from DialBundle into the system message. The tenant_personality kwarg on LLMRouter.complete is observability metadata only — by the time messages reaches the router, the per-tenant directives are already baked into the user template, and the system message is the same string for every tenant on the platform.

How it flows

The merge is per-dial, not whole-row. A tenant who overrides only tone gets the YAML defaults for the other five dials — so an admin changing one dial never accidentally clobbers the vertical's other defaults. NULL means "inherit", and INHERIT_SENTINEL = {"inherit": "vertical_default"} lands in the audit after column when an admin actively resets a dial back to the default (so support tooling can distinguish "never set" from "explicitly reset" without joining the YAML).

Dial reference

All six dials with their locked v1 value sets and the behavioural effect each value produces:

Dial	Values	Default (dental)	Effect
Tone	`warm` / `professional` / `playful`	`professional`	Controls the register of every agent utterance; `playful` enables contractions and emoji; `warm` adds empathic openers
Greeting	`default-bilingual` / `custom-<string>`	`default-bilingual`	Opening message of the first turn; custom string is ≤ 140 chars, tenant-authored, injected verbatim
Upsell	`never` / `suggest-once-after-confirm` / `suggest-during-slot-collection`	`never`	Controls when the agent may suggest a higher-value service; also gates cross-sell (see Cross-sell)
Cancellation tone	`forgiving` / `neutral` / `firm`	`firm`	Shapes the "I'm sorry to hear you want to cancel" branch; `firm` reminds about the policy upfront
Honorific	`first-name` / `formal-en` / `formal-sw`	`formal-sw`	How the agent addresses the customer; `formal-sw` uses "Bwana/Bibi" + surname; `formal-en` uses "Mr/Ms" + surname
Cross-sell	`never` / `related-only` / `full-suggest`	`never`	Whether a complementary-service offer fires post-confirm; `related-only` = complementary edges only; `full-suggest` reserved (behaves as `related-only` in v1)

Per-vertical YAML defaults

Every new tenant inherits a vertical-appropriate baseline from app/prompts/personality_defaults.yaml. Representative comparison:

Vertical	Tone	Upsell	Honorific	Cross-sell	Cancellation
dental	professional	never	formal-sw	never	firm
medical	professional	never	formal-sw	never	firm
legal	professional	never	formal-en	never	firm
physio	warm	never	formal-en	never	neutral
spa	warm	suggest-once-after-confirm	first-name	related-only	forgiving
salon	warm	suggest-once-after-confirm	first-name	related-only	neutral
barbershop	playful	suggest-once-after-confirm	first-name	related-only	forgiving
tutoring	warm	never	formal-en	never	neutral

Clinical verticals (dental, medical, legal) default to the strictest settings — no upsell, formal address, firm cancellation — because a patient being pushed to add a service feels ethically problematic. Lifestyle verticals (spa, salon, barbershop) can afford the softer, more commercial register. Tenants override any dial that doesn't fit their specific practice.

DialBundle

DialBundle is a frozen Pydantic dataclass carrying the six resolved dial values for a single turn. Its API surface:

DialBundle.to_template_vars() → dict[str, str] — renders each dial into a natural-language directive sentence. The output dict keys are personality_directive, greeting_directive, upsell_directive, cancellation_directive, honorific_directive, and cross_sell_directive.
Each value is a short English sentence (e.g. "Address the customer warmly as Bwana or Bibi [surname].") that the answer_shaper splices directly into the {personality_directive} slot in the user template.
DialBundle fields are Literal-typed and validated by pydantic.TypeAdapter at for_tenant() call time. Drift between a manually-edited YAML and the locked ADR-0010 D3 enum raises InvalidDialValueError at the first post-edit turn — not as a malformed prompt several turns later.

Dial audit

Every dial change — whether via the admin API or a direct DB write — writes a row to <tenant>.dial_audits with:

dial: which of the six dials changed
before: previous value JSONB (or NULL on first set)
after: new value JSONB, or {"inherit": "vertical_default"} if the admin actively reset to default

The sentinel value lets support tooling distinguish "this tenant never configured this dial" from "this tenant explicitly reset it to the default" without joining the YAML.

answer_shaper

AnswerShaper.shape_answer() (app/services/answer_shaper.py) is the service that assembles the final messages list for every LLM call across all inquiry and booking reply paths. It holds two responsibilities:

Splice per-tenant personality directives from DialBundle.to_template_vars() into the {personality_directive} (and sibling) slots in the user template — so the agent sounds like the tenant configured.
Splice knowledge snippets from raw_data["knowledge"] (Phase 0) into the same user template — so the agent answers tenant-specific questions from hand-seeded KB rows without a retrieval system.

Both splice operations land in the user message only. The system message is assembled once from answer_shaper.yaml and is never touched by either operation. This is the mechanism by which the prompt-cache invariant is preserved.

shape_answer() is the only caller of LLMRouter.complete() for the answer path; tenant_personality=vars is passed as a kwarg for observability (traces, cost attribution) but never re-read to construct the messages list.

Prompt-cache invariant

The prompt-cache invariant is the rule that the LLM system message must remain byte-identical across all tenants, all verticals, and all turns — so Anthropic's prompt-cache layer reuses the same cached prefix on every call rather than re-computing it.

Why it matters

Anthropic's prompt caching charges the full input-token price for the first call that primes the cache, then a discounted rate (~10 %) for every subsequent cache hit. The system message for answer_shaper is ~2 000 tokens. With tens of bookings per day across multiple tenants, a byte-identical system message means most calls hit the cache and the effective per-call cost drops by ~90 % for that prefix.

If per-tenant data leaked into the system message — even one character of a tenant's business name — each tenant would prime a separate cache entry and the savings would vanish.

How the invariant is enforced

Green path (system_message): assembled once from the YAML template; no tenant or turn data is ever interpolated; byte-identical for all callers.
Orange path (user template): assembled per-turn; carries all per-tenant content (personality directives, knowledge snippets, structured booking data, customer utterance). Cache does not cover this part — but it is also much smaller than the system message.

The tenant_personality kwarg on LLMRouter.complete() carries the same DialBundle values as observability metadata for tracing and cost attribution. It is never re-read to build the messages list — the directives are already in the user template by the time the router sees the call.

Pages that depend on this invariant

Knowledge answers — snippets are injected via raw_data["knowledge"] into the user template, following the same pattern as personality directives. See Knowledge answers.
Voice conversation — the voice channel calls shape_answer() on the same path; the system message is cache-eligible there too. See Voice conversation.

Where it lives in code

Concern	File	Key entry point
Dial bundle resolver	`backend/app/services/personality_config.py:328`	`PersonalityConfig.for_tenant()` (line 339)
DialBundle dataclass + render	`backend/app/services/personality_config.py:132`	`DialBundle.to_template_vars()` (line 153)
Override DAO	`backend/app/persistence/personality_config.py:71`	`fetch_personality_overrides()`
Audit + upsert DAO	`backend/app/persistence/personality_config.py:103`	`upsert_personality_override()`
LLMRouter signature	`backend/app/llm/router.py:130`	`LLMRouter.complete()` (`tenant_personality` kwarg, line 138)
AnswerShaper splice	`backend/app/services/answer_shaper.py:345`	`shape_answer()` (`personality_directive` render var, line 430)
Per-vertical defaults	`backend/app/prompts/personality_defaults.yaml`	8 verticals × 6 dials

The Literal-typed dials on DialBundle are validated via pydantic.TypeAdapter at resolution time — drift between the YAML and the locked ADR-0010 D3 enum surfaces as InvalidDialValueError at the first for_tenant() call after the bad write, not as a malformed prompt several turns later.

Decisions

ADR-0010 — Per-tenant personality + voice configuration (D2 dial set + D3 locked v1 enums + D4 per-vertical YAML defaults + D9 user-template splice cache invariant).

Try this on local dev

Set a dial via the admin API. With docker compose up -d running and a dev tenant onboarded, PATCH the dial endpoint and watch the next agent reply shift tone:
```
curl -i -X PATCH http://localhost:8010/admin/personality \
  -H "Authorization: Bearer <admin-jwt>" \
  -H "Content-Type: application/json" \
  -d '{"dial": "tone", "value": "professional"}'
```
Send a fresh inbound message on the dev WhatsApp number; the next agent reply should drop the casual register.

Inspect the audit row. Every dial change writes a row to <tenant>.dial_audits with before / after JSONB:

docker compose exec postgres psql -U ratiba -d ratiba \
  -c "SET search_path TO tenant_<slug>; \
      SELECT dial, before, after, created_at \
      FROM dial_audits ORDER BY created_at DESC LIMIT 5;"

You should see dial=tone, before="warm" (or NULL on first change), after="professional".

Inspect the singleton override row. Confirm NULL columns truly inherit the vertical default:

docker compose exec postgres psql -U ratiba -d ratiba \
  -c "SET search_path TO tenant_<slug>; \
      SELECT tone, greeting_mode, upsell, cancellation_tone, \
             honorific, cross_sell \
      FROM tenant_personality_config WHERE is_singleton = TRUE;"

The five dials you didn't touch should be NULL — they're being resolved from personality_defaults.yaml at every turn.

Verify cache-invariant in traces. With Langfuse running, send two messages to two different tenants back-to-back. Open the trace for each call and inspect the system message field. They should be the byte-for-byte identical string. If they differ, a per-tenant value has leaked into the system message — file a bug against answer_shaper.py.
Reset a dial to vertical default. PATCH with {"dial": "tone", "value": null} (or the INHERIT_SENTINEL). Inspect dial_audits; the after column should show {"inherit": "vertical_default"}, not NULL. On the next turn, for_tenant() resolves the dial from personality_defaults.yaml.

Cross-sell — the cross_sell and upsell dials gate whether a cross-sell offer fires post-confirm.
Knowledge answers — follows the same user-template splice pattern; system message stays cache-eligible regardless of snippets injected.
Voice conversation — the voice channel uses the same shape_answer() path; prompt-cache eligibility carries over.
Catalog onboarding — at import time the relation inferrer populates the service_relations graph that the cross_sell dial unlocks.
Configuration — environment variables and per-tenant JSONB overrides for the thresholds that surround dial resolution.
Observability — how to query dial_audits and trace per-tenant DialBundle values in structured logs.
Glossary — definitions for DialBundle, personality dials, per-vertical YAML defaults, answer shaper, prompt-cache invariant, user-template splice, dial audit.

What it does​

How it fits in the system​

How it flows​

Dial reference​

Per-vertical YAML defaults​

DialBundle​

Dial audit​

answer_shaper​

Prompt-cache invariant​

Why it matters​

How the invariant is enforced​

Pages that depend on this invariant​

Where it lives in code​

Decisions​

Try this on local dev​

Related​