Say It Once: How Governed Generation Actually Works

One approved claim, many buyer-facing surfaces, and a guarantee they can't quietly disagree. Here's the architecture behind it — the claim graph, the gates, and why the LLM authors but never governs.

Jake Jinyong Kim, Founder, AccountMade · July 5, 2026

Every company that sells to security, procurement, or legal reviewers has the same quiet problem: the same fact lives in a dozen places, and the copies drift. The uptime number in the pitch deck is 99.9%. The one in the security questionnaire is 99.95%. The trust page says "four nines." None of them are lies, exactly — they're just three people, three deadlines, and no single source. A reviewer who spots the gap doesn't think "stale copy." They think "which of these is true, and what else isn't?"

Our product promise is one line: say it once, and it's right everywhere. Approve a claim once, and every buyer-facing document — deck, questionnaire answer, one-pager, trust page — draws from that one approved fact. Update it, and they update with it. This post is about how that's actually enforced, because "single source of truth" is easy to say and hard to make true. The short version: an LLM authors the prose, and deterministic code governs every word of it.

I'm Jake — I build AccountMade's generation engine. What follows is the architecture we actually run in production, not an idealized diagram.

One approved claim, three buyer-facing surfaces, one source — they render the same fact because they read the same record.

One graph, and generation is a projection

The foundation is that a claim is authored in exactly one place — a governed claim graph — and every surface is a transform over that graph, never an independent copy.

A claim isn't a string. It's a record that carries its whole governance state: the assertion itself, the source it came from (as a rename-safe reference, not a fragile label), the verbatim excerpt that supports it, who approved it and in what role, when it expires, whether it's been retired, and whether it has drifted — our term for a claim whose supporting evidence has since disappeared from the source. It also belongs to exactly one governed context — a product line, vertical, segment, or persona — so "our SLA" can legitimately differ between Enterprise and Starter without the two ever being treated as the same fact.
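
A minimal sketch of what such a record could look like in Python. Every field name, ID, and value here is an assumption made for illustration, not AccountMade's actual schema; the point is only that governance state travels with the assertion, and that context is part of the claim's identity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Claim:
    """One governed claim record. All field names are hypothetical."""
    claim_id: str
    assertion: str                     # the statement itself
    source_id: str                     # rename-safe reference, not a display label
    excerpt: str                       # verbatim supporting text from the source
    approved_by: str
    approver_role: str
    context: str                       # exactly one governed context
    expires_on: Optional[date] = None
    retired: bool = False
    drifted: bool = False              # supporting evidence vanished from the source


enterprise_sla = Claim(
    claim_id="c-001", assertion="99.95% uptime SLA", source_id="src-7f3a",
    excerpt="guarantees 99.95% monthly uptime", approved_by="a.rivera",
    approver_role="security_lead", context="enterprise",
)
starter_sla = Claim(
    claim_id="c-002", assertion="99.9% uptime SLA", source_id="src-7f3a",
    excerpt="guarantees 99.9% monthly uptime", approved_by="a.rivera",
    approver_role="security_lead", context="starter",
)

# Same subject, different governed contexts: two distinct facts, not a conflict.
same_fact = enterprise_sla.context == starter_sla.context
```

Freezing the dataclass is a design choice in this sketch: a change to a claim becomes a new record rather than an in-place mutation, which keeps an audit trail simple.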

Because the deck bullet and the questionnaire answer are two renders of the same claim record, they can't say different numbers unless the graph itself holds a contradiction. That single design decision is what makes the rest of the guarantees possible — and it's why most of this post is about the checks that keep the graph honest.

"Say it once": one predicate decides what's usable

Nothing in the system reads raw claims. Every generation path — deck, document, questionnaire — funnels through a single predicate that decides whether a claim is generation-ready. A claim qualifies only if it is: not retired, not expired, not an unpromoted draft, not drifted, backed by a source that still exists, and non-empty.

This is the entire meaning of "approve once → usable everywhere," expressed as one chokepoint instead of a convention people are supposed to remember. Approve a claim in the library, and it becomes available to every surface at once. Let it expire, and it silently drops out of all of them at once. There is no second place to update.
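
The readiness predicate described in this section can be sketched as a single pure function. This is an illustration under stated assumptions, not our production code: the `Claim` fields, the `live_sources` set, and the rule that a claim expires once today is past its expiry date are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Set


@dataclass(frozen=True)
class Claim:
    """Minimal claim record; field names are hypothetical."""
    text: str
    source_id: str
    promoted: bool = True              # False means still an unpromoted draft
    retired: bool = False
    drifted: bool = False
    expires_on: Optional[date] = None


def is_generation_ready(claim: Claim, live_sources: Set[str], today: date) -> bool:
    """Single chokepoint: every surface calls this, nothing reads raw claims."""
    if claim.retired or claim.drifted or not claim.promoted:
        return False
    # Assumption: a claim counts as expired once today is past its expiry date.
    if claim.expires_on is not None and today > claim.expires_on:
        return False
    if claim.source_id not in live_sources:    # backing source must still exist
        return False
    return bool(claim.text.strip())            # non-empty


today = date(2026, 7, 5)
sources = {"src-1"}
library = [
    Claim("99.9% uptime", "src-1"),
    Claim("SOC 2 Type II", "src-1", expires_on=date(2026, 1, 1)),  # expired
    Claim("ISO 27001", "src-gone"),                                # source deleted
    Claim("   ", "src-1"),                                         # empty
]
usable = [c.text for c in library if is_generation_ready(c, sources, today)]
```

Only the first claim survives the filter. Because the date is passed in rather than read from a clock inside the function, the same inputs always give the same answer, which is what makes the predicate testable.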

"Right everywhere" is enforced four ways

A single graph is necessary but not sufficient. The graph can still hold a bad claim, and an LLM asked to write a deck can still reach for a number that isn't backed. So generation runs through a sequence of gates. Here's the part worth internalizing: every one of these gates is deterministic code. No gate asks a model for a judgment. The model proposes; the gates dispose.

1. Cite-or-omit (structural). Before anything is composed, claims are filtered to those that are both approved and traceable — a claim reaches a slide only if it resolves to a real source or a real supporting excerpt. Everything else is dropped. Crucially, this runs before the composer, so untraced content isn't "flagged" — it's structurally impossible to render. The composer never sees it. We then assert that zero untraced claims survived; a nonzero count is a hard build failure, not a warning.

2. The pre-send state machine. At review and send time, every claim on a compiled document is recomputed live against the current library. A pure function maps each claim to a state — approved, stale (its block expired), out_of_context (it's bound outside the account's vertical, region, or NDA status), unsupported (no backing block), gap (a higher-risk claim with nothing behind it), or matched_unverified (needs a named reviewer). A second pure function turns state × risk into a decision: a gap or out_of_context claim is always a hard block; a critical claim bound to an expired block is a hard block; lower-risk staleness is a soft warning that ships only with a logged, reasoned override. The state is derived, not stored — so a claim that was fine last week and expired yesterday is caught today without anyone re-running anything.

3. Cross-surface consistency (the actual proof). This is the literal enforcement of "it's right everywhere." We maintain an index of exactly which claims each surface cited: every deck, document, and form answer for a deal. A deterministic check runs over the union of claims cited across all of a deal's surfaces and flags two things: internal contradictions (same subject with different numbers or opposite polarity), and any cited claim that has since gone stale. In plain terms: what you pitch and what you attest are checked against each other, by claim ID. They cannot quietly diverge, because detecting divergence is a check that runs at send time, not a hope.

4. The rendered-deck quality gate. After the document is actually rendered, a final deterministic pass checks the output: every hero number must appear in an approved claim's text; rendered prose must trace back to the grounding corpus; internal scaffolding and boilerplate can never leak to a buyer. These provenance failures block on every tier and can't be overridden.
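
Gates 1, 2, and 4 above can be sketched together as plain functions. This is a simplified illustration, not the production gates. The field names, the risk tiers, the order in which states are checked, the split between `gap` and `unsupported`, and the `needs_review` outcome for states whose decision this post does not spell out are all assumptions.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Claim:
    """A claim cited on a compiled document. Field names are hypothetical."""
    text: str
    risk: str                          # assumed tiers: "low", "high", "critical"
    approved: bool = True
    source_id: Optional[str] = None
    excerpt: Optional[str] = None
    has_block: bool = True             # a backing library block exists
    block_expired: bool = False
    in_context: bool = True            # inside the account's vertical/region/NDA status
    needs_reviewer: bool = False


def is_traced(c: Claim) -> bool:
    return bool(c.source_id or c.excerpt)


# Gate 1: cite-or-omit. Runs before composition, so untraced claims never render.
def cite_or_omit(claims: List[Claim]) -> List[Claim]:
    kept = [c for c in claims if c.approved and is_traced(c)]
    untraced = sum(1 for c in kept if not is_traced(c))
    if untraced:                       # hard build failure, not a warning
        raise RuntimeError(f"{untraced} untraced claims reached the composer")
    return kept


# Gate 2: state is derived on demand, never stored. Check order is an assumption.
def derive_state(c: Claim) -> str:
    if not c.has_block:
        return "gap" if c.risk in ("high", "critical") else "unsupported"
    if not c.in_context:
        return "out_of_context"
    if c.block_expired:
        return "stale"
    if c.needs_reviewer:
        return "matched_unverified"
    return "approved"


def decide(state: str, risk: str) -> str:
    if state in ("gap", "out_of_context"):
        return "hard_block"
    if state == "stale":
        return "hard_block" if risk == "critical" else "soft_warning"
    if state == "approved":
        return "pass"
    return "needs_review"              # assumption: not specified in this post


# Gate 4: every hero number in the rendered output must appear in a claim's text.
NUMBER = re.compile(r"\d+(?:\.\d+)?%?")


def unbacked_hero_numbers(hero_numbers: List[str], approved: List[Claim]) -> List[str]:
    backed = {n for c in approved for n in NUMBER.findall(c.text)}
    return [n for n in hero_numbers if n not in backed]


claims = [
    Claim("99.95% uptime", risk="critical", source_id="src-1"),
    Claim("Fastest in class", risk="low"),                     # untraced: omitted
    Claim("SOC 2 Type II", risk="critical", source_id="src-2", block_expired=True),
]
kept = cite_or_omit(claims)
decisions = [decide(derive_state(c), c.risk) for c in kept]
leaks = unbacked_hero_numbers(["99.95%", "99.99%"], kept)
```

In this run the untraced claim never reaches composition, the critical claim on an expired block is a hard block, and the unbacked "99.99%" is reported as a leak. None of the functions reads a clock, a database, or a model, so each one can be pinned with ordinary unit tests.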

"Update it and they all update": the truth-change loop

When a source changes, we notice deterministically. Each source is content-hashed on every refresh; a changed hash — not a re-fetch, an actual content change — is what triggers the loop. From there, reverse-links fan out across three paths to mark every dependent document as stale, and a drift check re-examines the affected claims and stamps any whose support has vanished (which then makes them fail the generation-ready predicate above).
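
A rough sketch of the change detection and drift check, under several assumptions: SHA-256 as the hash, a single in-memory reverse-link map standing in for the three fan-out paths, and a plain substring test for whether an excerpt still appears in the source. All names and IDs are hypothetical.

```python
import hashlib
from typing import Dict, List, Set


def content_hash(text: str) -> str:
    # Assumption: SHA-256 over UTF-8 bytes; any stable content hash would do.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def refresh_source(
    source_id: str,
    new_text: str,
    stored_hashes: Dict[str, str],
    reverse_links: Dict[str, List[str]],
) -> Set[str]:
    """Return the documents to mark stale; empty if the content is unchanged."""
    new_hash = content_hash(new_text)
    if stored_hashes.get(source_id) == new_hash:
        return set()                   # re-fetched, but nothing changed: no loop
    stored_hashes[source_id] = new_hash
    return set(reverse_links.get(source_id, []))


def drifted_claims(excerpts: Dict[str, str], new_text: str) -> List[str]:
    """Claim IDs whose verbatim excerpt no longer appears in the source."""
    return [cid for cid, excerpt in excerpts.items() if excerpt not in new_text]


hashes = {"src-1": content_hash("Uptime is 99.9% monthly.")}
links = {"src-1": ["deck-12", "questionnaire-4"]}

unchanged = refresh_source("src-1", "Uptime is 99.9% monthly.", hashes, links)
changed = refresh_source("src-1", "Uptime is 99.95% monthly.", hashes, links)
drift = drifted_claims(
    {"c-1": "99.9% monthly", "c-2": "Uptime is"},
    "Uptime is 99.95% monthly.",
)
```

The first refresh returns nothing because the content hash matches; the second returns both dependent documents, and only claim `c-1` is stamped as drifted because its excerpt is no longer present verbatim.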

Here's where we'll be precise rather than promotional: rebuilding a stale document is opt-in per document. When a document does rebuild, it re-runs generation but carries the author's prior edits forward, and it marks translations as stale so that no outdated translation lingers unnoticed. For documents that aren't opted into auto-rebuild, the change surfaces as a "sources updated" state with a one-click regenerate. We'd rather tell you exactly what moved than silently re-author a deck you were about to send.

The pattern: LLM authors, deterministic code governs

Step back and the architecture has a shape. An LLM appears in exactly three places: it drafts candidate claims out of your sources (which a human then approves), it authors the narrative prose when a document is compiled, and it serves as a final blind adversarial auditor. Everything that governs correctness — the readiness predicate, all four gates, the hashing, the stale-flagging — is deterministic code.

The model proposes; deterministic code disposes.

This is deliberate. Deterministic gates are inspectable, testable, and identical for every account. A model's judgment about whether two claims contradict can change from one run to the next, and you can't unit-test that; a Jaccard-plus-numeric-signature comparison (set-overlap similarity combined with a comparison of the numbers each claim contains) is a function with a fixed output you can pin down in CI. So we push all the authoring creativity to the model and keep all the governance judgment in code the model can't talk its way past.
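
To make the "function you can pin down in CI" point concrete, here is a toy version of a Jaccard-plus-numeric-signature contradiction check. It is a sketch, not the production detector: the tokenizer, the negation word list, and the 0.6 similarity threshold are all assumptions, and there is no stemming, so differently inflected wording will lower the overlap score.

```python
import re
from typing import FrozenSet, List, Tuple

NUMBER = re.compile(r"\d+(?:\.\d+)?")
NEGATIONS = frozenset({"not", "no", "never", "without"})   # assumed word list


def words(text: str) -> List[str]:
    return re.findall(r"[a-z]+", text.lower())


def subject_tokens(text: str) -> FrozenSet[str]:
    """Words that describe the subject: numbers and negations are excluded."""
    return frozenset(w for w in words(text) if w not in NEGATIONS)


def numeric_signature(text: str) -> Tuple[str, ...]:
    return tuple(sorted(NUMBER.findall(text)))


def is_negated(text: str) -> bool:
    return any(w in NEGATIONS for w in words(text))


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def contradicts(a: str, b: str, threshold: float = 0.6) -> bool:
    """Same subject (by token overlap) with different numbers or polarity."""
    if jaccard(subject_tokens(a), subject_tokens(b)) < threshold:
        return False                   # different subjects cannot contradict
    if numeric_signature(a) != numeric_signature(b):
        return True
    return is_negated(a) != is_negated(b)


numbers_clash = contradicts("99.95% uptime", "99.9% uptime")
polarity_clash = contradicts("Data is encrypted at rest", "Data is not encrypted at rest")
unrelated = contradicts("99.9% uptime", "SOC 2 Type II audited annually")
identical = contradicts("99.9% uptime", "99.9% uptime")
```

The first two comparisons are flagged and the last two are not, on every run. That repeatability is the property we care about: the cases above can be written as assertions and left in CI.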

New: catching contradictions at the source

The cross-surface consistency check (gate 3) is powerful, but it fires late — at send time, over the claims a document happens to cite. That leaves a window: a contradictory fact can enter the library and sit there, correct-looking, until some future send trips over it.

So we recently moved the same deterministic contradiction detector upstream, to the moment a claim is authored or edited. When you approve a new claim, we compare it against the existing generation-ready claims in the same governed context. If your new "99.95% uptime" disagrees with an approved "99.9% uptime" for the same product line, you find out then — at the single source — instead of three decks later.

Two properties matter here. First, it never blocks the write. Sellers legitimately assert both sides of an apparent contradiction for different audiences ("requires MFA for admins" and "does not require MFA for read-only users"), so the guard warns and records rather than walls — the same philosophy that governs our send gate, which deliberately blocks only genuine provenance and fabrication failures, never mere friction. Second, it's the same engine as the send-time check, just run earlier: one deterministic function, two points in the lifecycle. Every flag is written to the claim's audit trail, so the governance record is complete regardless of what the author decides.
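
The two properties can be sketched as follows. This is an illustration only: the in-memory `Library`, the dictionary shapes, the audit event name, and the toy `numbers_differ` detector are all hypothetical stand-ins. The detector is passed in as a parameter to reflect the idea that the same function serves both the authoring-time guard and the send-time check.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Library:
    """In-memory stand-in for the claim library (hypothetical)."""
    claims: Dict[str, dict] = field(default_factory=dict)
    audit_trail: List[dict] = field(default_factory=list)


def approve_claim(
    lib: Library,
    claim_id: str,
    text: str,
    context: str,
    contradicts: Callable[[str, str], bool],
) -> List[str]:
    """Store the claim and return the IDs of claims it appears to contradict."""
    conflicts = [
        cid for cid, c in lib.claims.items()
        if c["context"] == context and c["ready"] and contradicts(text, c["text"])
    ]
    for cid in conflicts:
        lib.audit_trail.append(
            {"event": "contradiction_flagged", "claim": claim_id, "against": cid}
        )
    # The write always happens: the guard warns and records, it never blocks.
    lib.claims[claim_id] = {"text": text, "context": context, "ready": True}
    return conflicts


def numbers_differ(a: str, b: str) -> bool:
    """Toy detector for this demo: same words, different numbers."""
    def nums(s: str) -> List[str]:
        return sorted(re.findall(r"\d+(?:\.\d+)?", s))

    def wordlist(s: str) -> List[str]:
        return sorted(re.findall(r"[a-z]+", s.lower()))

    return wordlist(a) == wordlist(b) and nums(a) != nums(b)


lib = Library()
approve_claim(lib, "c-1", "99.9% uptime", "enterprise", numbers_differ)
warnings = approve_claim(lib, "c-2", "99.95% uptime", "enterprise", numbers_differ)
other_context = approve_claim(lib, "c-3", "99.5% uptime", "starter", numbers_differ)
```

Approving `c-2` returns a warning against `c-1` and writes one audit entry, yet all three claims end up stored. The Starter claim raises nothing, because the comparison is scoped to a single governed context.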

What we deliberately don't do

Good architecture is as much about restraint as reach.

  • We don't put an LLM in the gates. The temptation is real — "just ask the model if this is faithful." We don't, because a gate you can't reproduce isn't a gate.
  • We don't auto-rewrite everything on every source change. That's expensive and, worse, it can silently change a document out from under someone. Opt-in rebuild plus a loud "this went stale" is the honest default.
  • We don't over-polish. For the security and procurement reviewers who actually receive these documents, accuracy, tailoring, and traceability carry the deal. An over-produced deck reads as less credible, not more. The bar is buyer-send-grade, not designer-grade.

The result is a system where "say it once, it's right everywhere" isn't a slogan on a marketing page — it's a set of deterministic checks you could read, test, and watch fail on bad input. The model writes the words. The code makes sure they're true, and stay true, everywhere they land.