Note 1781530416

· Deva



Two functions do the work. topic_guardrails_directive() injects a prompt fragment into every generation call, so the policy travels with the voice profile into every pipeline and every engine. violates_topic_policy() is a pure-Python lexical gate: no claude -p call, no network dependency. If the AI is down, the gate still runs. That asymmetry is intentional: the critic can fail open on infrastructure failures because one missed post is acceptable, but the hard gate can never fail open because one wrong post is not.
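A minimal sketch of what a lexical gate like violates_topic_policy() could look like, using only the standard library. The blocked terms here are hypothetical placeholders, not the real list, and whole-word regex matching is an assumption about how the check works.

```python
import re

# Hypothetical blocked terms for illustration only; the real list is not shown here.
BLOCKED_TERMS = ("election", "senator", "ballot", "immigration")

# One compiled pattern, whole-word and case-insensitive, built once at import time.
_BLOCKED_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in BLOCKED_TERMS) + r")\b",
    re.IGNORECASE,
)


def violates_topic_policy(text: str) -> bool:
    """Return True if the draft contains any blocked term.

    Pure string matching: no model call and no network access,
    so it keeps working when the generation backend is down.
    """
    return _BLOCKED_RE.search(text) is not None
```

Whole-word matching keeps "selection" from tripping on "election", at the cost of missing plurals and inflections unless they are listed explicitly.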

The carve-out is economics only: markets, macro, monetary policy, trade, taxes, regulation, crypto. Everything else in the voice profile that reads as partisan, including things I actually believe, gets stripped at the prompt layer before generation starts. The header tone word changed from "partisan" to "opinionated" because calling a prompt directive "partisan" was actively instructing the generator to produce political content. One-word change, meaningful behavioral difference.
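Roughly, the prompt-layer half could be sketched like this. The topic list and the tone word come from the paragraph above; the directive wording, the ALLOWED_TOPICS and TONE constants, and the build_prompt() helper are assumptions made for illustration.

```python
# Economics carve-out, as listed above.
ALLOWED_TOPICS = (
    "markets", "macro", "monetary policy", "trade",
    "taxes", "regulation", "crypto",
)

# Tone word for the directive header ("opinionated", not "partisan").
TONE = "opinionated"


def topic_guardrails_directive() -> str:
    """Build the prompt fragment that carries the topic policy (wording is hypothetical)."""
    topics = ", ".join(ALLOWED_TOPICS)
    return (
        f"Tone: {TONE}.\n"
        f"Political commentary is limited to economics: {topics}.\n"
        "Leave out any other political material, even if the voice profile contains it."
    )


def build_prompt(voice_profile: str, task: str) -> str:
    """Hypothetical helper: put the directive ahead of the voice profile and the task."""
    return "\n\n".join([topic_guardrails_directive(), voice_profile, task])
```

Because every prompt is assembled through one helper, the directive cannot be forgotten by an individual pipeline.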

The real design principle is about where each kind of constraint belongs. Soft rules (voice, tone, format) go in the prompt because you want the model to exercise judgment about edge cases. Hard rules (policy violations) go in deterministic code because model judgment has failure modes and those failure modes compound at scale. Two layers, different failure modes, different tools.
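The two layers and their opposite failure behavior can be sketched as a single decision function. This is an illustration under assumptions, not the actual pipeline: should_publish() is a hypothetical name, the critic and the gate are passed in as plain callables, and "fail open" is read as "let the draft through when the critic is unreachable".

```python
from typing import Callable


def should_publish(
    draft: str,
    gate: Callable[[str], bool],    # returns True when the draft violates policy
    critic: Callable[[str], bool],  # returns True when the draft is approved
) -> bool:
    """Combine a fail-closed hard gate with a fail-open soft critic."""
    # Hard layer: deterministic check. Any error counts as a violation (fail closed).
    try:
        blocked = gate(draft)
    except Exception:
        blocked = True
    if blocked:
        return False

    # Soft layer: model-backed critic. An infrastructure error lets the draft through (fail open).
    try:
        return critic(draft)
    except Exception:
        return True
```

The gate runs first, so a critic outage can never be the reason a policy-violating draft goes out.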
