The Deterministic Envelope
On January 15, 2009, Captain Chesley "Sully" Sullenberger took off from LaGuardia Airport in a US Airways Airbus A320. Two minutes later, both engines lost thrust after the aircraft struck a flock of Canada geese. What happened next — the "Miracle on the Hudson" — is one of the most studied events in aviation safety.
But the miracle wasn't Sully's skill. It was the system around Sully's skill.
First Officer Jeffrey Skiles immediately pulled the Quick Reference Handbook and began running the dual-engine-failure checklist — even though there was no time to complete it. Air Traffic Control offered vectors back to LaGuardia and to Teterboro. The flight data recorder captured every decision, every input, every second of the 208 seconds between bird strike and touchdown. When the NTSB investigated afterward, they could reconstruct exactly what happened and why every decision was made.
Aviation doesn't make pilots perfect. Pilots are human — they get tired, they misjudge, they make errors under pressure. What aviation does is wrap pilots in a deterministic envelope: checklists that enforce procedure, a co-pilot who independently verifies every critical decision, air traffic control that monitors from outside, and a flight data recorder that captures everything for post-incident analysis.
The pilot is probabilistic. The envelope is deterministic. The system is safe.
Agent systems need the same architecture.
The pattern that keeps emerging
Every safety-critical industry has independently discovered this pattern. Not by theory. By body count.
| Industry | Unreliable component | Deterministic envelope |
|---|---|---|
| Aviation | Pilot judgment | Checklists, co-pilot, ATC, flight data recorder |
| Nuclear | Reactor behavior | Containment vessel, redundant cooling, SCRAM systems |
| Finance | Trader decisions | Position limits, circuit breakers, four-eyes approval, audit trail |
| Healthcare | Clinical judgment | Checklists, second opinions, informed consent, medical records |
| Software | Developer code | Code review, CI/CD, type systems, automated tests |
| Agents | LLM reasoning | ? |
The last row is the gap from Chapter 15. Here's how to fill it.
The compliance harness
The solution is not a new framework. It is not a better model. It is a compliance harness — an architectural layer that wraps agent systems with the deterministic enforcement, observability, and governance required for SOC-2 compliance and SLA guarantees.
The key insight:
You don't make the probabilistic component deterministic. You make the deterministic wrapper so tight that the system-level behavior is compliant even when individual agent decisions are not.
Four subsystems. Each solves a specific subset of the seven problems. Together, they provide the compliance envelope.
Why four — not three, not five
Every compliance requirement for agent systems maps to one of four concerns:
Who can do what? → The Gate. Access control, permission scoping, human approval gates, kill switches. Enforced structurally at the API layer — not suggested in the prompt.
What happened and why? → The Ledger. Complete trajectory capture, immutable audit trail, decision ledger for cross-agent state. The source code of agent systems. EU AI Act Article 12 makes this legally mandatory for high-risk AI (Regulation (EU) 2024/1689, Article 12 — Record-keeping; EUR-Lex).
How much can it spend and how reliable must it be? → The Governor. Budget hierarchies, spawning depth limits, circuit breakers, SLA decomposition. The math that turns 95%-per-agent into 99.99%-per-stage.
Is the output actually correct? → The Witness. Independent verification with structurally different perspectives. Canary checks that detect compound cascades before they reach production. Statistical quality sampling with trend analysis.
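Taken together, the four subsystems form one wrapper around every agent call. The sketch below is illustrative, not a prescribed API — the class, method names, and signatures are assumptions; what matters is that every check is enforced in code, outside the prompt:

```python
from dataclasses import dataclass, field

@dataclass
class Harness:
    """Hypothetical compliance harness: Gate + Ledger + Governor + Witness."""
    allowed_tools: set                               # Gate: structural permission scope
    budget: float                                    # Governor: spend ceiling
    trajectory: list = field(default_factory=list)   # Ledger: append-only record

    def run(self, agent, task, tool, cost, verifier):
        # Gate: permissions are checked before the agent acts — not suggested in the prompt.
        if tool not in self.allowed_tools:
            raise PermissionError(f"Gate: {tool!r} not in scope")
        # Governor: the budget ceiling is enforced structurally.
        if cost > self.budget:
            raise RuntimeError("Governor: budget exceeded")
        self.budget -= cost
        output = agent(task)
        # Ledger: capture the full step for post-incident reconstruction.
        self.trajectory.append(
            {"task": task, "tool": tool, "cost": cost, "output": output}
        )
        # Witness: independent verification before the output is released.
        if not verifier(task, output):
            raise ValueError("Witness: output failed verification")
        return output
```

The design point is that the agent callable sits in the middle and everything around it is deterministic: a jailbroken prompt can change what the agent says, but not what the Gate permits, what the Ledger records, or what the Governor will pay for.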
Fewer than four leaves gaps. More than four creates overlap. This is the minimum viable compliance architecture — the same way the three layers (contract, communication, orchestration) are the minimum viable composition architecture from Chapter 4.
The compliance harness doesn't make agents smarter. It makes agent systems trustworthy — auditable, predictable, bounded, and verifiable. The difference between "works in a demo" and "passes a SOC-2 audit" is exactly this infrastructure.
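The Governor's 95%-to-99.99% claim is ordinary probability, and worth making explicit. If each agent attempt succeeds with probability 0.95 and a failed attempt can be detected and retried independently, stage reliability is 1 − 0.05ⁿ for n attempts:

```python
per_agent = 0.95
for n in range(1, 5):
    stage = 1 - (1 - per_agent) ** n
    print(f"{n} independent attempt(s): stage reliability {stage:.6f}")

# Three verified, independent attempts give 1 - 0.05**3 = 0.999875,
# i.e. ~99.99% per stage. The load-bearing assumption is independence —
# the exact assumption that compound cascades break, which is why the
# Witness's canary checks exist.
```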
The convergence
This architecture is not invented. It is discovered.
Anthropic achieved SOC-2 Type II compliance for the Claude API. They built comprehensive logging of all API interactions, access controls for model endpoints, change management for model deployments, and incident response procedures. OpenAI built the same things. So did Salesforce for Agentforce. So did Microsoft for Azure AI (Anthropic SOC-2 Type II report; OpenAI security overview: openai.com/security).
Every company that achieved SOC-2 for AI built the same seven patterns:
| Pattern they all built | Harness subsystem |
|---|---|
| Comprehensive audit logging | Ledger |
| Role-based access control | Gate |
| Change management for prompts/configs | Gate |
| Real-time monitoring and alerting | Governor |
| Incident response procedures | Governor |
| Data protection and encryption | Ledger |
| Output validation and quality checks | Witness |
Four companies. Four independent implementations. The same architecture emerged every time. When multiple teams solving the same problem converge on the same solution — that's not a design choice. That's a discovery.
OWASP sees it too. Their Top 10 for Agentic Applications explicitly distinguishes between prompt-level controls (necessary but insufficient) and infrastructure-level controls (the critical layer). NIST AI RMF 1.0 maps to the same four functions: Govern, Map, Measure, Manage. ISO 42001 requires the same four categories of controls (OWASP Top 10 for Agentic AI; NIST AI RMF 1.0; ISO/IEC 42001).
The next chapter shows how each subsystem works.
For each subsystem, assess whether you have any implementation at all. "Partial" counts as false — compliance auditors don't grade on effort.
// ASSESS YOUR ENVELOPE
gate = enforcement_level: harness | prompt | none
ledger = captures: full_trajectory | io_only | none
governor = budget_enforcement: per_agent | global | none
witness = verification_model: cross_model | adversarial | same | none
// Any "none" = compliance-blind. Any "prompt" = compliance-theater.
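The checklist above can be made executable. A minimal sketch — the field names and answer values come from the pseudocode, the scoring rules from its comments; the function name and the "envelope present" verdict are my additions:

```python
def assess_envelope(gate, ledger, governor, witness):
    """Score a compliance envelope. 'Partial' counts as 'none' —
    auditors don't grade on effort."""
    answers = {"gate": gate, "ledger": ledger,
               "governor": governor, "witness": witness}
    # Any "none": at least one subsystem is missing entirely.
    if any(value == "none" for value in answers.values()):
        return "compliance-blind"
    # Any "prompt": enforcement is suggested to the model, not imposed on it.
    if any(value == "prompt" for value in answers.values()):
        return "compliance-theater"
    return "envelope present"

print(assess_envelope(gate="harness", ledger="full_trajectory",
                      governor="per_agent", witness="cross_model"))
```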