The Standard
In 1996, Tim Berners-Lee and the HTTP working group published RFC 1945, the specification for HTTP/1.0. It was 60 pages. It defined how a client requests a resource and how a server responds. It was simple, precise, and, critically, it worked as a standard: one specification that every client and server could implement.
Before HTTP, every organization that wanted to serve documents over a network built its own protocol. Gopher, WAIS, Archie. Incompatible, duplicated, doomed. After HTTP, everyone built on the same foundation. The standard eliminated the wasted engineering effort and redirected it toward the applications that mattered.
In 2026, every company building production agent systems is building its own compliance infrastructure. Custom logging. Custom guardrails. Custom budget tracking. Custom verification. Incompatible, duplicated, doomed.
The compliance harness needs to become a standard.
The complete SOC-2 mapping
Every Trust Services Criterion that applies to agent systems maps to a specific harness subsystem. This is not a theoretical exercise — it is the document your SOC-2 auditor will ask for.
| SOC-2 Control | Requirement | Subsystem |
|---|---|---|
| CC4.1 | Ongoing evaluations | Witness — continuous quality sampling |
| CC5.1 | Control activities (accountability) | Gate + Ledger — per-agent identity, per-action trail |
| CC6.1 | Logical access | Gate — policy engine with typed permissions |
| CC6.3 | Authorization | Gate — scope limits, human gates |
| CC6.8 | Prevent unauthorized software | Governor — agent spawning limits |
| CC7.2 | Monitor for anomalies | Ledger + Witness — cost anomaly + canaries |
| CC7.3 | Evaluate security events | Ledger — tamper-evident chain for forensics |
| CC7.4 | Incident response | Ledger — full trajectory replay for RCA |
| CC8.1 | Manage changes | Ledger — decision ledger tracks state transitions |
| CC9.1 | Risk mitigation | Governor — SLA decomposition, reliability targets |
| PI1 | Processing integrity | Witness — output verification |
When your auditor asks "how do you ensure processing integrity for AI-generated decisions?" — point to the Witness. When they ask "how do you prevent unauthorized actions?" — point to the Gate. When they ask "show me the evidence" — export from the Ledger.
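One way to keep this mapping auditable is to hold it as data rather than prose. The sketch below is illustrative only: `SOC2_MAP`, `controls_for`, and `uncovered` are hypothetical names, and the entries simply transcribe the control IDs and subsystems from the table above.

```python
# Hypothetical encoding of the SOC-2 mapping table above.
# Keys are control IDs; values are the harness subsystems named in the table.
SOC2_MAP = {
    "CC4.1": ["Witness"],
    "CC5.1": ["Gate", "Ledger"],
    "CC6.1": ["Gate"],
    "CC6.3": ["Gate"],
    "CC6.8": ["Governor"],
    "CC7.2": ["Ledger", "Witness"],
    "CC7.3": ["Ledger"],
    "CC7.4": ["Ledger"],
    "CC8.1": ["Ledger"],
    "CC9.1": ["Governor"],
    "PI1": ["Witness"],
}


def controls_for(subsystem: str) -> list[str]:
    """Return the control IDs that a given subsystem supplies evidence for."""
    return [cid for cid, subs in SOC2_MAP.items() if subsystem in subs]


def uncovered(built: set[str]) -> list[str]:
    """Return controls that depend on at least one subsystem not yet built."""
    return [cid for cid, subs in SOC2_MAP.items() if not set(subs) <= built]


if __name__ == "__main__":
    print(controls_for("Governor"))
    print(uncovered({"Gate", "Ledger"}))
```

Calling `uncovered({"Gate", "Ledger"})` lists the controls that still lack a mapped subsystem when only the Gate and the Ledger exist, which is a quick way to see the audit gap at any point in a build.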
The regulatory cross-reference
SOC-2 is not the only framework. Six regulatory bodies independently require the same four subsystems.
Counting SOC-2, that makes seven. When seven independent regulatory and industry bodies, spanning continents, industries, and decades of practice, converge on the same four-subsystem architecture, that is not a design choice. That is a discovery of the minimum viable compliance structure for autonomous systems.
The SLA contract
Here is what a real SLA contract for an agent-powered system looks like when backed by the compliance harness:
// AVAILABILITY
target = 99.9% (8.76 hours downtime/year)
measurement = successful_completions / total_requests
// LATENCY
p50 = 30 seconds
p95 = 120 seconds
p99 = 300 seconds
// CORRECTNESS
target = 99.5% (human review of 5% sample, rolling 30 days)
// COST
per_request_cap = $5.00 (Governor-enforced)
monthly_cap = $10,000
// RELIABILITY ARCHITECTURE
per_stage_target = 99.98% (5 stages → 99.9% system)
verification = cross-model (Witness Layer 2)
retry_budget = 2 per stage
fallback = human escalation after retries exhausted
// INCIDENT RESPONSE
detection = <5 minutes (canary + anomaly detection)
root_cause_method = trajectory replay (Ledger export)
evidence_format = immutable hash chain
Every number in that contract is enforceable because every number is backed by a specific harness subsystem. The Governor enforces the cost cap. The Witness provides the correctness measurement. The Ledger supplies the evidence. The Gate controls access. This is what "production-grade" means.
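The availability and per-stage figures in the contract can be checked with a few lines of arithmetic. This is a minimal sketch, not part of any harness API: the function names are hypothetical, and the composition formula assumes the five stages run in series and fail independently, which the contract does not state explicitly. Retries are not modeled.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Downtime allowed per year at a given availability target."""
    return (1.0 - availability) * HOURS_PER_YEAR


def system_reliability(per_stage: float, stages: int) -> float:
    """End-to-end success rate, assuming independent stages in series."""
    return per_stage ** stages


def required_per_stage(system_target: float, stages: int) -> float:
    """Per-stage target needed to reach a system target over `stages` stages."""
    return system_target ** (1.0 / stages)


if __name__ == "__main__":
    # 99.9% availability -> hours of downtime per year
    print(round(downtime_hours_per_year(0.999), 2))
    # five stages at 99.98% each -> system reliability
    print(round(system_reliability(0.9998, 5), 5))
    # per-stage target implied by a 99.9% system target over five stages
    print(round(required_per_stage(0.999, 5), 5))
```

Under the independence assumption, five stages at 99.98% compose to just over 99.9%, which is why the per-stage target has to be so much tighter than the system target.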
The error budget model
Google's SRE discipline introduced error budgets: a quantified allowance for failure that balances reliability with velocity. The same model applies to agent systems.
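As a rough illustration, an error budget is the complement of the target: a 99.9% target leaves 0.1% of requests that may fail within the measurement window. The sketch below applies that to the `successful_completions / total_requests` measurement from the SLA contract. The `ErrorBudget` class and its field names are hypothetical, and the request counts are made-up example values.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    target: float         # e.g. 0.999 for a 99.9% availability target
    total_requests: int   # requests observed in the window
    failed_requests: int  # requests that did not complete successfully

    @property
    def allowed_failures(self) -> float:
        """Failures the target permits for this request volume."""
        return (1.0 - self.target) * self.total_requests

    @property
    def remaining(self) -> float:
        """Failures still available before the budget is spent."""
        return self.allowed_failures - self.failed_requests

    @property
    def exhausted(self) -> bool:
        """True once observed failures exceed the allowance."""
        return self.remaining < 0


# Example values only: 100,000 requests in the window, 40 failures so far.
budget = ErrorBudget(target=0.999, total_requests=100_000, failed_requests=40)

if __name__ == "__main__":
    print(round(budget.allowed_failures), round(budget.remaining), budget.exhausted)
```

A common SRE convention is to slow or pause risky changes once `exhausted` becomes true and resume when the window rolls forward. What counts as a risky change for an agent system (a new prompt, a new model, a new tool) is a policy decision this sketch leaves open.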
Implementation roadmap
Four phases. Sixteen weeks. Each phase builds on the previous.
| Phase | Weeks | Build | Outcome |
|---|---|---|---|
| 1. Foundation | 1-4 | Ledger (trajectory + hash chain), Gate (basic allow/deny) | You can see what happened and control who does what |
| 2. Verification | 5-8 | Witness Layer 1 (deterministic), canary system, decision ledger | Errors are caught. State is persistent. |
| 3. Reliability | 9-12 | Witness Layer 2 (cross-model), SLA decomposition, retry architecture | You can offer measurable reliability guarantees |
| 4. Compliance | 13-16 | SOC-2 evidence export, Witness Layer 4 (sampling), human escalation | You can pass a SOC-2 audit |
Start with the Ledger. Without trajectory capture, you cannot debug, you cannot audit, you cannot improve. Everything else depends on visibility.
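To make "trajectory + hash chain" concrete, here is a minimal in-memory sketch of a tamper-evident log. It is an assumption about how such a Ledger could work, not the book's implementation: the `Ledger` class, the `GENESIS` constant, and the event fields are all hypothetical, and SHA-256 over canonical JSON is one reasonable choice among several.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor hash for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical event encoding."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class Ledger:
    """Append-only log in which each entry commits to the one before it."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, event: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        entry_hash = _digest(prev_hash, event)
        self.entries.append({"event": event, "prev": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited, removed, or reordered entry fails."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True


if __name__ == "__main__":
    ledger = Ledger()
    ledger.append({"agent": "agent-1", "action": "read_file", "cost_usd": 0.02})
    ledger.append({"agent": "agent-1", "action": "send_email", "cost_usd": 0.01})
    print(ledger.verify())
    ledger.entries[0]["event"]["action"] = "delete_file"  # simulate tampering
    print(ledger.verify())
```

The chain only detects tampering if the latest hash is also recorded somewhere the writer cannot alter. Otherwise an attacker with write access could rebuild the whole chain. Where to anchor that hash is a deployment decision outside this sketch.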
The inevitable architecture
In 1994, every web developer built their own HTTP handler. By 1997, Apache was the most widely deployed web server on the internet. The standard won because building your own was waste.
In 2010, every team built their own deployment pipeline. By 2018, CI/CD was expected infrastructure. The standard won because building your own was waste.
In 2026, every team building production agent systems is building its own compliance harness. Custom logging. Custom guardrails. Custom budgets. Custom verification. All incompatible. All incomplete.
The compliance harness is not a product idea. It is the inevitable architecture that every company building production agent systems will arrive at independently — the same way every web application eventually needed HTTPS, every database eventually needed ACID, and every deployment eventually needed CI/CD.
The question is not whether this will be built.
The question is who names it first.
The HTTP spec was 60 pages. The SOLID principles were five sentences. The Gang of Four patterns were twenty-three names. The most enduring contributions to software engineering are not the largest. They are the ones that give the industry a shared vocabulary for solving a shared problem.
Gate. Ledger. Governor. Witness.
Four words. One architecture. The standard for building compliant agent systems.
The harness is the discipline.
The discipline is the moat.
Now you have the blueprint.
The Skill Stack by Anup Neupane
18 chapters · 7 named patterns · 5 AGENT principles · 4 harness subsystems · 47+ research citations
This book was researched, truth-extracted, and written using agent composition — the same architecture it teaches. The compliance harness described in Part VIII was audited by five parallel research agents across regulatory frameworks, reliability engineering, security enforcement, observability tooling, and cost governance.