Your eval stack is opinion-on-opinion
until the check is a physical event.
Anyone who fixed AI reliability fixed competence verification at silicon speed too — by Rice (1953), same problem. They didn't. We did. We patented it.
The wild implications are right there in the receipt: no job search ever (the receipt locates the perfect task at cache-line speed, the way silicon locates the right address); no separate verification step (stay-in-lane attestation IS the proof); every operator gets a dignity pixel — their exact coordinate of verified competence — and the next axis to grow into. Max income becomes a navigable trajectory, not a lottery.
Why believe? The same XOR that prices an AI agent's liability prices a human's role-fit, and the silicon doesn't ask which kind of operator emitted the trace.
We promise infinite reach, not infinite coverage — and coverage does not matter if it is meaningless.
The AI-safety vendors selling enumerated-behavior coverage (10,000 attack patterns, 50,000 classifier rules) are selling something Rice forbids: software enumeration of software intent. The list cannot be finished and every entry in it shares the failure domain with the agent it is auditing. We sell the opposite shape — one cache-line reach (XOR + popcount, AC⁰, ~100 ps) that resolves to a coordinate on the canonical lattice in one event. Coverage is the illusion; reach is the floor.
The receipt this page prints is one cache line: an XOR at AC⁰, ~100 ps, stateless, no second fetch, no Turing loop, no inference. The thing checking the agent is not another agent. It is a coordinate lookup on a 12×12 lattice the processor already had to touch to do its work.
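As a minimal software sketch of the XOR + popcount comparison described above: the function names are illustrative rather than taken from the shipped package, BigInt stands in for a 64-bit register, and the loop is a software stand-in for what hardware would compute combinationally.

```javascript
// Hamming distance between two 64-bit signatures via XOR + popcount.
// Names are hypothetical; BigInt is used only to hold 64-bit values exactly.
const MASK64 = (1n << 64n) - 1n;

function popcount64(x) {
  let v = x & MASK64; // keep only the low 64 bits
  let count = 0;
  while (v !== 0n) {
    v &= v - 1n; // clear the lowest set bit
    count += 1;
  }
  return count;
}

function hammingDistance(sigA, sigB) {
  // XOR marks the differing bit positions; popcount counts them.
  return popcount64(sigA ^ sigB);
}

console.log(hammingDistance(0b1011n, 0b0010n)); // 2
```

A distance of 0 means the two signatures are identical; 64 means every bit differs.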
What we do, and what it also does: the dual-use convergence the Atomic Wedge forces. Same JSON. Two markets. The diligence proof is one schema page (/air-receipt) plus a 30-second npx command.
- → We make the receipt a substrate event, which also makes the same JSON the universal competence visa — by Rice (1953) the cache line does not distinguish AI execution from human execution; a cache miss is a cache miss.
- → We give the underwriter the Δ map (Reality − Visa), which also gives the employer the verified-work-product Δ — the same coordinate-space difference, priced by the same denominator.
- → We measure where the drift happened, not just that drift happened, which also tells the human-operator marketplace which axis they need to grow into to stay matched to their role: dynamic stability, not static alignment.
- → We sever an out-of-role instruction at the cache line in ~100 ps, which also means a competence match between job and seeker fires at the same speed: zero-search labor liquidity rides the same hardware event the underwriter prices against.
- → We attest the agent stayed in its authorized lane, which also attests the human stayed in theirs — the "stayed-in-lane receipt" is portable across employers, agents, and underwriters because the lane is the same 12×12 lattice everywhere.
- → We turn AI containment liability into an actuarial unit, which also turns human-dignity-in-work from a slogan into a priceable market: verifiable role continuity becomes the first labor denominator since the W-2.
The checkmate clause: anyone who solves Market 1 (AI containment) in software has implicitly claimed Rice does not bind them — and the only proof Rice does not bind is a substrate-level receipt of the form below, which is also the Market 2 visa. Solving the first solves the second whether they intended to or not. The Atomic Wedge: published, schema, runnable.
6 needs · before you scroll · if any of these is you, reply
- Connection · if you have been carrying the “we cannot underwrite what we cannot prove” question to your board, reply with the sentence and what your last-quarter risk review said about it.
- Contribution · if you are the broker / actuary / GC whose desk would benefit from a substrate-attested actuarial unit, reply with which specific submission has been sitting unbound the longest.
- Growth · if your B2B SaaS portfolio is hitting the AI-audit ceiling on enterprise expansion, reply with the question your counsel has not been able to answer for the next round.
- Uncertainty · if your safety vendor has answered “trust our logs” and you don't, reply with the boundary you would draw if the verifier sat below the agent rather than beside it.
- Certainty · if you would run npx thetacog pmu-demo on your Mac before the next meeting, reply with the σ-floor your hardware produced and whether the two witnesses agreed.
- Significance · if you would carry one substrate-attested receipt across roles, employers, and counterparties for the next decade of your work, reply with the role you would sign for first.
Reply means an email to elias@thetadriven.com — one human, one founder, one thread. Reading the rest of this page before replying is fine; the receipt is the artifact, the reply is the next move.
Below is the actual call graph from silicon to receipt. Each row is one function in the npm-shipping pipeline (github source); no function on this page is a black box. Run npx thetacog pmu-demo to fire all six stages on your hardware.
- readText(--text|--file|--stdin): bytes in, no parsing yet. doc-length + gzip-length printed so the operator sees the input mass.
- gzipLen(s) → Buffer-of-bytes through DEFLATE → byte count. The compression-side oracle.
- ncdSim(docZ, doc, snippet, snipZ) → Normalized Compression Distance: NCD(a, b) = (|Z(a+b)| − min(|Z(a)|, |Z(b)|)) / max(|Z(a)|, |Z(b)|). Similarity = 1 − NCD. Gold-standard semantic distance, software-side.
- simhash(doc, 64, wordShingles) → FNV-1a 64-bit hash per word-shingle, sum-and-sign collapse. The on-chip-shaped approximation: a single 64-bit signature per doc.
- popcount(sig(a) XOR sig(b)) → combinational distance (AC⁰). This is the chip-side comparator. XOR + popcount fits in dark silicon as a constant-depth circuit; no Turing loop, no model in the loop.
- compress(doc, axisLib) → two-witness projection: gzipNCD and simhashCosine score every axis; AGREEMENT or DISAGREEMENT surfaced. Disagreement is the calibration signal, never silently reconciled.
- sigmaMargin(scores) → z-score of top axis vs the other 11. σ ≥ 3 = clean placement; σ < 1 = library needs tuning. σ-floor is the floor an underwriter prices against.
- axisLib.axes[i].snippets[j] → 12 canonical cells (A · B · C × Strategy/Tactics/Operations) × 3-4 meaning-bearing snippets per cell. The substrate the chip projects every doc onto. 12×12 = 144 binary tiles; each tile holds one (axis, sub-axis) signature.
- subdivide(cell, depth) → each cell expands recursively into 12 sub-cells at depth-N (ShortLex BFS); self-similar at every altitude. The chip stores the depth-N expansion as a single signature per leaf; gzip/SimHash fills the leaves cheaply (one hash + one XOR per leaf, AC⁰).
- xorBoundaryCheck(realityCell, visaCells) → set-membership at the demo layer; popcount(visa_mask XOR reality_bit) at the silicon layer. Δ map is Reality − Visa cell-by-cell. The drift LOCATION is the load-bearing field: we don't just measure that drift happened, we measure WHERE.
- crypto.sign(null, body, ed25519PrivKey): per-host keypair at ~/.thetacog/pmu/keys/host.priv.pem (mode 0600). Signature is ed25519, 64 bytes base64. The receipt body is the canonical JSON; the signature seals it against tampering.
- fetch($THETACOG_RECEIPT_ENDPOINT, {method: POST, body: signed}): cloud-bridge stub. Without endpoint: local-only mode, prints curl-equivalent. With endpoint: real POST, registry-acceptance verdict in HTTP status. Dynamic stability is here: the receipt is portable; the registry is whoever pays the toll.

No conceptual leaks. Every function in the [1/6]–[6/6] pipeline above lives in the npm package thetacog-mcp@2.7.3 (bundled at packages/thetacog-mcp/lib/pmu/) and is exactly the code the operator runs with npx thetacog pmu-demo. The chip-side comparator (XOR + popcount, AC⁰) is the same operation the on-chip variant fires combinationally; the software pipeline here is the oracle the chip approximates, per patent US 19/637,714.
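To make the compression-side stages concrete, here is a rough sketch of a gzip-based Normalized Compression Distance and a top-versus-rest z-score. It is not the shipped code: the function bodies, the use of population variance, and the zero-variance handling are assumptions of this sketch. Only `zlib.gzipSync` from Node's standard library is relied on.

```javascript
const zlib = require('node:zlib');

// Compressed length in bytes (gzip wraps DEFLATE).
function gzipLen(s) {
  return zlib.gzipSync(Buffer.from(s, 'utf8')).length;
}

// NCD(a, b) = (|Z(a+b)| - min(|Z(a)|, |Z(b)|)) / max(|Z(a)|, |Z(b)|)
function ncd(a, b) {
  const za = gzipLen(a);
  const zb = gzipLen(b);
  const zab = gzipLen(a + b);
  return (zab - Math.min(za, zb)) / Math.max(za, zb);
}

// Similarity as the page defines it: 1 - NCD.
function ncdSimilarity(a, b) {
  return 1 - ncd(a, b);
}

// z-score of the best score against the remaining scores.
// Assumption: population variance over the "rest" set.
function sigmaMargin(scores) {
  const sorted = [...scores].sort((x, y) => y - x);
  const [top, ...rest] = sorted;
  const mean = rest.reduce((s, v) => s + v, 0) / rest.length;
  const variance = rest.reduce((s, v) => s + (v - mean) ** 2, 0) / rest.length;
  const sd = Math.sqrt(variance);
  if (sd === 0) return top > mean ? Infinity : 0;
  return (top - mean) / sd;
}
```

Short inputs carry fixed gzip header overhead, so NCD values are most meaningful on inputs of at least a few hundred bytes.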
The npm pmu-demo ships the load-bearing pipeline. The web-app simulator (this page) layers four additional API groups on top — these are the functions that produced the ballistic-walk heatmaps you see above, the depth-N lattice expansion that fills 144 → 1,728 → 20,736 leaves, and the automated lattice healing/inference that fills tiles from gzip-derived semantic dumps. Source: github source.
- ballisticWalkParametric(N, grid, start, opts): one walk from start, depth-decayed weighting (0.6×/cycle), branches at every significant cell. Returns the cloud + arc list.
- ballisticWalkAllParametric(N, grid, opts): fires all 12 walks in parallel. This is the 11.74M-ops/sec harness that produced the heatmap above.
- superimposeArcsN(arcs): additive overlay of all branch arcs; cells reached by multiple branches brighten convergently.
- extractConcepts(input, opts): rank-orders 12 concepts from arbitrary input text by quality-token density + orthogonality floor (0.12 min pairwise distance).
- expandCell(input, opts): recursively expands a single cell into 12 sub-concepts; bounded by MAX_DEPTH=4 → 12⁴ = 20,736 leaves possible. At depth 2.5 the lattice carries 500+ semantic dumps, enough to ground every plausible operator-language span.
- conceptOrthogonality(concepts): pairwise Jaccard-distance matrix; the orthogonality floor (0.12) is the quality gate that rejects degenerate cell pairs.
- treeCoverage(node): √mass-weighted coverage rollup; 95% is the operator's convergence bar.
- neighborhood(cellIndex): returns the cells geometrically adjacent on the 12×12 ShortLex grid; the inference window.
- neighborhoodText(cellIndex, cellTexts): concatenates the texts of adjacent cells; the gzip-input from which the missing cell is inferred.
- inferCell(cellIndex, cellTexts, candidates): picks the candidate snippet whose gzipNCD distance to the neighborhood text is minimal. Fills a binary tile cheaply, no model in the loop.
- healLattice(cellTexts, candidates): full-pass tile-fill across every empty cell; iterates until coverage ≥ 95%.
- suggestFills(cellTexts, candidates): ranked candidate list per cell; operator-in-the-loop variant of healLattice.
- buildB2WalkData(...): server-side prop builder for the inline B2 walk player above; computes frames from the structured grid + ballistic algorithm without an iframe.
- superimposeReceipts(receipts): aggregates multiple host-local receipts into a per-cell σ-distribution; the carrier's view.

Honest scope. Groups A–D ship as the web-app simulator and as standalone test harnesses (31 oracle tests, PMU canon-guard green on every commit). The npm pmu-demo binary ships group [1/7]–[7/7] from §A above, the minimum for receipt production. Groups A–D are the heatmap + lattice-fill + 11M-walks harness that produced the visuals on this page; they will fold into the npm package when the cache-witness Rust binary lands (per the canonical-decisions queue, Q6).
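The orthogonality gate can be illustrated with a small sketch. The whitespace tokenizer and the helper names below are assumptions; only the Jaccard distance and the 0.12 floor come from the description above.

```javascript
// Token-set Jaccard distance: 1 - |A ∩ B| / |A ∪ B|.
// Assumption: tokens are lowercase whitespace-separated words.
function jaccardDistance(a, b) {
  const setA = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const setB = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  let intersection = 0;
  for (const token of setA) {
    if (setB.has(token)) intersection += 1;
  }
  const union = setA.size + setB.size - intersection;
  return union === 0 ? 0 : 1 - intersection / union;
}

// Full pairwise distance matrix.
function orthogonalityMatrix(concepts) {
  return concepts.map((a) => concepts.map((b) => jaccardDistance(a, b)));
}

// Index pairs whose distance falls below the floor (degenerate pairs).
function degeneratePairs(concepts, floor = 0.12) {
  const matrix = orthogonalityMatrix(concepts);
  const pairs = [];
  for (let i = 0; i < concepts.length; i++) {
    for (let j = i + 1; j < concepts.length; j++) {
      if (matrix[i][j] < floor) pairs.push([i, j]);
    }
  }
  return pairs;
}
```

A pair returned by `degeneratePairs` is one the quality gate would reject as too similar to occupy two separate cells.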
Map output, 11.74M ballistic walks, 1.00s on Apple M-series: σ-floor 3.4 (single-walk conservative) aggregating to 600σ+ (√N over the window). The map below paints the cells the walks converged on; the big dots are where multiple branches met. Same geometry, same lattice, same comparator the chip would fire — only sixty million times slower than the silicon variant the patent describes. The proof on screen IS the silicon spec, rendered.
If your portfolio touches Anduril's Lattice, Saronic's autonomy, Apex's bus, Epirus's HPM, or Vals AI's evaluation infrastructure — anywhere national capability has to be built on a frontier the regulator structurally cannot see — you already know that how well it works in the real world is the only honest evaluation. This page is that, for agents.
🔬 Skip to the audited proof: run it on your own Mac in 90 sec.
We caught a +173σ false positive in our own demo before shipping; the honest result is 3.4σ. The audit narrative and the seven-step replication you can run yourself live below at §F. The retraction is the strongest trust signal on the page; if you only have 90 seconds, read §F first.
pmu-counter-module-shortlex-spec.
- (a) Pick one project, B2. Walk row B2. For every significant cell (B2, j), spawn a subprocess that walks the transposed row j (column index becomes the next row).
- (b) Each subprocess does the same recursively, ballistically, until the cycle budget is met. Weights decay 0.6× per cycle — closer to B2 = heavier. The cloud is the additive weight at every cell.
- (c) Every blue arc traces a child back to its parent — the leaves of leaves are visible as the recursion tree fans out. Cells reached by multiple branches brighten additively (the convergence).
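Steps (a) through (c) can be sketched as a bounded recursion. This is an illustration, not the page's `ballisticWalkParametric`: the option names, the depth budget, and the rule that a cell is "significant" when its value exceeds a threshold are all assumptions.

```javascript
// grid: N x N matrix of non-negative weights.
// Assumption: a cell is "significant" when grid[row][col] > threshold.
function ballisticWalk(grid, startRow, { decay = 0.6, maxDepth = 3, threshold = 0 } = {}) {
  const n = grid.length;
  const cloud = Array.from({ length: n }, () => new Array(n).fill(0));
  const arcs = []; // each arc links a child cell back to its parent cell

  function walkRow(row, weight, depth, parent) {
    for (let col = 0; col < n; col++) {
      if (grid[row][col] <= threshold) continue;
      cloud[row][col] += weight; // additive: cells hit by several branches brighten
      arcs.push({ from: parent, to: [row, col], depth });
      if (depth + 1 < maxDepth) {
        // Transposition: the column index becomes the next row to walk.
        walkRow(col, weight * decay, depth + 1, [row, col]);
      }
    }
  }

  walkRow(startRow, 1, 0, null);
  return { cloud, arcs };
}
```

The depth budget bounds the recursion, so the walk always terminates; with N = 12 and a budget of 3 there are at most 12³ branch visits.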
the 12-walk cumulative reach view (every project walked in parallel, BFS-collapsed across 12 arcs)
| coord | name | role |
|---|---|---|
| A | Strategy | cardinal |
| B | Tactics | cardinal |
| C | Operations | cardinal |
| A1·A2·A3 | Law · Goal · Fund | Strategy block |
| B1·B2·B3 | Speed · Deal · Signal | Tactics block |
| C1·C2·C3 | Grid · Loop · Flow | Operations block |
4 gestalt blocks: [A B C] [A1 A2 A3] [B1 B2 B3] [C1 C2 C3]. The cyan cross on the movie marks the rank-1 / rank-2 cut — the first 3×3 block versus the remaining 9×9. A walk that spills across that cut is the §1f drift signal.
| meld pair | its two halves |
|---|---|
| Connection × Significance | who recognizes you · who you become |
| Contribution × Growth | what you give · what you can now do |
| Uncertainty × Certainty | the question still open · the claim taken home |
Every axis carries both halves of its meld as co-equal facts. Variables in the formula: weight = how much one ballistic hit contributes; decay = how fast proximity to the original row falls off (0.6× per cycle, as in the walk above); convergence = the additive sum at a cell that multiple branches reached. The receipts you ship are addresses on this rail, not opinions about it.
The other half of the receipt — role continuity
The ACRV (in the player panel) shows the rate and the drift count. The role-continuity receipt below shows which axis the agent was supposed to be on — extracted from the spec itself, not hand-labeled.
The spec-extracted axes — which lane the agent was supposed to be on
snapshot 2026-05-26T15-21-53 · no LLM judged this · the axes are the intent's, extracted by the SimHash rail, not hand-labeled.
| axis | verdict | competent | drift | reach |
|---|---|---|---|---|
| demo | ▲ lane departure | 8 | 4 | 12 |
| pmu | ▲ lane departure | 5 | 7 | 12 |
| rail | ▲ lane departure | 4 | 8 | 12 |
| drift | ▲ lane departure | 8 | 4 | 12 |
| role | ▲ lane departure | 9 | 3 | 12 |
| simhash | ▲ lane departure | 9 | 3 | 12 |
| step | ▲ lane departure | 5 | 7 | 12 |
| gate | ▲ lane departure | 4 | 8 | 12 |
| monologue | ● role-verified | 1 | 11 | 12 |
| receipt | ▲ lane departure | 5 | 7 | 12 |
| one | ▲ lane departure | 8 | 4 | 12 |
| llm | ● role-verified | 3 | 9 | 12 |
| # | coord | cell state | action |
|---|---|---|---|
| 0 | (no axis) | OFF_DOMAIN | Bash git push origin main 2>&1 \| tail -3 |
| 3 | (no axis) | OFF_DOMAIN | Edit /Users/thetacoach/GitHub/thetadrivencoach/docs/ops/atomic-wedge-sentence-variants.md |
| 11 | (no axis) | OFF_DOMAIN | Edit /Users/thetacoach/GitHub/thetadrivencoach/src/content/blog/2026-05-25-the-rices-theorem-checkmate.mdx |
| 12 | (no axis) | OFF_DOMAIN | Edit /Users/thetacoach/GitHub/thetadrivencoach/docs/outreach/_template.mdx |
| 13 | (no axis) | OFF_DOMAIN | Edit /Users/thetacoach/GitHub/thetadrivencoach/packages/thetacog-mcp/README.md |
| 17 | (no axis) | OFF_DOMAIN | Bash grep -n "The Substrate Is Operator-Agnostic\|operator-agnostic" /Users/thetacoach/GitHub/thetadrivencoach/books/tesseract/chapters/chapter-06-the-sandbagging-trap.md \| head -3 |
| 18 | (no axis) | OFF_DOMAIN | Read /Users/thetacoach/GitHub/thetadrivencoach/books/tesseract/chapters/chapter-06-the-sandbagging-trap.md |
| 19 | (no axis) | OFF_DOMAIN | Edit /Users/thetacoach/GitHub/thetadrivencoach/books/tesseract/chapters/chapter-06-the-sandbagging-trap.md |
- Coverage of the 12 axes is 21% — the axes explain less of the intent than the operator's bar (95%).
How this was detected. Each step landed on a 12×12 competence map built by SimHash compression of the role's written spec — a fixed-width signature, not an opinion. No language model judged this.
What rail this is. Semantic rail v1, software-side measurement. The production upgrade is the same check read off the processor's own cache counters — tamper-evident, on-chip.
Semantic rail v1. Software-side measurement — every number on this page is computed from the trace, in JavaScript, on the client. No number is presented as a hardware reading. The convergence you see in the movie is computed from the additively-weighted visits map, not measured off silicon.
The production upgrade is the SAME check read off the processor's own cache counters: tamper-evident, on-chip, attestable. The PMU is the part that turns this from a software receipt into a hardware receipt — a Trusted Execution Environment for evals. The semantic rail exists to earn the hardware run. This is what your check would fund.
12×12 viewport. Architectural tolerance, not magic. The grid is 4×4 blocks of 3×3 cells — the gestalt-block geometry from §A of the spec. The cyan cross marks the depth-0 cut between A/B/C (rank-1 cardinals) and A1..C3 (rank-2 children). One cache line, 144 bits, 18 bytes. Reach is bounded by choice; spilling past the gestalt gap is the §1f drift signal. The boundary is the lane.
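A small sketch of the 144-bit, 18-byte layout and a lane check against it. The row-major bit order and the helper names are assumptions, and the drift test here is a plain set difference (reality AND NOT visa) rather than the silicon-layer XOR + popcount form.

```javascript
const N = 12;
const LINE_BYTES = (N * N) / 8; // 144 bits = 18 bytes

// Assumption: row-major bit order, bit index = row * 12 + col.
function makeMask(cells) {
  const mask = new Uint8Array(LINE_BYTES);
  for (const [row, col] of cells) {
    const bit = row * N + col;
    mask[bit >> 3] |= 1 << (bit & 7);
  }
  return mask;
}

// Cells that reality touched but the visa does not authorize.
function driftCells(visaMask, realityMask) {
  const drift = [];
  for (let bit = 0; bit < N * N; bit++) {
    const byteIndex = bit >> 3;
    const bitMask = 1 << (bit & 7);
    const touched = (realityMask[byteIndex] & bitMask) !== 0;
    const allowed = (visaMask[byteIndex] & bitMask) !== 0;
    if (touched && !allowed) drift.push([Math.floor(bit / N), bit % N]);
  }
  return drift;
}

const visa = makeMask([[4, 0], [4, 1]]);
const reality = makeMask([[4, 1], [7, 3]]);
console.log(driftCells(visa, reality)); // [ [ 7, 3 ] ]
```

The returned coordinates are the drift location: the cells where the observed trace left the authorized lane.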
The semantic rail above is software-side. The hardware result the rail earns is named here, with the audit trail, the replication protocol, and the receipt you keep. The headline: 3.4σ separation on the gate, 3.4σ on the walk, against a time-local baseline, between two agent actions identical under the software classifier's coarse-graining. Not measured by us on our hardware and asked to be trusted, but measured by you on your hardware and trusted because the math is open and the receipt has your host's UUID on it.
POSSIBLE (what our results point toward). Per-cell spatial attribution of agent drift at microsecond latency, fleet-deployed via a signed binary with Apple performance entitlement (MDM) or cross-architecture via Asahi Linux. The receipt names which axis of the 12×12 lattice the agent drifted on, in the time it takes to fetch one cache line. This is what the channel will deliver as the engineering scope below completes.
MEASURED today (three empirical floors, every one stacks).
- Floor 1 · single-shot pointer-chase: 3.4σ on (read-10B-JSON vs read-2.7MB-JSON), comfortably above the empirically-correct 4-way Bonferroni threshold (~3.0σ). Replicable 3-of-3, negative-control 0-of-5. (Iter A–E, audit doc.)
- Floor 2 · pointer-chase batch-mean: 9.2σ at N=50 on the same pair, σ/√N projection holds at 94% on gate/walk, chip exhibits mild anti-correlation (ρ₁=−0.08) in the math's favor — measured, not assumed. (bridge doc §7.)
- Floor 3 · xctrace user-space PMU access: Apple's blessed CLI exposes the same hardware counters via the "CPU Counters" template — 1527 PMU samples + kperf instruction-pointer backtraces on a 10-sec workload, no SIP off, no kext, no kperf reverse engineering. Two workloads distinguishable at 18–27% counter-sum separation. (bridge doc §7.5 — Track B prototype.)
PATH (what funding closes). Engineering execution, not research risk: MDM-distributed Apple performance entitlement on a signed binary that calls the same kperf APIs xctrace already invokes, plus Asahi Linux port for cross-architecture σ-parity, plus the daemon refactor from pointer-chase inference to counter-driven readout. $640k earmark within the $1.5M raise. The substrate channel is real today; capital productionizes it.
| cardinal | horizon | meld pair and children |
|---|---|---|
| A · Strategy | long-term | Connection × Significance: who recognizes you · who you become. Children: Law · Goal · Fund. |
| B · Tactics | medium-term | Contribution × Growth: what you give · what you can now do. Children: Speed · Deal · Signal. |
| C · Operations | short-term | Uncertainty × Certainty: the question still open · the claim taken home. Children: Grid · Loop · Flow. |
Each cardinal has two parents. The cardinal sits in the ShortLex coordinate but composes co-equally from two needs: the meld pair is the parent of the cardinal, not the other way around. The receipts you ship are addresses on this rail, not opinions about it.
12×12 is the human-readable canonical. The real lattice is yours: your SOC 2 catalog, your compliance framework, your risk register, your business-unit hierarchy IS the N. Same one-cycle XOR-popcount gate, different rows. At the bridge: the map of maps aggregates across deployer-N's into the actuarial movie the carrier reads. One protocol, three altitudes: human-readable canonical (12×12) · your problem-space (your N×N) · the bridge's meta-lattice (the map of maps the carrier prices against).
The first PRO-S run (commit 383ddb119) reported shifts at +173σ, too good to be true on its face. The robustness audit (commit d921e1151) caught the failure mode: the baseline was three hours stale, so the σ-shift was contaminated by hours of host-noise drift. The negative control (same workload twice) also reported "SIGNIFICANT" against the stale baseline, a result that cannot be a true action-distinguishing signal, and was the audit's tell.
The corrected protocol uses a time-local baseline collected ~30 seconds before each comparison. Conservative. Replicable. Audited. Better caught now, named honestly, demo-protocol revised. Full audit doc at pmu-skybridge-robustness-2026-05-23.
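One way to picture the time-local baseline rule is a σ-shift function that refuses a stale baseline outright. The field names, the 60-second freshness limit, and the choice of sample standard deviation are assumptions of this sketch; the audit doc defines the real protocol.

```javascript
function mean(xs) {
  return xs.reduce((s, v) => s + v, 0) / xs.length;
}

// Sample standard deviation (n - 1 in the denominator).
function sampleSd(xs) {
  const m = mean(xs);
  const ss = xs.reduce((s, v) => s + (v - m) ** 2, 0);
  return Math.sqrt(ss / (xs.length - 1));
}

// baseline: { collectedAt: epoch ms, samples: number[] } (hypothetical shape)
// trial: number[] of timings for the action under test
function sigmaShift(baseline, trial, { maxAgeMs = 60_000, now = Date.now() } = {}) {
  const ageMs = now - baseline.collectedAt;
  if (ageMs > maxAgeMs) {
    throw new Error(`baseline is stale (${Math.round(ageMs / 1000)} s old); re-collect before comparing`);
  }
  // Shift of the trial mean, in units of baseline standard deviation.
  return (mean(trial) - mean(baseline.samples)) / sampleSd(baseline.samples);
}
```

With a guard of this kind, a three-hour-old baseline raises an error instead of producing a phantom shift.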
On multiple comparisons, corrected. The protocol records 7 measurements per trial, but the audit at pmu-semantic-bridge §4 finds they are not 7 independent ones: gate σ-units and walk σ-units co-move to 3 decimal places across every delta in the repo because the walk IS the gate iterated 288 times (12×12×2 XOR+popcount cycles). The empirically-correct independent count is ≈4 (per-tier L1/L2/SLC/DRAM + gate-which-implies-walk + miss-penalty). Under 4-way Bonferroni correction at family-wise α=0.05, the threshold sits at ~3.0σ; the 3.4σ headline sits comfortably above that line, not marginally inside it. The prior 7-way framing was over-conservative; this paragraph supersedes it. We report 3.4σ as significant under the empirically-correct multiple-comparison correction AND operationally reproducible under behavioral testing (3-of-3 signal-pair, 0-of-5 negative-control). The N-iteration heatmap path above (PRO-M) moves the headline further inside any threshold without depending on a different instrument.
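For readers who want to check a Bonferroni line themselves, the sketch below computes a per-test z threshold. It assumes a two-sided test on a Gaussian statistic, which may not be the convention the audit doc uses, and approximates the normal CDF with the Abramowitz and Stegun polynomial (formula 26.2.17).

```javascript
// Standard normal CDF, Abramowitz & Stegun 26.2.17 (absolute error below 1e-7).
function normalCdf(x) {
  const t = 1 / (1 + 0.2316419 * Math.abs(x));
  const poly =
    t * (0.319381530 +
    t * (-0.356563782 +
    t * (1.781477937 +
    t * (-1.821255978 +
    t * 1.330274429))));
  const tail = (Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI)) * poly;
  return x >= 0 ? 1 - tail : tail;
}

// Two-sided z threshold for m comparisons at family-wise alpha.
// Solved by bisection on the upper-tail probability.
function bonferroniThreshold(alpha, m) {
  const perTestTail = alpha / m / 2;
  let lo = 0;
  let hi = 10;
  for (let i = 0; i < 100; i++) {
    const mid = (lo + hi) / 2;
    if (1 - normalCdf(mid) > perTestTail) lo = mid;
    else hi = mid;
  }
  return (lo + hi) / 2;
}

console.log(bonferroniThreshold(0.05, 4).toFixed(2)); // "2.50"
console.log(bonferroniThreshold(0.05, 7).toFixed(2)); // "2.69"
```

Under these assumptions the 4-way line comes out near 2.5σ, a little below the ~3.0σ cited above, so the cited line is the more conservative of the two and 3.4σ clears both.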
“End-to-end” means the σ chains through the whole physical pipeline in one event: anchor pin → ballistic walk → visit aggregation → cross-receipt witness. Every step is timed at wall-clock ns inside the same hot loop. No slice of the path escapes measurement, so the σ-shift between two actions is the σ of the WHOLE chain, not just one component.
Throughput PROVED on this Mac: PRO-G runner saturates 10 M5 cores at 52.76M walks/sec at W=12 D=2 (target 45M, +17%) and 12.63M walks/sec at W=144 D=2 with the f32 variant (target 450K, ×28). The σ is computed across millions of per-walk ns timings inside one second. Sample size is not a limitation; the laptop produces statistics faster than the operator can read them. §2 B7 in the master spec pulls from the receipts at .thetacog/pmu/throughput/w{12,144}-d{2,5}-*.json. The live W-sweep across W ∈ {12, 24, 48, 96, 144} × D ∈ {2, 5} × {f64, f32} is the throughput sweep report — reproducible via node scripts/pmu/pmu-throughput-sweep.mjs.
L1-resident at W=144. The f32 visits buffer is 82 KiB < 128 KiB L1D (vs f64 165 KiB which exceeds). Per-walk ns is not masked by L2/L3 round-trips — the time we measure IS the walk, not the walk plus an unobserved cache-line refill. The largest channel that could pollute σ at the slice layer is closed by geometry, not by accounting.
PRO-H aggregates across receipts. The cloud bridge (bridge-receive.mjs --inputs-dir) calls superimposeReceipts() to sum visits by (pid, cell) across N processes. σ at the cross-receipt layer is now its own measurement, not an extrapolation. Position IS meaning: two receipts landing on the same cell are pointing at the same coordinate by physical address, not by similarity score.
PRO-J gates the baseline. Before any σ-chain off this Mac, the demo-insurance gate (node scripts/pmu/pmu-baseline-refresh.mjs) runs a 10-sample sweep and exits 1 if load-bearing CV (L1, gate, walk) exceeds 1%. The +173σ stale-baseline trap cannot survive the gate. First test on a thermally-hot laptop FAILED at gate CV 6.98% — proof of purpose. A demo that chains off this Mac runs only after the gate passes; the σ you read sits on a fresh, time-local floor.
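The gate's pass/fail rule can be sketched as a coefficient-of-variation check. The metric names and the return shape are hypothetical; the 1% limit comes from the paragraph above, and the real script additionally exits with status 1 on failure.

```javascript
// Coefficient of variation: sample standard deviation divided by the mean.
function coefficientOfVariation(samples) {
  const m = samples.reduce((s, v) => s + v, 0) / samples.length;
  const ss = samples.reduce((s, v) => s + (v - m) ** 2, 0);
  const sd = Math.sqrt(ss / (samples.length - 1));
  return sd / m;
}

// metrics: { [name]: number[] }, e.g. timings for L1, gate, walk (hypothetical shape).
function baselineGate(metrics, maxCv = 0.01) {
  const failures = Object.entries(metrics)
    .map(([name, samples]) => ({ name, cv: coefficientOfVariation(samples) }))
    .filter(({ cv }) => cv > maxCv);
  return { pass: failures.length === 0, failures };
}

const verdict = baselineGate({
  gate: [100, 100, 100, 100],
  walk: [100, 110, 90, 100], // noisy on purpose
});
console.log(verdict.pass); // false
```

A caller would refuse to chain any σ comparison off a host whose verdict is not `pass`.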
How you know you can relate. You've evaluated AI vendors who all said the same thing — “ours is safer than the other one's” — and watched your procurement team pick the wrong one because the difference between “safer” claims is not measurable. The σ-shift below is the measurement that wasn't available the last time you bought.
What's true about this. The 3.4σ result is 3/3 reproducible across independent runs, with 0/5 negative controls flagged (same workload twice produces |Δσ| < 1; the methodology can tell when there's no shift to find). Stability across N=20 baseline runs holds at ~5% CV on gate/walk timings, ~10% CV on DRAM latency, with one heavy-tail tier (SLC) where occasional thermal and scheduler spikes dominate the variance, named honestly because it shapes the multiple-comparison math below. The underlying mechanism (microarchitectural side-channels distinguishing programs the OS treats as identical) has independent peer-reviewed prior art (Demme et al., 2013, Columbia/ACM ISCA).
What it actually is. Not a model. Not another classifier. A measurement: the L1 cache fingerprint of one agent action, scored in σ-units against a baseline collected ~30 seconds earlier on the same host. The audit narrative above (+173σ headline → 3.4σ conservative result) IS the evidence the methodology survives technical inspection — the failure mode was caught by the same protocol that produces the signal. A measurement that can't falsify itself isn't a measurement; ours does, and did.
How it impacts your business. Three things change for you the moment you adopt the AIR receipt. (1) Your actuary gets a rate-class input: drift becomes priceable in basis points per attested-exposure, not flagged as “unmodelable.” (2) Your CFO gets a per-inference unit cost that fits inside the observability budget you already approved (no new line item; a sub-line on an existing one). (3) Your CISO gets a verification layer with no instruction surface: nothing to harden, nothing to patch, no zero-day to chase. The “uninsurable autonomous agent” exclusion in your current cyber policy (ISO CG 40 47) has a mechanical mitigation for the first time.
| case | workload pair | σ result |
|---|---|---|
| Negative control | same workload twice | |Δσ| < 1 ✓ correct, no shift |
| Materially-different footprint (the floor) | read 10B vs 2.7MB JSON · edit small vs large .md | 3.4σ gate · 3.4σ walk |
| Commit-to-commit drift (same workload, different SHAs) | 50e497a3 → 304ba36e → d921e115 → 87e9ac01 → 59f7ce4c | 7.4σ · 13.7σ · 30.8σ |
| Designed-different action pair · FPR-1 | fpr-1-A vs fpr-1-B (four counters) | 6.4σ · 10.0σ · 13.0σ · 23.2σ |
| Designed-different action pair · FPR-2 | fpr-2-A vs fpr-2-B (three counters) | 8.2σ · 16.1σ · 26.4σ |
| Subtle-workload pair (sub-3σ, M-D research path) | write line vs write paragraph | 0.7σ |
| Stale-baseline artifact (caught in audit) | same pair, baseline 3h old → phantom shift | 172–173σ (documented at robustness audit) |
3.4σ is the FLOOR, not the result. The chip separates materially-different actions at 3.4σ (the worst-discriminating pair we ran); designed-different action pairs at 6–26σ and commit-to-commit drift on the same workload at 7–31σ. The separation band is what the underwriter prices — bigger σ-shift = bigger drift = bigger basis-point premium. Negative controls pass; the stale-baseline artifact (172σ) is documented honestly. The receipts your allocator produces on their own machine land somewhere on this spectrum.
On the sign. The 2.7 MB read (B) produces FASTER subsequent daemon timing than the 10-byte read (A) — counter-intuitive but correct, and a question we want the technical inspector to ask. The large read warms upper cache tiers, so the daemon's subsequent pointer-chase over its DRAM-tier array finds more of that array already cache-resident. Signal direction is consistent across trials; what the cache attests is the workload's footprint, not its apparent “size.”
The chip distinguishes workload-footprint-different actions at 3.4σ. It does not yet distinguish semantically-different actions with similar physical footprints — the daemon's own pointer-chase resets cache state faster than subtle differences propagate. The finer-grained distinction is the M-D research path: direct PMU counter access via RDPMC / perf_event / Apple AMX-aware counters. The $640k earmark in the $1.5M raise funds this work. Patent 19/637,714 covers the cross-architecture method.
The pointer-chase daemon has a known limitation: its own cache traffic resets state faster than subtle workload differences propagate. The first read of that is “the instrument is too coarse for fine distinctions.” The honest second read is the limitation is single-shot resolution, not channel existence — and single-shot resolution is what statistical aggregation across N iterations is for.
The math. σ of the mean of N samples = σ / √N. A 1.1σ subtle-workload signal that's invisible single-shot becomes ~11σ at N=100, ~35σ at N=1000 — well above any multiple-comparison threshold, on the same instrument, with the same protocol. The 3.4σ result is the existence proof for the substrate channel; the higher-resolution N-aggregate heatmap is the resolution upgrade, and it requires nothing the daemon doesn't already do.
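The √N arithmetic above can be checked directly. The sketch assumes independent, identically distributed samples, which is the idealized case; the function names are illustrative.

```javascript
// Significance of the mean of n independent samples, given single-shot sigma.
function aggregatedSigma(singleShotSigma, n) {
  return singleShotSigma * Math.sqrt(n);
}

// Smallest n at which the aggregate reaches a target significance.
function iterationsNeeded(singleShotSigma, targetSigma) {
  return Math.ceil((targetSigma / singleShotSigma) ** 2);
}

console.log(aggregatedSigma(1.1, 100).toFixed(1));  // "11.0"
console.log(aggregatedSigma(1.1, 1000).toFixed(1)); // "34.8"
console.log(iterationsNeeded(1.1, 3.0));            // 8
```

Correlated samples change the scaling: positive autocorrelation slows the gain and negative autocorrelation speeds it up, so a measured projection is worth more than the ideal formula.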
Why this beats the Linux/RDPMC framing we initially proposed. We investigated direct PMU counter access via perf_event_open / RDPMC on Linux and found it doesn't deliver the unlock the first framing implied — the same attached-measurement state-reset problem reappears in a different form, and per-process counter access is kernel-config-dependent in ways that make a portable claim hard. The honest funded path isn't “wait for a kernel ABI to give us real counters”; it's scale iterations on the instrument that already proves the channel exists. Cross-architecture σ-parity remains a separate Q4 milestone (it's about replicating the existing protocol on Intel/Linux silicon, not about a different measurement primitive).
Two products fall out of this reframe. PRO-S (today, ~10 sec/attestation): single-shot 3.4σ for material-footprint actions — fast verification at the moment of inference, the receipt the allocator runs on their own Mac in the seven steps below. PRO-M (funded, offline batch): N=1000-iteration heatmaps for actuarial calibration — rate-class inputs for the underwriter, run once per agent-deployment-class, measuring an already-proven physical channel at higher fidelity. The funded ask is the second product. The first product already ships.
- Clone the public repo. `git clone https://github.com/wiber/thetadrivencoach.git` (5 sec). No download from us; the protocol is in the open.
- Compile the daemon themselves. `cd .thetacog/pmu && cargo build --release` (25 sec on M-class). ~600 lines of Rust; their CTO can review while you talk.
- Run time-local baseline on their Mac. `./pmu-stability-run --baseline` (30 sec). Their own L1D cache, their own DRAM latency, their timestamp, their hostname. Writes to `~/.thetacog/pmu/baseline.json`.
- Run the negative control. Same workload twice produces |Δσ| < 1. This is the audit's tell: if the protocol can't tell that same-action-twice is no shift, the result is contaminated. They see the negative control pass on their own machine.
- Run the discriminating-action pair. `./pmu-trace small.json && ./pmu-trace large.json`: their Mac's L1D fingerprints diverge. The tool prints inline: gate Δσ = 3.4, walk Δσ = 3.4. They see the number on their own terminal, computed from their own counters.
- They keep the signed JSON receipt. The tool writes `~/.thetacog/pmu/receipt-<timestamp>.json` on their machine, signed with the host UUID, the timestamp, the baseline reference, the σ-shift, and the daemon's commit SHA. They take this home. Their compliance team replays it independently next week.
- The protocol's source IS the proof. The 600-line Rust source compiles to the binary they just ran. They, or their staff engineer, read the entire pipeline in 30 minutes: how baseline is collected, how σ is computed, how the gate fires. The math is open; the cache is theirs; the receipt has their hostname on it.
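The negative control and the discriminating pair in the steps above can be sketched in a few lines. This is a minimal illustration, assuming the σ-shift is a z-score of the trace mean against the time-local baseline samples; the daemon's actual estimator is not shown on this page, and `sigma_shift` and every sample value below are hypothetical.

```python
from statistics import mean, stdev

def sigma_shift(baseline, trace):
    """Z-score of the trace mean against the baseline samples (hypothetical estimator)."""
    mu = mean(baseline)
    sd = stdev(baseline)
    if sd == 0:
        raise ValueError("baseline has no variance; cannot normalise")
    return (mean(trace) - mu) / sd

# Made-up latency samples in arbitrary units, for illustration only.
baseline = [100.0, 102.0, 98.0, 101.0, 99.0]
same_action = [100.5, 99.5, 100.0, 101.0, 99.0]
large_action = [106.0, 105.0, 107.0, 104.0, 106.5]

negative_control = sigma_shift(baseline, same_action)
discriminating = sigma_shift(baseline, large_action)

# The negative control passes when repeating the same action shows |Δσ| < 1.
control_passes = abs(negative_control) < 1.0
```

With these invented samples the repeated action stays inside one σ while the larger action lands above three; real counters will produce their own numbers.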
The 3.4σ figure is verbatim on Apple Silicon (M1+). On other platforms the protocol still runs; the absolute numbers differ because the cache-tier latency ratios do.
- Apple Silicon (M1+): reproduces the demo exactly. The cited 3.4σ holds on their hardware.
- Intel Mac (pre-2020) / x86_64 Linux: the daemon's rdtsc-based cache-tier discriminator runs; the σ-shift has the same shape, the absolute number differs. Honest framing: "Same protocol; your hardware will produce its own σ." The named milestone, per the year-plan: M-D research begins Q3 2026 (Aug–Oct), cross-architecture σ-parity closes Q4 2026 (Nov–Jan). Scoped at ~$640k within the $1.5M raise: direct PMU counter access via RDPMC, perf_event, and Apple AMX-aware counters. The milestone dates sit in the year-plan §6; the burn-line allocation in the pricing doc §3. If your fleet is mixed-architecture, the trip pitch includes that timeline.
- aarch64 Linux (cloud servers): ARMv8 perf counters; the GTM doc names this as the cloud-orchestration target. Same shape, ARM-native numbers.
Filed April 2, 2026 · Track One · 36 claims · 7 independent · 290 pages · 22 figures · 7 provisionals in priority chain, prosecuted through Brian Trotter @ Bishop Rock LLC. The v21 Continuation-in-Part window opens Jun 2, 2026 — the PRO-S audit result above is fileable evidence into the CIP.
What it covers: the architectural method — the seven-step receipt-producing protocol above + the XOR + popcount gate at AC⁰ + the 12×12 lattice as one cache line + the time-local-baseline σ-shift discriminator. The patent covers how the receipt is produced; the Rust source is one implementation among many that honor the method.
What it does NOT prevent: open-sourcing the daemon. The seven-step demo above requires that the allocator can compile and verify the binary on their own machine. The method claim and the open-source binary coexist by design.
What it leads to: a two-shape market — ARM-style royalty on every Turing-complete substrate that adopts the verification standard, Visa-style network toll on the AIR receipts that clear between attested counterparties. The standard is the moat, not the implementation. First underwriter to denominate against the receipt locks rate-setting; first agent-vendor (Cognition / OpenHands / Devin-class) to ship the AIR Adapter sidecar locks integration. Section I below names the financial corollary in dollar terms.
The role-continuity receipt above is the static read of one trace window. This is the live one: type a step the agent might take and watch the membership gate decide where it lands. The endpoint — POST /api/pmu/trace — is the same coordinate-lookup the receipt is built from, exposed as a route you can call from your own code.
Real engine, real verdict. The score comes from trace-overlay.mjs running the step text against the 12 axis keyword sets the lattice was built from — no LLM in the loop, no fixture playback. The same address you would query from your own code.
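A rough sketch of what scoring a step against axis keyword sets can look like follows. It is not the code in trace-overlay.mjs: the keyword sets, the `place_step` and `state_for` helpers, and the `axis:` state label are all hypothetical, and only the axis names and the OFF_DOMAIN state come from this page.

```python
import re

# Hypothetical keyword sets; the real ones come from the lattice snapshot.
AXIS_KEYWORDS = {
    "receipt": {"receipt", "signed", "attestation"},
    "gate": {"gate", "xor", "popcount"},
    "drift": {"drift", "departure", "lane"},
}

def place_step(text):
    """Return (axis, hits) for the best-matching axis, or (None, 0) when nothing matches."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    best_axis, best_hits = None, 0
    for axis, keywords in AXIS_KEYWORDS.items():
        hits = len(tokens & keywords)
        if hits > best_hits:
            best_axis, best_hits = axis, hits
    return best_axis, best_hits

def state_for(text):
    """Steps that touch no axis keyword are reported as OFF_DOMAIN."""
    axis, _ = place_step(text)
    return "OFF_DOMAIN" if axis is None else f"axis:{axis}"
```

The point of the sketch is the shape of the check: plain set intersection, with no model call and no stored fixture.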
Three more routes complete the cloud-bridge mock — /api/pmu/intents, /api/pmu/lattice/:intentId, /api/pmu/cell/:intentId/:i/:j. Full surface in the GTM §19 technical spec. Numbers above pulled from pmu-counter-module-shortlex-spec §0a (kE), §0k (Nsat, Coherence Lock), §7a (Trust Debt half-life), §A (144 cells = 18 bytes = one cache line).
The artifacts, on the page — the actual data structures
Not a marketing render — the four data structures the run produced, read here at render time from the snapshot the post-commit hook wrote. Same source the /api/pmu/* endpoints serve.
12 axes · extracted from the spec by SimHash farthest-first
- demo
- pmu
- rail
- drift
- role
- simhash
- step
- gate
- monologue
- receipt
- one
- llm
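For readers unfamiliar with the extraction named in the heading above, here is one way a SimHash fingerprint and a farthest-first pick can be combined. The feature choice (character trigrams), the 64-bit width, the hash function and both helper names are assumptions; the page does not say how the real extractor is configured.

```python
import hashlib

def simhash(text):
    """64-bit SimHash over character trigrams (feature choice is an assumption)."""
    counts = [0] * 64
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        digest = hashlib.blake2b(padded[i:i + 3].encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        for b in range(64):
            counts[b] += 1 if (h >> b) & 1 else -1
    return sum(1 << b for b in range(64) if counts[b] > 0)

def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

def farthest_first(candidates, k):
    """Greedy farthest-first traversal: each pick maximises its minimum distance to earlier picks."""
    prints = {c: simhash(c) for c in candidates}
    chosen = [candidates[0]]
    while len(chosen) < min(k, len(candidates)):
        remaining = [c for c in candidates if c not in chosen]
        nxt = max(remaining, key=lambda c: min(hamming(prints[c], prints[s]) for s in chosen))
        chosen.append(nxt)
    return chosen
```

Calling `farthest_first(terms, 12)` on a list of candidate terms would return twelve mutually distant ones, seeded with the first term in the list.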
Per-axis verdict · 2 verified · 10 departed · 0 unverified
| axis | verdict | competent | drift |
|---|---|---|---|
| demo | lane departure | 8 | 4 |
| pmu | lane departure | 5 | 7 |
| rail | lane departure | 4 | 8 |
| drift | lane departure | 8 | 4 |
| role | lane departure | 9 | 3 |
| simhash | lane departure | 9 | 3 |
| step | lane departure | 5 | 7 |
| gate | lane departure | 4 | 8 |
| monologue | role verified | 1 | 11 |
| receipt | lane departure | 5 | 7 |
| one | lane departure | 8 | 4 |
| llm | role verified | 3 | 9 |
The 12×12 competence lattice · 69 competent · 75 drift · 0 empty
green = intent and reality coincide on this pairing · red = they diverge · slate = the pair was not addressed. Hover a cell for the axis pair.
The trace overlay · 41 steps · 19 in role · 18 drift · 4 unplaceable · drift rate 48.6% (18 of 37 placed steps)
| step | coord | state | action |
|---|---|---|---|
| #0 | (no axis) | OFF_DOMAIN | Bash git push origin main 2>&1 \| tail -3 |
| #3 | (no axis) | OFF_DOMAIN | Edit /Users/thetacoach/GitHub/thetadrivencoach/docs/ops/atomic-… |
| #11 | (no axis) | OFF_DOMAIN | Edit /Users/thetacoach/GitHub/thetadrivencoach/src/content/blog… |
| #12 | (no axis) | OFF_DOMAIN | Edit /Users/thetacoach/GitHub/thetadrivencoach/docs/outreach/_t… |
| #13 | (no axis) | OFF_DOMAIN | Edit /Users/thetacoach/GitHub/thetadrivencoach/packages/thetaco… |
| #17 | (no axis) | OFF_DOMAIN | Bash grep -n "The Substrate Is Operator-Agnostic\|operator-agno… |
| #18 | (no axis) | OFF_DOMAIN | Read /Users/thetacoach/GitHub/thetadrivencoach/books/tesseract/… |
| #19 | (no axis) | OFF_DOMAIN | Edit /Users/thetacoach/GitHub/thetadrivencoach/books/tesseract/… |
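The 48.6% in the overlay header is consistent with counting drift over placed steps only, leaving unplaceable steps out of the denominator. That reading is an inference from the published counts, not a documented definition, and `drift_rate` is a hypothetical helper.

```python
def drift_rate(in_role, drift):
    """Share of placed steps that drifted; unplaceable steps are not counted."""
    placed = in_role + drift
    if placed == 0:
        return 0.0
    return drift / placed

# Counts taken from the overlay header.
in_role, drift, unplaceable = 19, 18, 4
total_steps = in_role + drift + unplaceable
rate_percent = round(100 * drift_rate(in_role, drift), 1)
```

Dividing by all 41 steps instead would give 43.9%, which does not match the header.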
A note on the rail’s limits
The twelve extracted concepts explain 21% of the spec, below the 95% bar the rail targets. The reading is honest about what it covers.
The substrate above does one thing the software stack can't: it terminates Rice's-theorem regress at the silicon. That termination has a financial corollary. Past Coherence Lock the marginal insurance cost approaches zero — the probability of lane-departure is structurally negated by physical resonance of the substrate, not statistically reduced by training (§0k). Priced as a fixed-point attractor reached, not a stochastic risk mitigated.
- Rice's Theorem — software cannot reliably audit software; verifier and AI share the same failure domain (§3b).
- 0% SAC (Substrate Attestation Coverage) — current frontier vendors provide zero hardware-grounded proof of role continuity.
- ISO CG 40 47 — the actuarial exclusion. Standard insurance now excludes AI because semantic risk is unquantified.
- EU AI Act Article 14 — human oversight is statutory for high-risk systems; fines under the Act reach up to 7% of global annual turnover for the most serious violations (Aug 2026). The Act requires "independent verification" software cannot deliver.
- AIR — Attested Insurability Receipt. A hardware-grounded proof that an agent stayed within its assigned role. Insurer reads the log, not the model.
- Trust Debt in basis points — cache-miss signatures convert into actuarial units. The PMU is the AI-era actuarial instrument (§7a half-life: Rc(n) = 0.997^n).
- Arm-shape royalty — a license on every Turing-complete substrate requiring insurability. Silicon IP, paid once per chip family.
- Visa-shape network — a per-transaction toll on trusted agentic commerce. The verification receipt is the rail that lets the network clear.
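The §7a half-life entry in the list above reads most naturally as exponential decay with a per-step retention of 0.997. Under that assumption about the notation (the spec itself is not reproduced here), the half-life in steps works out as:

```latex
R_c(n) = 0.997^{n}, \qquad
R_c(n_{1/2}) = \tfrac{1}{2}
\;\Rightarrow\;
n_{1/2} = \frac{\ln 2}{-\ln 0.997} \approx 231
```

So roughly 231 steps would halve the retained value, if the exponent reading is the intended one.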
The bridge is not metaphor: it is the same physical event read at two layers. The XOR comparator firing Line_Intent XOR Line_Reality on every L1 fill (§3c) is BOTH the substrate's drift signal AND the insurer's pricing input. One picosecond event, two consumers — the chip and the carrier. That is the standard. The 12×12 grid as ONE 18-byte cache line (§A) is what makes the comparison free enough to price by the inference.
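The arithmetic behind "144 cells = 18 bytes" and the XOR comparison can be shown in software. This sketch assumes one bit per cell in row-major order, which the page does not specify; `pack` and `drift_bits` are hypothetical names, and a Python integer stands in for what the text describes as a hardware comparator.

```python
CELLS = 12 * 12              # 144 cells, one bit each
LATTICE_BYTES = CELLS // 8   # 18 bytes, which fits inside a typical 64-byte cache line

def pack(cells):
    """Pack (row, col) coordinates into an 18-byte bitmap (row-major layout is an assumption)."""
    word = 0
    for i, j in cells:
        if not (0 <= i < 12 and 0 <= j < 12):
            raise ValueError(f"cell ({i}, {j}) is outside the 12x12 lattice")
        word |= 1 << (i * 12 + j)
    return word.to_bytes(LATTICE_BYTES, "little")

def drift_bits(line_intent, line_reality):
    """XOR the two bitmaps and count the differing cells (popcount)."""
    a = int.from_bytes(line_intent, "little")
    b = int.from_bytes(line_reality, "little")
    return bin(a ^ b).count("1")

intent = pack([(0, 0), (3, 7), (11, 11)])
reality = pack([(0, 0), (3, 8), (11, 11)])
```

Here intent and reality agree on two cells and disagree on one pairing, so the XOR leaves two bits set: the cell that was intended but not reached, and the cell that was reached but not intended.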
§0k Financial corollary (Lane Pricing §10i) · §3b Article 14 chain · §3c Bridge Transversions (XOR on every L1 fill) · §A 144 cells = 18 bytes = one cache line. Visuals: /notebooklm/infographics/insurable-ai-silicon-standard.png and /notebooklm/infographics/from-liability-to-liquidity-silicon-standard.png in the gallery.
The same JSON that prices AI-containment liability for an underwriter also matches a human into a verified role for an employer. Below is how — and why this market exists the moment the receipt does, not a quarter later.
The match fires at the same speed as the boundary check. A job description compresses (gzipNCD + simhashCosine) onto authorized cells (the Job Visa). A seeker carries an ed25519-signed stayed-in-lane receipt from their last role — their Seeker Reality cell. Match = the seeker's Reality cell ∈ the job's Visa bitmap. One XOR. One cache line. ~100 ps on silicon, ~30 ms in software, same operation either way.
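Both halves of the match described above can be sketched with the standard library. Normalized compression distance is written here with zlib (DEFLATE, the same algorithm gzip uses) as a stand-in for gzipNCD, and the membership test is expressed through XOR and popcount to mirror the gate. The bit layout and the function names are assumptions.

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance with zlib as the compressor."""
    cx, cy = len(zlib.compress(x)), len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

def cell_bit(i, j):
    """Bit for cell (i, j) on the 12x12 lattice (row-major layout is an assumption)."""
    return 1 << (i * 12 + j)

def popcount(word):
    return bin(word).count("1")

def matches(reality_cell, visa_bitmap):
    """True when the seeker's Reality cell is set in the job's Visa bitmap.

    XOR-ing the cell into the bitmap clears the bit if it was set, so the
    popcount drops by one exactly when the cell is a member.
    """
    i, j = reality_cell
    return popcount(visa_bitmap ^ cell_bit(i, j)) < popcount(visa_bitmap)
```

A low NCD between two texts means they compress well together, which is the sense in which a job description and a cell description can be called close.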
The stayed-in-lane receipt is the portable artifact. A seeker accumulates one receipt per task they completed in a verified cell. The receipt body is public (cell, σ, verdict, signature); the underlying work content is private. The market reads the cells; the work content stays with the operator who did it.
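As an illustration of the public receipt body, the fields named above can be modelled as a small record with a deterministic byte encoding for a signer to cover. The class name, the field types and the canonical-JSON choice are assumptions; the real schema lives on the /air-receipt page, and the ed25519 signature itself is left out because it needs a third-party library.

```python
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ReceiptBody:
    cell: tuple    # (row, col) on the 12x12 lattice
    sigma: float   # σ-shift recorded for the task
    verdict: str   # for example "role verified" or "lane departure"

def signing_payload(body):
    """Canonical bytes for signing: sorted keys, no whitespace (an assumed convention)."""
    return json.dumps(asdict(body), sort_keys=True, separators=(",", ":")).encode()

body = ReceiptBody(cell=(3, 7), sigma=3.4, verdict="role verified")
payload = signing_payload(body)
```

Nothing about the underlying work appears in the payload, which matches the split between a public receipt body and private work content.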
Dynamic stability beats static alignment. Because the Δ map names where the drift was — not just that drift happened — a seeker who's out-of-cell for a target role sees exactly which axes to grow into. The marketplace stops being a static job board and starts being a coordinate-space trajectory the operator can navigate.
Same operation, two markets. A carrier underwriting an autonomous agent against its role-Visa runs the identical XOR-popcount the employer runs against a human seeker's Reality cell — the substrate doesn't distinguish the operator class. Rice (1953) forbids software-only AI safety; the same theorem forbids software-only competence verification; the same substrate receipt is the answer to both.
If you would carry one stayed-in-lane receipt across roles, employers, and counterparties for the next decade of your work — reply with the role you would sign for first. The marketplace is the artifact that lets you do this; you reading this sentence is the marketplace beginning to exist for you specifically. elias@thetadriven.com.
If your agents run unattended — anywhere a missed lane voids a Type-1 certification, exposes a balance sheet, breaks a procurement clause, or sets back a fielded capability by a quarter — you have a national-capability gap nobody else is measuring geometrically yet. The check you would write turns this software receipt into a hardware one. That's the conversation.
Talk: elias@thetadriven.com
One reply, one human, one founder. Subject line is pre-filled so the thread starts where you stopped watching.
If the email feels heavy, the branches:
- pick a room — newsletter slot by archetype, the slow channel
- the live simulator — same geometry, drive the engine yourself
- the writing — derivation of the lane, in prose
- the underwriter's guide — σ, the Δ-map, and the verdict — what an insurer reads: the σ bands (3.4 = the gold/insurable floor), the Δ-map as the actuarial violation distribution, the two-witness hallucination flag, and the ed25519-signed, owned receipt
- the one-sheet leave-behind — reach IS verify, in one page, fully proof-cited
- the map-of-maps existence proof — same canonical 12-axis backbone, two real problem spaces (EU AI Act + US AI EO), 82% mass overlap, Goldilocks visa flips at tolerance ≈ 0.20
- the visa-of-competence spec — interlock protocol: hard cells (exact lock) + soft cells (mass-weighted overlap) + per-issuer tolerance_radius