The receipt your eval stack cannot write

A1.Strategy.Law · meld Connection × Significance

Your eval stack is opinion-on-opinion
until the check is a physical event.

Anyone who fixed AI reliability fixed competence verification at silicon speed too — by Rice (1953), same problem. They didn't. We did. We patented it.

The wild implications are right there in the receipt: no job search ever (the receipt locates the perfect task at cache-line speed, the way silicon locates the right address); no separate verification step (stay-in-lane attestation IS the proof); every operator gets a dignity pixel — their exact coordinate of verified competence — and the next axis to grow into. Max income becomes a navigable trajectory, not a lottery.

Why believe? The same XOR that prices an AI agent's liability prices a human's role-fit, and the silicon doesn't ask which kind of operator emitted the trace.

We promise infinite reach, not infinite coverage — and coverage does not matter if it is meaningless.

The AI-safety vendors selling enumerated-behavior coverage (10,000 attack patterns, 50,000 classifier rules) are selling something Rice forbids: software enumeration of software intent. The list cannot be finished and every entry in it shares the failure domain with the agent it is auditing. We sell the opposite shape — one cache-line reach (XOR + popcount, AC⁰, ~100 ps) that resolves to a coordinate on the canonical lattice in one event. Coverage is the illusion; reach is the floor.

The receipt this page prints is one cache line — an XOR at AC0, ~100 ps, stateless, no second fetch, no Turing loop, no inference. The thing checking the agent is not another agent. It is a coordinate-lookup on a 12×12 lattice the processor already had to touch to do its work.

Same receipt, two marketsthe implications, written out

What we do also does — the dual-use convergence the Atomic Wedge forces. Same JSON. Two markets. The diligence proof is one schema page (/air-receipt) plus a 30-second npx command.

→ We make the receipt a substrate event, which also makes the same JSON the universal competence visa — by Rice (1953) the cache line does not distinguish AI execution from human execution; a cache miss is a cache miss.
→ We give the underwriter the Δ map (Reality − Visa), which also gives the employer the verified-work-product Δ — the same coordinate-space difference, priced by the same denominator.
→ We measure where the drift happened, not just that drift happened, which alsotells the human-operator marketplace which axis they need to grow into to stay matched to their role — dynamic stability, not static alignment.
→ We sever an out-of-role instruction at the cache line in ~100 ps, whichalso means a competence match between job and seeker fires at the same speed — zero-search labor liquidity rides the same hardware event the underwriter prices against.
→ We attest the agent stayed in its authorized lane, which also attests the human stayed in theirs — the "stayed-in-lane receipt" is portable across employers, agents, and underwriters because the lane is the same 12×12 lattice everywhere.
→ We turn AI containment liability into an actuarial unit, which alsoturns human-dignity-in-work from a slogan into a priceable market — verifiable role continuity becomes the first labor denominator since the W-2.

The checkmate clause: anyone who solves Market 1 (AI containment) in software has implicitly claimed Rice does not bind them — and the only proof Rice does not bind is a substrate-level receipt of the form below, which is also the Market 2 visa. Solving the first solves the second whether they intended to or not. The Atomic Wedge: published, schema, runnable.

6 needs · before you scroll · if any of these is you, reply

Connection · if you have been carrying the “we cannot underwrite what we cannot prove” question to your board, reply with the sentence and what your last-quarter risk review said about it.
Contribution · if you are the broker / actuary / GC whose desk would benefit from a substrate-attested actuarial unit, reply with which specific submission has been sitting unbound the longest.
Growth · if your B2B SaaS portfolio is hitting the AI-audit ceiling on enterprise expansion, reply with the question your counsel has not been able to answer for the next round.
Uncertainty · if your safety vendor has answered “trust our logs” and you don't, reply with the boundary you would draw if the verifier sat below the agent rather than beside it.
Certainty · if you would run npx thetacog-mcp attest-demo on your Mac before the next meeting, reply with the σ-floor your hardware produced and whether the two witnesses agreed.
Significance · if you would carry one substrate-attested receipt across roles, employers, and counterparties for the next decade of your work, reply with the role you would sign for first.

Reply means an email to elias@thetadriven.com — one human, one founder, one thread. Reading the rest of this page before replying is fine; the receipt is the artifact, the reply is the next move.

Chip → Userevery function call on this page, named

Below is the actual call graph from silicon to receipt. Each row is one function in the npm-shipping pipeline (github source); no function on this page is a black box. Run npx thetacog-mcp attest-demo to fire all six stages on your hardware.

[1/6 INGEST] readText(--text|--file|--stdin) — bytes in, no parsing yet. doc-length + gzip-length printed so the operator sees the input mass.

[2/6 BINARY DECOMP · the chip-cheap fill]

gzipLen(s) → Buffer-of-bytes through DEFLATE → byte count. The compression-side oracle.

ncdSim(docZ, doc, snippet, snipZ) → Normalized Compression Distance: (|Z(a+b)| − min) / max. Similarity = 1 − NCD. Gold-standard semantic distance, software-side.

simhash(doc, 64, wordShingles) → FNV-1a 64-bit hash per word-shingle, sum-and-sign collapse. The on-chip-shaped approximation: a single 64-bit signature per doc.

popcount(sig(a) XOR sig(b)) → combinational distance (AC⁰). This is the chip-side comparator. XOR + popcount fits in dark silicon as a constant-depth circuit; no Turing loop, no model in the loop.

compress(doc, axisLib) → two-witness projection: gzipNCD and simhashCosine score every axis; AGREEMENT or DISAGREEMENT surfaced. Disagreement is the calibration signal, never silently reconciled.

sigmaMargin(scores) → z-score of top axis vs the other 11. σ ≥ 3 = clean placement; σ < 1 = library needs tuning. σ-floor is the floor an underwriter prices against.

[3/6 AXES EXPAND · the binary tile fill]

axisLib.axes[i].snippets[j] → 12 canonical cells (A · B · C × Strategy/Tactics/Operations) × 3-4 meaning-bearing snippets per cell. The substrate the chip projects every doc onto. 12×12 = 144 binary tiles; each tile holds one (axis, sub-axis) signature.

subdivide(cell, depth) → each cell expands recursively into 12 sub-cells at depth-N (ShortLex BFS); self-similar at every altitude. The chip stores the depth-N expansion as a single signature per leaf — gzip/SimHash fills the leaves cheaply (one hash + one XOR per leaf, AC⁰).

[4/6 XOR BOUNDARY · the cache-line check]

xorBoundaryCheck(realityCell, visaCells) → set-membership at the demo layer; popcount(visa_mask XOR reality_bit) at the silicon layer. Δ map is Reality − Visa cell-by-cell. The drift LOCATION is the load-bearing field — we don't just measure that drift happened, we measure WHERE.

[5/6 SIGN] crypto.sign(null, body, ed25519PrivKey) — per-host keypair at ~/.thetacog/pmu/keys/host.priv.pem (mode 0600). Signature is ed25519, 64 bytes base64. The receipt body is the canonical JSON; the signature seals it against tampering.

[6/6 BRIDGE] fetch($THETACOG_RECEIPT_ENDPOINT, {method: POST, body: signed}) — cloud-bridge stub. Without endpoint: local-only mode, prints curl-equivalent. With endpoint: real POST, registry-acceptance verdict in HTTP status. Dynamic stability is here: the receipt is portable; the registry is whoever pays the toll.

No conceptual leaks. Every function in the [1/6]–[6/6] pipeline above lives in the npm package thetacog-mcp@2.7.3 (bundled at packages/thetacog-mcp/lib/pmu/) and is exactly the code the operator runs with npx thetacog-mcp attest-demo. The chip-side comparator (XOR + popcount, AC⁰) is the same operation the on-chip variant fires combinationally; the software pipeline here is the oracle the chip approximates, per patent US 19/637,714.

extended api · the deeper functions that produced this page's heatmaps + the ballistic walks + 500-snippet lattice fill

The npm attest-demo ships the load-bearing pipeline. The web-app simulator (this page) layers four additional API groups on top — these are the functions that produced the ballistic-walk heatmaps you see above, the depth-N lattice expansion that fills 144 → 1,728 → 20,736 leaves, and the automated lattice healing/inference that fills tiles from gzip-derived semantic dumps. Source: github source.

[A · BALLISTIC WALK · the 11M-ops harness]

ballisticWalkParametric(N, grid, start, opts) — one walk from start, depth-decayed weighting (0.6×/cycle), branches at every significant cell. Returns the cloud + arc list.

ballisticWalkAllParametric(N, grid, opts) — fires all 12 walks in parallel. This is the harness that recorded up to 11.2M shallow walks/sec on the full 144×144 lattice (depth-2); the complete depth-5 recompute runs at 780K/sec. The real recursive on-chip walk sustains north of 6M walks/sec — where real walks top out on a Mac — all logged in .thetacog/pmu/throughput.

superimposeArcsN(arcs) — additive overlay of all branch arcs; cells reached by multiple branches brighten convergently.

[B · CONCEPT EXPAND · the 500+ semantic-dump generator]

extractConcepts(input, opts) — rank-orders 12 concepts from arbitrary input text by quality-token density + orthogonality floor (0.12 min pairwise distance).

expandCell(input, opts) — recursively expands a single cell into 12 sub-concepts; bounded by MAX_DEPTH=4 → 12⁴ = 20,736 leaves possible. At depth 2.5 the lattice carries 500+ semantic dumps — enough to ground every plausible operator-language span.

conceptOrthogonality(concepts) — pairwise Jaccard-distance matrix; the orthogonality floor (0.12) is the quality gate that rejects degenerate cell pairs.

treeCoverage(node) — √mass-weighted coverage rollup; 95% is the operator's convergence bar.

[C · LATTICE FILL · gzip-derived tile filler]

neighborhood(cellIndex) — returns the cells geometrically adjacent on the 12×12 ShortLex grid; the inference window.

neighborhoodText(cellIndex, cellTexts) — concatenates the texts of adjacent cells; the gzip-input from which the missing cell is inferred.

inferCell(cellIndex, cellTexts, candidates) — picks the candidate snippet whose gzipNCD distance to the neighborhood text is minimal. Fills a binary tile cheaply, no model in the loop.

healLattice(cellTexts, candidates) — full-pass tile-fill across every empty cell; iterates until coverage ≥ 95%.

suggestFills(cellTexts, candidates) — ranked candidate list per cell; operator-in-the-loop variant of healLattice.

[D · RECEIPT RENDER · what the registry sees]

buildB2WalkData(...) — server-side prop builder for the inline B2 walk player above; computes frames from the structured grid + ballistic algorithm without an iframe.

superimposeReceipts(receipts) — aggregates multiple host-local receipts into a per-cell σ-distribution; the carrier's view.

Honest scope. Groups A–D ship as the web-app simulator and as standalone test harnesses (31 oracle tests, PMU canon-guard green on every commit). The npm attest-demo binary ships group [1/7]–[7/7] from §A above — the minimum for receipt production. Groups A–D are the heatmap + lattice-fill + 11M-walks harness that produced the visuals on this page; they will fold into the npm package when the cache-witness Rust binary lands (per the canonical-decisions queue, Q6).

Map output, full 144×144 lattice, Apple M-series — real recursive walks sustain north of 6M walks/sec (real walks top there on a Mac; the shallow depth-2 harness peaks at 11.2M, complete depth-5 recomputes at 780K/sec): σ-floor 3.4 (single-walk conservative) aggregating to 600σ+ (√N over the window). The map below paints the cells the walks converged on; the big dots are where multiple branches met. Same geometry, same lattice, same comparator the chip would fire — only sixty million times slower than the silicon variant the patent describes. The proof on screen IS the silicon spec, rendered.

If your portfolio touches Anduril's Lattice, Saronic's autonomy, Apex's bus, Epirus's HPM, or Vals AI's evaluation infrastructure — anywhere national capability has to be built on a frontier the regulator structurally cannot see — you already know that how well it works in the real world is the only honest evaluation. This page is that, for agents.

🔬 Skip to the audited proof — run it on your own Mac in 90 sec

We caught a +173σ false positive in our own demo before shipping; honest result is 3.4σ. The audit narrative and the seven-step replication you can run yourself live below at §F. The retract is the strongest trust signal on the page — if you only have 90 seconds, read §F first.

Walk B2 — heat cloud, parent arcs, leaves of leaves

B1.Tactics.Speed · meld Contribution × Growth

structured grid · 38 sig cells · 600 action frames · 156 processes · 159 sig-fire events · max weight 1.00

frame 0 · anchor born at B2. Cursor will start sweeping row B2 next.

Active processes

P0 on row B2 · col · · w=1.000

Reading it

orange anchor — the project (B2)

cursor — ringed cell = process's position

cloud — additive weight at visited cell

colored arc — transpose (r,j)→row j, in spawned color

Two heatmaps, one cache line — forward reach vs reverse cause

Forward reach · start B · 29 cells

Reverse cause · target B · 39 cells

overlap 29 cells

fwd-only 0

rev-only 10

grid 144 cells = 18 bytes

The two grids share an 18-byte cache line (144 cells × 1 bit ≤ one 64-byte L1 fill). Comparing them is one within-line XOR + popcount — not 144 DRAM-bound cell reads. Hardware path: 155.6 ns for the full 12×12 walk (cheaper than a single DRAM miss at 152.86 ns). Software baseline (DRAM per cell): 22.0 μs. Ratio: ~141× faster^[patent].

Numbers from the PMU spec § headline table (L1 2.03 ns ~8 cycles · DRAM 152.86 ns ~566 cycles · 12×12 walk 155.6 ns) and the reverse-walk module (each cursor step = 0.5 ns XOR cycle). The 12×12 grid as ONE cache line is §A of pmu-counter-module-shortlex-spec.

(a) Pick one project — B2. Walk row B2. For every significant cell (B2, j), spawn a subprocess that walks the transposed row j (column index becomes the next row).
(b) Each subprocess does the same recursively, ballistically, until the cycle budget is met. Weights decay 0.6× per cycle — closer to B2 = heavier. The cloud is the additive weight at every cell.
(c) Every blue arc traces a child back to its parent — the leaves of leaves are visible as the recursion tree fans out. Cells reached by multiple branches brighten additively (the convergence).

the 12-walk cumulative reach view (every project walked in parallel, BFS-collapsed across 12 arcs)

The legend — read the lane the same way the lattice reads it

C1.Operations.Grid · meld Contribution × Growth

12 axes — ShortLex order

A	Strategy	cardinal
B	Tactics	cardinal
C	Operations	cardinal
A1·A2·A3	Law · Goal · Fund	Strategy block
B1·B2·B3	Speed · Deal · Signal	Tactics block
C1·C2·C3	Grid · Loop · Flow	Operations block

4 gestalt blocks: [A B C] [A1 A2 A3] [B1 B2 B3] [C1 C2 C3]. The cyan cross on the movie marks the rank-1 / rank-2 cut — the first 3×3 block versus the remaining 9×9. A walk that spills across that cut is the §1f drift signal.

3 meld pairs — what each axis is hooked to

Connection × Significance	who recognizes you · who you become
Contribution × Growth	what you give · what you can now do
Uncertainty × Certainty	the question still open · the claim taken home

Every axis carries both halves of its meld — co-equal facts. Variables in the formula: weight = how much one ballistic hit contributes; decay = how fast proximity to the original row falls off (1/2 per ply); convergence = the additive sum at a cell that multiple branches reached. The receipts you ship are addresses on this rail, not opinions about it.

The three objections, answered in the code itself

C2.Operations.Loop · meld Uncertainty × Certainty

“Isn't this just a colorful heatmap?”

No. The dots are depth-decayed: a depth-5 touch contributes ~3% of a depth-0 touch. The big dots ARE the cells where multiple ballistic branches converged at shallow plies. The canonical convergence fixture (A→B, A→C, B→A1, C→A1, A1→B1) lights (A1,B1) at 0.25 when both branches reach A1 — versus 0.125 under any single-branch algorithm. The 2× brightening is the test, and the test is in tests/pmu-simulator/ballistic-walk.test.mjs.

“What if the lane is hand-drawn?”

It isn't. The role-continuity view below shows the axes EXTRACTED FROM THE SPEC of the role the agent is supposed to be holding. The lane is the geometry already there — Strategy / Tactics / Operations and their three rank-2 children. We don't paint the lane on top of the agent; we read the lane the agent was supposed to be on.

“How is this not just another LLM judge?”

The check is a coordinate-lookup. One read of one byte at a cache-line offset. The thing checking the agent is not another agent. The same arithmetic that runs in L1 to do the agent's actual work is what produces the receipt.

The other half of the receipt — role continuity

The ACRV (in the player panel) shows the rate and the drift count. The role-continuity receipt below shows which axis the agent was supposed to be on — extracted from the spec itself, not hand-labeled.

Role-Continuity Receipt

The spec-extracted axes — which lane the agent was supposed to be on

snapshot 2026-06-17T22-09-05 · no LLM judged this · the axes are the intent's, extracted by the SimHash rail, not hand-labeled.

drift rate

17 / 36

47% of judgeable steps

in-role

36 judged total

unplaceable

role-neutral mechanics, not counted

Per-axis verdicts (2 verified · 10 departed · 0 unverified)

axis	verdict	competent	drift	reach
demo	▲ lane departure	8	4	12
pmu	▲ lane departure	5	7	12
rail	▲ lane departure	5	7	12
drift	▲ lane departure	8	4	12
role	▲ lane departure	10	2	12
simhash	▲ lane departure	9	3	12
step	▲ lane departure	5	7	12
gate	▲ lane departure	4	8	12
monologue	● role-verified	3	9	12
receipt	▲ lane departure	5	7	12
one	▲ lane departure	9	3	12
llm	● role-verified	3	9	12

Drift steps — what landed off the assigned lane

#	coord	cell state	action
2	(no axis)	OFF_DOMAIN	Bash cat .thetacog/room-punch-lists.json 2>/dev/null \| head -100; echo "---EXIT---"; ls -la .thetacog/room-punch-lists.json 2>/dev/null
3	(no axis)	OFF_DOMAIN	Read /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh
4	(no axis)	OFF_DOMAIN	Bash jq '.rooms.performer, .rooms.navigator' data/rooms.json 2>/dev/null \| head -60; echo "---PUNCH ROOM FIELDS---"; jq '.rooms.builder \| {key, emoji, terminal, persona, coordinate}' .thetacog/room-punch-lists.json
5	(no axis)	OFF_DOMAIN	Edit /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh
6	(no axis)	OFF_DOMAIN	Edit /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh
7	(no axis)	OFF_DOMAIN	Edit /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh
8	(no axis)	OFF_DOMAIN	Edit /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh
9	(no axis)	OFF_DOMAIN	Edit /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh

Honest about known limits — receipt degradation

Coverage of the 12 axes is 21% — the axes explain less of the intent than the operator's bar (95%).

How this was detected. Each step landed on a 12×12 competence map built by SimHash compression of the role's written spec — a fixed-width signature, not an opinion. No language model judged this.

What rail this is. Semantic rail v1, software-side measurement. The production upgrade is the same check read off the processor's own cache counters — tamper-evident, on-chip.

What you can take to the board today

A3.Strategy.Fund · meld Uncertainty × Certainty

Semantic rail v1. Software-side measurement — every number on this page is computed from the trace, in JavaScript, on the client. No number is presented as a hardware reading. The convergence you see in the movie is computed from the additively-weighted visits map, not measured off silicon.

The production upgrade is the SAME check read off the processor's own cache counters: tamper-evident, on-chip, attestable. The PMU is the part that turns this from a software receipt into a hardware receipt — a Trusted Execution Environment for evals. The semantic rail exists to earn the hardware run. This is what your check would fund.

12×12 viewport. Architectural tolerance, not magic. The grid is 4×4 blocks of 3×3 cells — the gestalt-block geometry from §A of the spec. The cyan cross marks the depth-0 cut between A/B/C (rank-1 cardinals) and A1..C3 (rank-2 children). One cache line, 144 bits, 18 bytes. Reach is bounded by choice; spilling past the gestalt gap is the §1f drift signal. The boundary is the lane.

The Skybridge Proof — why the 3.4σ is real, on your own machine

A1.Strategy.Law · meld Uncertainty × Certainty

The semantic rail above is software-side. The hardware result the rail earns is named here, with the audit trail, the replication protocol, and the receipt you keep. The headline: 3.4σ separation on the gate, 3.4σ on the walk, against a time-local baseline, between two agent actions identical under the software classifier's coarse- graining. Not measured by us on our hardware and asked to be trusted — measured by you on your hardware and trusted because the math is open and the receipt has your host's UUID on it.

This is NOT search. It is O(1) reach IS verify — an immeasurably stronger claim than “fast lookup,” and the only one that prices. The check is one cache line read at one coordinate — the same fetch the agent had to do to act. No scan, no traversal, no similarity score, no second model.

The position — POSSIBLE · MEASURED · PATH

POSSIBLE (what our results point toward). Per-cell spatial attribution of agent drift at microsecond latency, fleet-deployed via a signed binary with Apple performance entitlement (MDM) or cross-architecture via Asahi Linux. The receipt names which axis of the 12×12 lattice the agent drifted on, in the time it takes to fetch one cache line. This is what the channel will deliver as the engineering scope below completes.

MEASURED today (three empirical floors, every one stacks).

Floor 1 · single-shot pointer-chase: 3.4σ on (read-10B-JSON vs read-2.7MB-JSON), comfortably above the empirically- correct 4-way Bonferroni threshold (~3.0σ). Replicable 3-of-3, negative-control 0-of-5. (Iter A–E, audit doc.)
Floor 2 · pointer-chase batch-mean: 9.2σ at N=50 on the same pair, σ/√N projection holds at 94% on gate/walk, chip exhibits mild anti-correlation (ρ₁=−0.08) in the math's favor — measured, not assumed. (bridge doc §7.)
Floor 3 · xctrace user-space PMU access: Apple's blessed CLI exposes the same hardware counters via the "CPU Counters" template — 1527 PMU samples + kperf instruction-pointer backtraces on a 10-sec workload, no SIP off, no kext, no kperf reverse engineering. Two workloads distinguishable at 18–27% counter-sum separation. (bridge doc §7.5 — Track B prototype.)

PATH (what funding closes). Engineering execution, not research risk: MDM-distributed Apple performance entitlement on a signed binary that calls the same kperf APIs xctrace already invokes, plus Asahi Linux port for cross-architecture σ-parity, plus the daemon refactor from pointer-chase inference to counter-driven readout. $640k earmark within the $1.5M raise. The substrate channel is real today; capital productionizes it.

Disambiguation: not simulated, technically empirical. Every floor in MEASURED has a file path. POSSIBLE is the extrapolation the floors point to. PATH is the engineering between them. Structure is sound at every layer.

The 12 axes are not arbitrary — each cardinal has two parents. The lattice is modeled on the Six Human Needs: three meld pairs, each binding one ShortLex cardinal at one time-horizon. Every axis ends up with two parents — co-equal facts, not one master and one prop.

A · Strategy	long-term	Connection × Significance — who recognizes you · who you become. Children: Law · Goal · Fund.
B · Tactics	medium-term	Contribution × Growth — what you give · what you can now do. Children: Speed · Deal · Signal.
C · Operations	short-term	Uncertainty × Certainty — the question still open · the claim taken home. Children: Grid · Loop · Flow.

Each parent has two parents. The cardinal sits in the ShortLex coordinate but composes co- equally from two needs — the meld pair is the parent of the cardinal, not the other way around. The receipts you ship are addresses on this rail, not opinions about it.

12×12 is the human-readable canonical. The real lattice is yours — your SOC 2 catalog, your compliance framework, your risk register, your business- unit hierarchy IS the N. Same one-cycle XOR-popcount gate, different rows. At the bridge: the map of maps aggregates across deployer-N's into the actuarial movie the carrier reads. One protocol, three altitudes: human-readable canonical (12×12) · your problem-space (your N×N) · the bridge's meta-lattice (the map of maps the carrier prices against).

For your CTO: zero verification overhead at the revenue path. The fetch the agent does anyway is the audit.

For your underwriter: priceable per-inference because there's no per-claim search to amortize. The receipt clears at the cycle.

For your AI-safety lead: nothing for the model to fool — there is no inference loop in the verifier. The auditor and the audited do not share a domain.

The audit — why the conservative 3.4σ holds, not the +173σ headline

The first PRO-S run (commit 383ddb119) reported shifts at +173σ — too good to be true on its face. The robustness audit (commit d921e1151) caught the failure mode: the baseline was three hours stale, so the σ-shift was contaminated by hours of host-noise drift. The negative control (same workload twice) also reported "SIGNIFICANT" against the stale baseline — a result that cannot be a true action- distinguishing signal, and was the audit's tell.

The corrected protocol uses a time-local baseline collected ~30 seconds before each comparison. Conservative. Replicable. Audited. Better caught now, named honestly, demo- protocol revised. Full audit doc at pmu-skybridge-robustness-2026-05-23.

On multiple comparisons — corrected. The protocol records 7 measurements per trial, but the audit at pmu-semantic-bridge §4 finds they are not 7 independent ones: gate σ-units and walk σ-units co-move to 3 decimal places across every delta in the repo because the walk IS the gate iterated 288 times (12×12×2 XOR+popcount cycles). The empirically-correct independent count is ≈4 (per-tier L1/L2/SLC/DRAM + gate-which-implies-walk + miss-penalty). Under 4-way Bonferroni correction at family-wise α=0.05, the threshold sits at ~3.0σ — the 3.4σ headline sits comfortably above that line, not marginally inside it. The prior 7-way framing was over-conservative; this paragraph supersedes it. We report 3.4σ as significant under the empirically-correct multiple-comparison correction AND operationally reproducible under behavioral testing(3-of-3 signal-pair, 0-of-5 negative-control). The N-iteration heatmap path above (PRO-M) moves the headline further inside any threshold without depending on a different instrument.

Why the σ we report is end-to-end — measured 2026-05-24

“End-to-end” means the σ chains through the whole physical pipeline in one event: anchor pin → ballistic walk → visit aggregation → cross-receipt witness. Every step is timed at wall-clock ns inside the same hot loop. No slice of the path escapes measurement, so the σ-shift between two actions is the σ of the WHOLE chain, not just one component.

Throughput PROVED on this Mac: PRO-G runner saturates 10 M5 cores at 52.76M walks/sec at W=12 D=2 (target 45M, +17%) and 12.63M walks/sec at W=144 D=2 with the f32 variant (target 450K, ×28) — both shallow D=2 harness rates; the real recursive on-chip walk sustains north of 6M walks/sec, where real walks top out on a Mac. The σ is computed across millions of per-walk ns timings inside one second. Sample size is not a limitation; the laptop produces statistics faster than the operator can read them. §2 B7 in the master spec pulls from the receipts at .thetacog/pmu/throughput/w{12,144}-d{2,5}-*.json. The live W-sweep across W ∈ {12, 24, 48, 96, 144} × D ∈ {2, 5} × {f64, f32} is the throughput sweep report — reproducible via node scripts/pmu/pmu-throughput-sweep.mjs.

L1-resident at W=144. The f32 visits buffer is 82 KiB < 128 KiB L1D (vs f64 165 KiB which exceeds). Per-walk ns is not masked by L2/L3 round-trips — the time we measure IS the walk, not the walk plus an unobserved cache-line refill. The largest channel that could pollute σ at the slice layer is closed by geometry, not by accounting.

PRO-H aggregates across receipts. The cloud bridge (bridge-receive.mjs --inputs-dir) calls superimposeReceipts() to sum visits by (pid, cell) across N processes. σ at the cross-receipt layer is now its own measurement, not an extrapolation. Position IS meaning: two receipts landing on the same cell are pointing at the same coordinate by physical address, not by similarity score.

PRO-J gates the baseline. Before any σ-chain off this Mac, the demo-insurance gate (node scripts/pmu/pmu-baseline-refresh.mjs) runs a 10-sample sweep and exits 1 if load-bearing CV (L1, gate, walk) exceeds 1%. The +173σ stale-baseline trap cannot survive the gate. First test on a thermally-hot laptop FAILED at gate CV 6.98% — proof of purpose. A demo that chains off this Mac runs only after the gate passes; the σ you read sits on a fresh, time-local floor.

Why this isn't a one-off — and what it means for you

How you know you can relate. You've evaluated AI vendors who all said the same thing — “ours is safer than the other one's” — and watched your procurement team pick the wrong one because the difference between “safer” claims is not measurable. The σ-shift below is the measurement that wasn't available the last time you bought.

What's true about this. The 3.4σ result is 3/3 reproducible across independent runs, with 0/5 negative controls flagged (same workload twice produces |Δσ| < 1 — the methodology can tell when there's no shift to find). Stability across N=20 baseline runs holds at ~5% CV on gate/walk timings, ~10% CV on DRAM latency, with one heavy-tail tier (SLC) where occasional thermal and scheduler spikes dominate the variance — named honestly because it shapes the multiple-comparison math below. The underlying mechanism — microarchitectural side-channels distinguishing programs the OS treats as identical — has independent peer- reviewed prior art ( Demme et al., 2013, Columbia/ACM ISCA).

What it actually is. Not a model. Not another classifier. A measurement: the L1 cache fingerprint of one agent action, scored in σ-units against a baseline collected ~30 seconds earlier on the same host. The audit narrative above (+173σ headline → 3.4σ conservative result) IS the evidence the methodology survives technical inspection — the failure mode was caught by the same protocol that produces the signal. A measurement that can't falsify itself isn't a measurement; ours does, and did.

How it impacts your business. Three things change for you the moment you adopt the AIR receipt. (1) Your actuary gets a rate-class input — drift becomes priceable in basis points per attested-exposure, not flagged as “unmodelable.” (2) Your CFO gets a per- inference unit cost that fits inside the observability budget you already approved (no new line item — sub-line on an existing one). (3) Your CISO gets a verification layer with no instruction surface — nothing to harden, nothing to patch, no zero-day to chase. The “uninsurable autonomous agent” exclusion in your current cyber policy (ISO CG 40 47) has a mechanical mitigation for the first time.

The σ spectrum on Apple M5 — time-local baseline · sourced from .thetacog/pmu/deltas/

Negative control	same workload twice	\|Δσ\| < 1 ✓ correct, no shift
Materially-different footprint (the floor)	read 10B vs 2.7MB JSON · edit small vs large .md	3.4σ gate · 3.4σ walk
Commit-to-commit drift (same workload, different SHAs)	50e497a3 → 304ba36e → d921e115 → 87e9ac01 → 59f7ce4c	7.4σ · 13.7σ · 30.8σ
Designed-different action pair · FPR-1	fpr-1-A vs fpr-1-B (four counters)	6.4σ · 10.0σ · 13.0σ · 23.2σ
Designed-different action pair · FPR-2	fpr-2-A vs fpr-2-B (three counters)	8.2σ · 16.1σ · 26.4σ
Subtle-workload pair (sub-3σ — M-D research path)	write line vs write paragraph	0.7σ
Stale-baseline artifact (caught in audit)	same pair, baseline 3h old → phantom shift	172–173σ — documented at robustness audit

3.4σ is the FLOOR, not the result. The chip separates materially-different actions at 3.4σ (the worst-discriminating pair we ran); designed-different action pairs at 6–26σ and commit-to-commit drift on the same workload at 7–31σ. The separation band is what the underwriter prices — bigger σ-shift = bigger drift = bigger basis-point premium. Negative controls pass; the stale-baseline artifact (172σ) is documented honestly. The receipts your allocator produces on their own machine land somewhere on this spectrum.

On the sign. The 2.7 MB read (B) produces FASTER subsequent daemon timing than the 10-byte read (A) — counter-intuitive but correct, and a question we want the technical inspector to ask. The large read warms upper cache tiers, so the daemon's subsequent pointer-chase over its DRAM-tier array finds more of that array already cache-resident. Signal direction is consistent across trials; what the cache attests is the workload's footprint, not its apparent “size.”

The chip distinguishes workload-footprint-different actions at 3.4σ. It does not yet distinguish semantically-different actions with similar physical footprints — the daemon's own pointer-chase resets cache state faster than subtle differences propagate. The finer-grained distinction is the M-D research path: direct PMU counter access via RDPMC / perf_event / Apple AMX-aware counters. The $640k earmark in the $1.5M raise funds this work. Patent 19/637,714 covers the cross-architecture method.

What funding actually buys — more iterations on the same instrument, not different counters

The pointer-chase daemon has a known limitation: its own cache traffic resets state faster than subtle workload differences propagate. The first read of that is “the instrument is too coarse for fine distinctions.” The honest second read is the limitation is single-shot resolution, not channel existence — and single-shot resolution is what statistical aggregation across N iterations is for.

The math. σ of the mean of N samples = σ / √N. A 1.1σ subtle-workload signal that's invisible single-shot becomes ~11σ at N=100, ~35σ at N=1000 — well above any multiple-comparison threshold, on the same instrument, with the same protocol. The 3.4σ result is the existence proof for the substrate channel; the higher-resolution N-aggregate heatmap is the resolution upgrade, and it requires nothing the daemon doesn't already do.

Why this beats the Linux/RDPMC framing we initially proposed. We investigated direct PMU counter access via perf_event_open / RDPMC on Linux and found it doesn't deliver the unlock the first framing implied — the same attached-measurement state-reset problem reappears in a different form, and per-process counter access is kernel-config-dependent in ways that make a portable claim hard. The honest funded path isn't “wait for a kernel ABI to give us real counters”; it's scale iterations on the instrument that already proves the channel exists. Cross-architecture σ-parity remains a separate Q4 milestone (it's about replicating the existing protocol on Intel/Linux silicon, not about a different measurement primitive).

Two products fall out of this reframe. PRO-S (today, ~10 sec/attestation): single-shot 3.4σ for material-footprint actions — fast verification at the moment of inference, the receipt the allocator runs on their own Mac in the seven steps below. PRO-M (funded, offline batch): N=1000-iteration heatmaps for actuarial calibration — rate-class inputs for the underwriter, run once per agent-deployment-class, measuring an already-proven physical channel at higher fidelity. The funded ask is the second product. The first product already ships.

The seven steps that turn the allocator from skeptic into witness, in 90 seconds on their own Mac

Clone the public repo. git clone https://github.com/wiber/thetadrivencoach.git — 5 sec. No download from us; the protocol is in the open.
Compile the daemon themselves. cd .thetacog/pmu && cargo build --release — 25 sec on M-class. ~600 lines of Rust; their CTO can review while you talk.
Run time-local baseline on their Mac. ./pmu-stability-run --baseline — 30 sec. Their own L1D cache, their own DRAM latency, their timestamp, their hostname. Writes to ~/.thetacog/pmu/baseline.json.
Run the negative control. Same workload twice produces |Δσ| < 1. This is the audit's tell — if the protocol can't tell that same-action-twice is no shift, the result is contaminated. They see the negative control pass on their own machine.
Run the discriminating-action pair. ./pmu-trace small.json && ./pmu-trace large.json — their Mac's L1D fingerprints diverge. The tool prints inline: gate Δσ = 3.4, walk Δσ = 3.4. They see the number on their own terminal, computed from their own counters.
They keep the signed JSON receipt. The tool writes ~/.thetacog/pmu/receipt-<timestamp>.json on their machine — signed with the host UUID, the timestamp, the baseline reference, the σ-shift, and the daemon's commit SHA. They take this home. Their compliance team replays it independently next week.
The protocol's source IS the proof. The 600-line Rust source compiles to the binary they just ran. They — or their staff engineer — read the entire pipeline in 30 minutes: how baseline is collected, how σ is computed, how the gate fires. The math is open; the cache is theirs; the receipt has their hostname on it.

Platform honesty — what if the allocator's machine isn't M-class?

The 3.4σ figure is verbatim on Apple Silicon (M1+). On other platforms the protocol still runs; the absolute numbers differ because the cache-tier latency ratios do.

Apple Silicon (M1+): reproduces the demo exactly. The cited 3.4σ holds on their hardware.
Intel Mac (pre-2020) / x86_64 Linux: the daemon's rdtsc-based cache-tier discriminator runs; the σ-shift is in the same shape, the absolute number differs. Honest framing:"Same protocol; your hardware will produce its own σ."The named milestone, per the year-plan: M-D research begins Q3 2026 (Aug–Oct), cross-architecture σ-parity closes Q4 2026 (Nov–Jan). Scoped at ~$640k within the $1.5M raise — direct PMU counter access via RDPMC, perf_event, and Apple AMX-aware counters. The milestone dates sit in the year-plan §6; the burn-line allocation in the pricing doc §3. If your fleet is mixed-architecture, the trip pitch includes that timeline.
aarch64 Linux (cloud servers): ARMv8 perf counters; the GTM doc names this as the cloud-orchestration target. Same shape, ARM-native numbers.

What the patent (app 19/637,714) covers — and what it leads to

Filed April 2, 2026 · Track One · 36 claims · 7 independent · 290 pages · 22 figures · 7 provisionals in priority chain, prosecuted through Brian Trotter @ Bishop Rock LLC. The v21 Continuation-in-Part window opens Jun 2, 2026 — the PRO-S audit result above is fileable evidence into the CIP.

What it covers: the architectural method — the seven-step receipt-producing protocol above + the XOR + popcount gate at AC⁰ + the 12×12 lattice as one cache line + the time-local-baseline σ-shift discriminator. The patent covers how the receipt is produced; the Rust source is one implementation among many that honor the method.

What it does NOT prevent: open-sourcing the daemon. The seven-step demo above requires the allocator can compile and verify the binary on their own machine. The method-claim and the open- source binary coexist by design.

What it leads to: a two-shape market — ARM-style royalty on every Turing-complete substrate that adopts the verification standard, Visa-style network toll on the AIR receipts that clear between attested counterparties. The standard is the moat, not the implementation. First underwriter to denominate against the receipt locks rate-setting; first agent-vendor (Cognition / OpenHands / Devin-class) to ship the AIR Adapter sidecar locks integration. Section I below names the financial corollary in dollar terms.

Drive the gate yourself — paste a step, see the verdict

C2.Operations.Loop · meld Uncertainty × Certainty

The role-continuity receipt above is the static read of one trace window. This is the live one: type a step the agent might take and watch the membership gate decide where it lands. The endpoint — POST /api/pmu/trace — is the same coordinate-lookup the receipt is built from, exposed as a route you can call from your own code.

Real engine, real verdict. The score comes from trace-overlay.mjs running the step text against the 12 axis keyword sets the lattice was built from — no LLM in the loop, no fixture playback. The same address you would query from your own code.

What the gate is doing — speed floor + phase ledger

Speed floor

Probe returns in ~155.6 ns for the full 12×12 walk — cheaper than one DRAM miss (152.86 ns). Per cell: 0.5 ns XOR cycle vs ~152 ns/cell software = ~141× faster^[patent].

Phase thresholds

kE = 0.003 — drift threshold, invariant across 10⁶–10¹⁰ clock-speed variation (§0a). N_sat = 5 — recursion depth at which the ShortLex tree (~248,832 nodes ≈ 15.5 MB) hits the L3 cache edge (§0k). Throughput cliff: depth-5 = 67 passes/sec → depth-6 = 2.2 passes/sec (~30×, the physical geometry ending).

Coherence Lock

Past N_sat, the Lyapunov V(t) = 1 − R_c(t) → 0. The substrate's coherence proof IS the audit (§0k). Insurer reads the log; the chain of R_c-near-1 + zero-lane-departure IS the underwriting basis. Trust Debt half-life: R_c(n) = 0.997ⁿ — 231 crossings ≈ 50% loss (§7a).

Three more routes complete the cloud-bridge mock — /api/pmu/intents, /api/pmu/lattice/:intentId, /api/pmu/cell/:intentId/:i/:j. Full surface in the GTM §19 technical spec. Numbers above pulled from pmu-counter-module-shortlex-spec §0a (kE), §0k (N_sat, Coherence Lock), §7a (Trust Debt half-life), §A (144 cells = 18 bytes = one cache line).

The artifacts, on the page — the actual data structures

Not a marketing render — the four data structures the run produced, read here at render time from the snapshot the post-commit hook wrote. Same source the /api/pmu/* endpoints serve.

12 axes · extracted from the spec by SimHash farthest-first

demo
pmu
rail
drift
role
simhash
step
gate
monologue
receipt
one
llm

Per-axis verdict · 2 verified · 10 departed · 0 unverified

axis	verdict	competent	drift
demo	lane departure	8	4
pmu	lane departure	5	7
rail	lane departure	5	7
drift	lane departure	8	4
role	lane departure	10	2
simhash	lane departure	9	3
step	lane departure	5	7
gate	lane departure	4	8
monologue	role verified	3	9
receipt	lane departure	5	7
one	lane departure	9	3
llm	role verified	3	9

The 12×12 competence lattice · 74 competent · 70 drift · 0 empty

green = intent and reality coincide on this pairing · red = they diverge · slate = the pair was not addressed. Hover a cell for the axis pair.

The trace overlay · 39 steps · 19 in role · 17 drift · 3 unplaceable · rate 47.2%

step	coord	state	action
#2	(no axis)	OFF_DOMAIN	Bash cat .thetacog/room-punch-lists.json 2>/dev/null \| head -10…
#3	(no axis)	OFF_DOMAIN	Read /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh
#4	(no axis)	OFF_DOMAIN	Bash jq '.rooms.performer, .rooms.navigator' data/rooms.json 2>…
#5	(no axis)	OFF_DOMAIN	Edit /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh
#6	(no axis)	OFF_DOMAIN	Edit /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh
#7	(no axis)	OFF_DOMAIN	Edit /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh
#8	(no axis)	OFF_DOMAIN	Edit /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh
#9	(no axis)	OFF_DOMAIN	Edit /Users/thetacoach/GitHub/thetadrivencoach/scripts/next.sh

A note on the rail’s limits

The twelve extracted concepts explain 21% of the spec, below the 95% bar the rail targets. The reading is honest about what it covers.

Past Coherence Lock — the AIR, the actuarial unit, the floor that is now insurable

A3.Strategy.Fund · meld Connection × Significance

The substrate above does one thing the software stack can't: it terminates Rice's-theorem regress at the silicon. That termination has a financial corollary. Past Coherence Lock the marginal insurance cost approaches zero — the probability of lane-departure is structurally negated by physical resonance of the substrate, not statistically reduced by training (§0k). Priced as a fixed-point attractor reached, not a stochastic risk mitigated.

The crisis the standard names

Rice's Theorem — software cannot reliably audit software; verifier and AI share the same failure domain (§3b).
0% SAC (Substrate Attestation Coverage) — current frontier vendors provide zero hardware-grounded proof of role continuity.
ISO CG 40 47 — the actuarial exclusion. Standard insurance now excludes AI because semantic risk is unquantified.
EU AI Act Article 14 — human oversight is statutory; fines reach 7% of global revenue (Aug 2026). The Act requires "independent verification" software cannot deliver.

The standard the substrate enables

AIR — Attested Insurability Receipt. A hardware-grounded proof that an agent stayed within its assigned role. Insurer reads the log, not the model.
Trust Debt in basis points — cache-miss signatures convert into actuarial units. The PMU is the AI-era actuarial instrument (§7a half-life: R_c(n) = 0.997ⁿ).
Arm-shape royalty — a license on every Turing-complete substrate requiring insurability. Silicon IP, paid once per chip family.
Visa-shape network — a per- transaction toll on trusted agentic commerce. The verification receipt is the rail that lets the network clear.

The bridge is not metaphor: it is the same physical event read at two layers. The XOR comparator firing Line_Intent XOR Line_Reality on every L1 fill (§3c) is BOTH the substrate's drift signal AND the insurer's pricing input. One picosecond event, two consumers — the chip and the carrier. That is the standard. The 12×12 grid as ONE 18-byte cache line (§A) is what makes the comparison free enough to price by the inference.

What this gets the holder of the AIR: the underwriting basis is no longer "we audited the model" (which is undecidable per Rice). It is "the deployment reached Coherence Lock at issuance." The audit and the substrate are the same artifact. Without the AIR, AI exposure stays uninsurable; with it, the floor that was missing is the same floor a Visa transaction rides on — denominated in basis points, priced per-inference, settled on a hardware event the carrier can read.

§0k Financial corollary (Lane Pricing §10i) · §3b Article 14 chain · §3c Bridge Transversions (XOR on every L1 fill) · §A 144 cells = 18 bytes = one cache line. Visuals: /notebooklm/infographics/insurable-ai-silicon-standard.png and /notebooklm/infographics/from-liability-to-liquidity-silicon-standard.png in the gallery.

Reach IS verify, in market form — the competence marketplace

B2.Tactics.Deal · meld Significance × Contribution

The same JSON that prices AI-containment liability for an underwriter also matches a human into a verified role for an employer. Below is how — and why this market exists the moment the receipt does, not a quarter later.

Job · Visa⊕Seeker · Reality=Match · Δ-Map

The match fires at the same speed as the boundary check. A job description compresses (gzipNCD + simhashCosine) onto authorized cells (the Job Visa). A seeker carries an ed25519-signed stayed-in-lane receipt from their last role — their Seeker Reality cell. Match = the seeker's Reality cell ∈ the job's Visa bitmap. One XOR. One cache line. ~100 ps on silicon, ~30 ms in software, same operation either way.

Job · Visa bitmap

role: senior compliance officer

authorized cells:

A1·LawA2·GoalB3·Signal

Seeker · Reality

last 12mo cell hit:

A1·Law σ=4.7 · 47 receipts

stayed-in-lane: 47/47

ed25519 sig: ✓

Match · Δ-map

A1·Law ∈ {A1, A2, B3}

verdict: MATCH

Δ violations: 0

cost: 1 XOR · 1 cache line

The stayed-in-lane receipt is the portable artifact. A seeker accumulates one receipt per task they completed in a verified cell. The receipt body is public (cell, σ, verdict, signature); the underlying work content is private. The market reads the cells; the work content stays with the operator who did it.

Dynamic stability beats static alignment. Because the Δ map names where the drift was — not just that drift happened — a seeker who's out-of-cell for a target role sees exactly which axes to grow into. The marketplace stops being a static job board and starts being a coordinate-space trajectory the operator can navigate.

Same operation, two markets. A carrier underwriting an autonomous agent against its role-Visa runs the identical XOR-popcount the employer runs against a human seeker's Reality cell — the substrate doesn't distinguish the operator class. Rice (1953) forbids software-only AI safety; the same theorem forbids software-only competence verification; the same substrate receipt is the answer to both.

🤝 Live marketplace v0 — host-local matches 📜 Schema — the receipt the market trades 🔬 Dogfood report — 5 real runs, every pipe visible ✓ Verify a receipt — in your browser, no server

If you would carry one stayed-in-lane receipt across roles, employers, and counterparties for the next decade of your work — reply with the role you would sign for first. The marketplace is the artifact that lets you do this; you reading this sentence is the marketplace beginning to exist for you specifically. elias@thetadriven.com.

The conversation that turns this software run into hardware

A2.Strategy.Goal · meld Connection × Significance

If your agents run unattended — anywhere a missed lane voids a Type-1 certification, exposes a balance sheet, breaks a procurement clause, or sets back a fielded capability by a quarter — you have a national-capability gap nobody else is measuring geometrically yet. The check you would write turns this software receipt into a hardware one. That's the conversation.

Talk — elias@thetadriven.com

One reply, one human, one founder. Subject line is pre-filled so the thread starts where you stopped watching.

If the email feels heavy, the branches:

pick a room — newsletter slot by archetype, the slow channel
the live simulator — same geometry, drive the engine yourself
the writing — derivation of the lane, in prose
the underwriter's guide — σ, the Δ-map, and the verdict — what an insurer reads: the σ bands (3.4 = the gold/insurable floor), the Δ-map as the actuarial violation distribution, the two-witness hallucination flag, and the ed25519-signed, owned receipt
the one-sheet leave-behind — reach IS verify, in one page, fully proof-cited
the map-of-maps existence proof — same canonical 12-axis backbone, two real problem spaces (EU AI Act + US AI EO), 82% mass overlap, Goldilocks visa flips at tolerance ≈ 0.20
the visa-of-competence spec — interlock protocol: hard cells (exact lock) + soft cells (mass-weighted overlap) + per-issuer tolerance_radius

Your eval stack is opinion-on-opinion until the check is a physical event.