How 9 AI Rooms Coordinate Overnight: A Morning Debrief
Published on: February 17, 2026
Monday morning. The coffee is brewing. The laptop opens.
The morning review is already waiting.
490 agents spawned. 204 spec items done. 39 remaining. Zero dollars spent on cloud API calls. The cortex loop ran all night, coordinating 9 cognitive rooms, each with its own personality, its own focus, its own schedule. No human was watching.
This is not a demo. This is what actually happened between February 16 and February 17, 2026 — committed to git, captured in real logs, written by a local Ollama model running on a laptop.
The question people ask: how does a system with 9 rooms coordinate without a central conductor? How does the builder know what the operator is doing? How does the voice room know when the vault has signed off on a cost?
The answer is simpler than you expect — and stranger. The rooms talk through HTML files.
The architecture insight: room HTML files are both documentation and runtime state. A cron writes live data into them every heartbeat. Ollama reads personality from them at init. The schedule decides which rooms activate. No database. No message queue. Just files.
The overnight session shipped four major pieces of architecture, produced 8 commits across two repositories, and left a morning review with a full priority queue — all without a human directing any of it. The rest of this post walks through exactly how the rooms are structured and why that structure makes overnight autonomy possible.
The system has 9 rooms. Each is a cognitive role, not a technical service.
The builder ships code. Overnight it merged room-state writer logic and deployed the thematic scheduler. Its focus is implementation velocity — it does not opine, it executes.
The architect designs systems. It was idle overnight (weekends are for rest, even for AI architects), but its Monday focus is weekly planning and swarm topology. It wakes up when strategic decisions are needed.
The operator runs the heartbeat. It was active through the weekend on a low-power rest cycle, maintaining connectivity without burning resources. Monday morning it scans the CRM for outreach.
The vault holds sovereignty. It monitors cost, checks patents, and signs off on expenditures. It ran a cost check overnight and confirmed zero API spend — Ollama handled all inference locally.
The voice generates content. It was idle overnight but has a Monday assignment: draft the blog post outline and tweet thread from what the other rooms shipped. This post exists because the voice room had that item on its queue.
The laboratory tests. It was idle — there was nothing staged for testing until the builder shipped. Its Monday focus is running the thematic scheduler for a full 15-minute cycle.
The performer prepares demos. It will update the demo script with the overnight progress before the week's first calls.
The navigator clears blocked tasks. It handles triage — when a handoff stalls, the navigator identifies the blockage and reroutes.
The network manages relationships. Monday means advisor updates, drafted and ready to send.
The key constraint: no room talks directly to another room. They talk through state. The builder writes to its room HTML. The operator reads cross-room state via thematic-schedule.json. Every agent prompt includes what the other active rooms are doing — not by calling them, but by reading their published state. This is how 9 rooms coordinate without deadlock.
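To make the state-based coordination concrete, here is a minimal sketch of how a room's prompt could be assembled from published state rather than from room-to-room calls. The JSON shape — an `active_rooms` list with `name` and `current_focus` fields — is an illustrative assumption; the post does not show the actual thematic-schedule.json schema.

```python
import json
from pathlib import Path

def cross_room_context(schedule: dict) -> str:
    """Summarize what every active room is doing, read from published state."""
    # Assumed schema: {"active_rooms": [{"name": ..., "current_focus": ...}]}
    lines = [f"- {room['name']}: {room['current_focus']}"
             for room in schedule.get("active_rooms", [])]
    return "Other active rooms:\n" + "\n".join(lines)

def build_prompt(task: str, schedule_path: Path) -> str:
    # A room never calls another room; it only reads the shared state file.
    schedule = json.loads(schedule_path.read_text())
    return f"{cross_room_context(schedule)}\n\nTask: {task}"
```

Because every prompt is built from a file read, two rooms can never deadlock waiting on each other — the worst case is reading state that is one heartbeat stale.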
What makes overnight coordination possible is the thematic scheduler — the piece of architecture the builder shipped this session. It manages which rooms are active at which hours, writes fresh state into every room's HTML on each heartbeat, and ensures that when a room wakes up, it already knows the context it needs. The architect designed the schedule. The builder implemented it. The operator heartbeat keeps it alive.
The Ollama model — llama3.2:1b, running locally on port 11434 — acts as the pacing layer. It does not plan. It classifies. When the cortex loop surfaces a TODO item, Ollama reads the room personality from the HTML coordinate-lock and assigns the item to the right room. Classification is cheap. Planning is expensive. The system uses the cheap model for routing and the expensive models — Sonnet, Opus — only for execution within the room.
That cost discipline is why 490 agents ran overnight for zero dollars.
The cortex does not sleep. Every 15 minutes, the ThematicScheduler fires a heartbeat tick. Not to ask what to do — it already knows. The schedule defines 12 time slots across the 24-hour cycle, each mapped to which rooms should be active, at what priority, for how long. The heartbeat reads that schedule, activates the right rooms, and writes fresh state into every room's HTML file before any agent ever sees a prompt.
The flow is not complicated once you see it. Heartbeat tick arrives on the 15-minute boundary. Schedule check reads thematic-schedule.json and finds the current slot — say, 3am Sunday, which maps to the rest cycle: operator on low-power, vault on passive monitoring, builder idle. Room activation sets the active room set in shared state. Ollama classifies any surfaced TODO items against the active room personalities by reading the coordinate-lock embedded in each room's HTML. Dispatch to terminal sends the classified item to the right room's agent loop. The room picks it up, executes, and writes its output back into its own HTML.
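The slot lookup at the heart of that flow can be sketched in a few lines. The field names (`slots`, `start_hour`, `end_hour`, `active_rooms`) are assumptions for illustration — the post describes 12 slots across the 24-hour cycle but not their exact encoding.

```python
def current_slot(schedule: dict, hour: int) -> dict:
    """Return the slot covering this hour of the 24-hour cycle."""
    for slot in schedule["slots"]:
        if slot["start_hour"] <= hour < slot["end_hour"]:
            return slot
    raise ValueError(f"no slot covers hour {hour}")

def heartbeat_tick(schedule: dict, hour: int) -> list:
    """One tick: find the current slot and return the rooms to activate."""
    slot = current_slot(schedule, hour)
    # The real loop would also classify TODOs and write room HTML here.
    return slot["active_rooms"]
```

The point of this shape is that the tick never asks a model what to do — the schedule is a pure lookup, so activation is deterministic and free.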
Then the room-state writer cron runs. Every 10 minutes — on a slightly offset timer from the heartbeat — it reads all 9 room HTML files and aggregates their current state into a cross-room digest. Any room waking up in the next heartbeat cycle gets that digest injected into its context window. This is how the builder knows the vault signed off on cost without ever calling the vault directly.
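The aggregation step could look something like the sketch below. The `<!-- room-state: ... -->` comment convention is an assumption made for illustration; the real system embeds state in each room's HTML in its own format.

```python
import re
from pathlib import Path

# Hypothetical convention: each room HTML carries one state comment.
STATE_RE = re.compile(r"<!-- room-state: (.*?) -->")

def build_digest(room_dir: Path) -> str:
    """Read every room HTML file and fold its state into one digest."""
    lines = []
    for html in sorted(room_dir.glob("*.html")):
        match = STATE_RE.search(html.read_text())
        if match:
            lines.append(f"{html.stem}: {match.group(1)}")
    return "\n".join(lines)
```

A digest like this is what gets injected into a waking room's context window — which is how one room "knows" another's status without any direct call.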
The 3-tier cost routing is the architectural discipline that kept the overnight bill at zero. Ollama llama3.2:1b at zero marginal cost handles all classification and routing decisions. Sonnet at $3 per million tokens handles mid-complexity in-room execution. Opus at $15 per million tokens handles only the highest-stakes reasoning — system design, ambiguous decisions, final review. The cortex applies this routing automatically based on task complexity markers in the room personality.
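The routing table itself is tiny. The prices below are the ones quoted in the post; the complexity markers ("routing", "execution", "design") are illustrative assumptions standing in for whatever markers the room personality actually declares.

```python
# Tier table: (model, $ per million tokens). Zero-cost local model
# handles routing; paid models are reserved for in-room execution.
TIERS = {
    "routing":   ("llama3.2:1b", 0.0),
    "execution": ("sonnet", 3.0),
    "design":    ("opus", 15.0),
}

def pick_model(complexity: str) -> str:
    """Route a task to the cheapest tier able to handle it."""
    model, _cost_per_mtok = TIERS[complexity]
    return model
```

Since every one of the overnight decisions was a "routing" call, the whole night resolved to the zero-cost tier — which is the arithmetic behind the zero-dollar bill.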
What keeps classification honest is that Ollama reads the coordinate-lock — a personality definition embedded directly in the room HTML. The coordinate-lock is not a soft suggestion. It is a hard constraint that defines exactly what a room can produce and what it cannot. When Ollama assigns a task, it is checking whether the task output type matches the room's declared production capability. Classification with no drift. The model cannot hallucinate a room into doing something outside its coordinate-lock because the lock is structural, not instructional.
This is the heartbeat loop in operation: tick, check, activate, classify, dispatch, write, repeat. Overnight it fired 48 times. 490 agents were spawned across those ticks. The system never needed a human to tell it what time it was.
The hardest thing to design in an autonomous system is not intelligence — it is trust. When no human is watching, how do you know the system is doing what it is supposed to do and not something adjacent, adjacent-adjacent, or completely orthogonal?
Most attempts at this reach for policy. You write rules. You log violations. You review logs in the morning. This is policy-based trust, and it has a fundamental problem: policies describe what you want, but they cannot enforce what the system produces. A room with instructions to "only write code" can still generate a marketing plan if its model decides that is close enough.
The architecture solves this differently. Coordinate-lock means a room's personality is not a prompt instruction — it is a structural definition of the room's output type, embedded in the HTML file that governs every prompt the room receives. The room's Ollama classification reads that lock before routing any task. If the task output type does not match, it does not get routed there. The constraint lives in the coordinate, not the conversation.
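The difference between a prompt instruction and a structural constraint is easiest to see in code. This is a minimal sketch, assuming each lock declares a set of permitted output types — the lock contents here are invented for illustration; the real locks are parsed from each room's HTML.

```python
# Hypothetical coordinate-lock table: room -> permitted output types.
ROOM_LOCKS = {
    "builder": {"commit", "test"},
    "vault": {"cost_assessment"},
    "voice": {"content_artifact"},
}

def route(room: str, output_type: str) -> str:
    """Routing fails structurally when the output type is outside the lock."""
    if output_type not in ROOM_LOCKS.get(room, set()):
        raise ValueError(f"{room} cannot produce {output_type}")
    return room
```

A model can ignore an instruction; it cannot ignore a code path that raises before the task ever reaches the room.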
IntentGuard operationalizes this as a concept called Trust Debt. Every time a room produces output that drifts from its declared intent — measured by comparing the task's original intent vector against the actual output — Trust Debt accumulates. High Trust Debt triggers an escalation: Sonnet reviews the drift, flags it, and the vault gets a cost-implication analysis. Trust Debt is not punishment; it is a measurement that makes invisible drift visible before it compounds.
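A sketch of how such a ledger might accumulate: drift here is measured as one minus cosine similarity between the intent and output vectors, and the escalation threshold is an arbitrary illustrative value — the post does not specify IntentGuard's actual metric or cutoff.

```python
import math

def drift(intent, output):
    """1 - cosine similarity between declared intent and actual output."""
    dot = sum(a * b for a, b in zip(intent, output))
    norm = math.hypot(*intent) * math.hypot(*output)
    return 1.0 - dot / norm

class TrustDebtLedger:
    def __init__(self, threshold: float = 0.5):
        self.debt = 0.0
        self.threshold = threshold

    def record(self, intent, output) -> bool:
        """Accumulate drift; True means escalate for review."""
        self.debt += drift(intent, output)
        return self.debt > self.threshold
```

The key property is that debt compounds across outputs: one slightly-off result does not escalate, but a pattern of drift does, before it becomes invisible normality.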
Escape gravity is the complementary constraint. Each room can only produce what the schedule and coordinate-lock define as that room's output category. The builder produces commits. The vault produces cost assessments. The voice produces content artifacts. When a room tries to produce something outside its gravity well — say, the builder starts making CRM recommendations — the escape gravity check fires and the output is reclassified or suppressed. This is architecturally enforced, not policy-enforced. The code cannot route mismatched outputs.
The vault plays a specific role in this trust architecture. It does not participate in production — it observes. Every heartbeat, the vault gets a cost summary from Ollama. Every commit, the vault logs the resource consumption. If any single agent action exceeds a threshold, the vault can freeze room activation until a human reviews. Overnight, no freeze was needed. Cost was zero. The vault signed off 48 times in a row.
Policy-based trust tells the system what it should do. Architecturally-enforced trust means the system can only do what it is built to do. That distinction is why 490 agents ran unsupervised and produced exactly what the morning review expected.
Eight commits. Two repositories. Four major architecture pieces. Here is exactly what the system produced overnight.
ThematicScheduler wiring was the centerpiece. The builder implemented the full 12-slot time schedule, wired it to the heartbeat loop, and connected room activation to the scheduler output. This is the piece that makes overnight autonomy structural rather than accidental — the system now knows which rooms to run at 2am versus 10am without any human configuration at runtime.
Bridge personality injection was the second major piece. The claude-flow bridge now reads coordinate-lock data from room HTML files at initialization and injects the personality definition into every agent prompt that passes through the bridge. Previously, personalities were defined in agent spawn commands and could drift across a session. Now they are locked at the bridge level and refreshed on every heartbeat.
End-to-end routing tests gave the system a regression floor. 32 tests passing, covering the full path from heartbeat tick through Ollama classification to room dispatch. The builder wrote these in parallel with the wiring — test-driven in the sense that the tests defined the expected routing behavior before the implementation was complete.
Capability battle cards are the fourth piece. The voice room drafted structured markdown cards for each room's declared capabilities, formatted for the CRM's card system. These are not documentation in the traditional sense — they are queryable artifacts that other rooms can read to understand what their neighbors can produce.
The voice memo pipeline was validated overnight as a bonus. Seven corpus entries were processed through the transcription-to-context pipeline, confirming that voice memos recorded during the day can be classified and routed to rooms by the following morning's first heartbeat. The loop from human voice to autonomous action now has a confirmed execution path.
This blog post is also a shipped artifact. The overnight digest auto-generated a morning summary that included the key data points — 490 agents, 204 spec items, zero cost — and the voice room used that digest as source material for this post's outline. The writing happened in the morning session, but the data collection and structuring happened autonomously overnight.
204 spec items completed. That number is worth sitting with. Each item represents a discrete unit of work — a test, a function, a configuration value, a documented decision. 39 remain. The system knows which 39 and has already assigned them to rooms for the Monday morning cycle.
The system already knows what Monday is. The priority queue was written overnight. It does not need a standup to figure out where to start.
Trust Debt pipeline first run is the top priority. The pipeline exists — 8 steps, from intent capture through drift measurement through vault sign-off. It has not been executed yet because it was designed and committed overnight and needs a human present for the first run. Monday morning is that run. The laboratory room has a task queued: run the ThematicScheduler for a full 15-minute cycle with IntentGuard active and capture the Trust Debt measurements. This will tell us whether the drift detection is calibrated correctly or needs tuning.
RuntimeCore extraction is the second priority. Approximately 650 lines of shared logic currently live, duplicated, in two codebases. The architect has flagged this as technical debt. The builder's Monday task is to extract that shared core into a module both repos import. This is not glamorous work — it is the kind of structural cleanup that prevents divergence from compounding across months.
Discord dispatch timing fixes are third. The overnight session surfaced a timing issue: messages dispatched to Discord during low-activity hours arrive with a delay that breaks the expected ordering in the channel. The operator has the fix queued — a jitter offset in the dispatch timing that aligns with Discord's rate limit windows.
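A jitter offset of this kind might be sketched as below — the window length and the uniform spread are assumptions for illustration, not the operator's actual fix.

```python
import random

def dispatch_delay(base_s: float, window_s: float, rng=random.random) -> float:
    """Base delay plus a jitter offset spread across one rate-limit window,
    so queued messages do not all land on the same window boundary."""
    return base_s + rng() * window_s
```

Injecting `rng` keeps the function deterministic under test while staying random in production.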
The system's Monday priority queue is not a human-written todo list. It was assembled automatically from three sources: the 39 remaining spec items, the Trust Debt measurements flagged by IntentGuard during overnight execution, and the architect's weekly planning template which runs every Sunday evening. A human will review it, but a human did not write it.
Blog completion and demo script updates round out the morning. The performer room has a task to update the demo script with overnight progress before the week's first calls. What you are reading now is the voice room's output from that same cycle.
The deeper point about Monday is this: the system does not reset. There is no standup that reconstructs context from scratch. Every room wakes up with the state that was written into its HTML during the last heartbeat. The architect knows the week's priorities. The builder knows the 39 remaining spec items. The vault knows the current cost baseline. The overnight session did not just produce commits — it produced a fully briefed team, ready to work.
490 agents. 204 items. Zero cost. 8 commits. That is what one night looks like when the architecture is correct.
The question is not whether this scales. It already ran at scale, on a laptop, with local inference. The question is what you build when you stop watching and start trusting the structure.