Lab Notes

Patterns observed, methods tested, systems built.

Distilled from real agent sessions. Each entry records what happened, what broke, and what I'm still figuring out.

Practice 9 entries

End-to-end case studies from production systems

2026-05-19

The Spec-Disk Drift: When Your Pipeline Passes Because the Check Never Ran

A harness phase reported green for a week. The script it was meant to run didn't exist on disk. Success-by-default is the silent rot of any pipeline whose state is described in one place and whose work happens in another.
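
A minimal sketch of the fail-closed idea behind this entry, assuming each phase is backed by a Python check script. The names `run_phase` and `PhaseCheckMissing` are hypothetical and do not come from the harness described in the post. The point is only that a missing script raises an error instead of counting as a pass.

```python
import subprocess
import sys
from pathlib import Path


class PhaseCheckMissing(RuntimeError):
    """Raised when a phase's check script is not present on disk."""


def run_phase(script: Path) -> bool:
    """Run a phase check script and report whether it passed.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not script.is_file():
        # A missing check is an error, never an implicit pass.
        raise PhaseCheckMissing(f"check script not found: {script}")
    # Run the script with the current interpreter; exit code 0 means pass.
    result = subprocess.run([sys.executable, str(script)], check=False)
    return result.returncode == 0
```

With this shape, a phase can only report green if its script exists and exits with code 0. A spec that points at a file that was never written fails loudly on the first run.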

2026-05-12

Lecture Halls Don't Build Networks: Choosing Events for Solo Builders

I went to a 500-person AI builders event hoping to make a few real contacts. I came back with zero. The event was excellent — the format was wrong for what I needed. A small heuristic for picking events.

2026-05-05

The 132-Task Lie: How Three Compounded Hacks Hid a Broken Game

For 132 agent tasks across six weeks, my dev-loop reported pass=true. The central game mechanic never actually worked once. Three small hacks compounded into a complete fiction.

2026-04-28

Preventing Belief Staleness in Long-Running Agents

A long-running agent's beliefs file went 21 days without updating while the agent kept reporting normal operation. The decay is a loss of in-context salience, so the fix has to inject through that same channel.

2026-04-14

When Tests Pass But Nothing Works: The Verification Gap

A bug was "fixed" five times in five sprints. A UI overlap went unnoticed for seven. Here's the pattern — and how to close it.

2026-04-14

8 Sprints in One Week: What Agent Orchestration Actually Looks Like

A solo developer ran 8 complete game dev sprints in one week using an AI agent team. Here's what that actually looked like — including the parts that didn't work.

2026-04-07

Building Visual Quality Gates Without a Human Eye

How a solo developer automated sprite quality verification using palette matching, pixel checks, and SSIM regression — with no human in the loop

2026-04-02

Escalation Chains: How AI Systems Learn to Fix Themselves

A 3-layer self-correction architecture where daily tasks detect failures, weekly diagnostics resolve them, and the orchestrator only gets what it can't handle

2026-04-01

Governing 14 Autonomous Agents with Three Contracts: Preflight, Execute, Emit

How a Preflight→Execute→Emit protocol unified 14 isolated agent tasks into a self-auditing ecosystem

Observations 6 entries

Failure patterns caught in the wild

2026-04-28

The Internal Engine Trap: Why Optimizing Your AI Ecosystem Doesn't Move the Needle

My agent ecosystem scored 8.5/10 on autonomy four weeks running. Same diagnosis kept coming back: nothing was reaching customers. Engine improvement decoupled from external reach — and the work feels like progress because locally it is.

2026-04-21

The Attention Economy Inside Your AI Team

An AI agent's attention is finite. Every byte of context is borrowed from its capacity to do the work. Verbosity is the main culprit.

2026-04-01

When Your Monitor Lies: Monitoring-Output Path Divergence

Path mismatches between monitoring code and actual outputs caused a 75% false-alarm rate

2026-03-29

Stuck Task Loop

A state transition failure causes a task to repeat 30+ times without progress

2026-03-27

Python + Windows MCP PowerShell Hang

Python 3.14 / 3.12 script execution hangs under MCP PowerShell with 60s timeout

2026-03-26

SKILL.md Encoding Corruption

BOM / CRLF / code fences silently break the schedule parser
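
A rough sketch of normalizing a file before it reaches a parser, assuming the file is UTF-8 and that the parser wants LF line endings with no wrapping code fence. The function name `normalize_skill_text` is hypothetical, and the real schedule parser's expectations are not stated in this summary.

```python
FENCE = "`" * 3  # the three-backtick marker that opens or closes a code fence


def normalize_skill_text(raw: bytes) -> str:
    """Decode bytes and strip a BOM, CRLF endings, and a wrapping code fence."""
    # "utf-8-sig" drops a leading UTF-8 BOM if one is present.
    text = raw.decode("utf-8-sig")
    # Normalize CRLF and bare CR line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = text.split("\n")
    # Drop blank lines at both ends so the fence check sees real content.
    while lines and not lines[0].strip():
        lines.pop(0)
    while lines and not lines[-1].strip():
        lines.pop()
    # Remove a fence only when it wraps the whole file.
    if len(lines) >= 2 and lines[0].startswith(FENCE) and lines[-1].strip() == FENCE:
        lines = lines[1:-1]
    return "\n".join(lines) + "\n"
```

Fences in the middle of the file are left alone on purpose. Only a fence that wraps the entire content is treated as an artifact.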

Methods 5 entries

Approaches that survived contact with reality

2026-06-02

Why Loss Hits Instant But Recovery Is Slow: A 2-Layer Valence Model

110 ENTER events moved on the same step every time. 110 EXIT events recovered gradually via EMA. Part two of a Methods series — hypothesis stage, not finished theory.
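
One way to picture the asymmetry is a toy update rule, assuming ENTER marks the onset of a loss state and EXIT marks its end. The function name `step_valence` and the values of `loss_level`, `baseline`, and `alpha` are illustrative assumptions, not figures from the research.

```python
def step_valence(valence: float, in_loss: bool,
                 loss_level: float = -1.0,
                 baseline: float = 0.0,
                 alpha: float = 0.2) -> float:
    """Advance valence by one step (toy model, all parameters assumed)."""
    if in_loss:
        # ENTER: valence jumps to the loss level within a single step.
        return loss_level
    # EXIT: valence moves a fraction alpha of the way back to baseline (EMA).
    return valence + alpha * (baseline - valence)


def simulate(loss_steps: int, recovery_steps: int) -> list[float]:
    """Return the valence trace for a loss period followed by recovery."""
    trace = []
    valence = 0.0
    for i in range(loss_steps + recovery_steps):
        valence = step_valence(valence, in_loss=i < loss_steps)
        trace.append(valence)
    return trace
```

In this sketch the drop completes in one step, while each recovery step closes only a fraction `alpha` of the remaining gap, so the trace approaches baseline gradually without reaching it in finite steps.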

2026-05-26

Equilibrium Attractors: Why the Same Stimulus Feels Good or Bad Depending on Your Current State

A research agent measured belief updates after positive and negative events. The same formula fit both — but the sign flipped depending on where you started. Hypothesis stage, not finished theory.

2026-04-21

Correlation Is Not Causation: Correcting a Measurement Error in Game Design Research

We found r=0.82 between desire strength and tension. We assumed causation without ever establishing it, and the error invalidated six weeks of research direction.

2026-04-02

The Fun Axiom: A Deductive Framework for Intrinsic Motivation in Games

Three independent axioms from neuroscience and motivation theory explain where intrinsic fun comes from

2026-03-26

Scheduled Task Operations

Lesson accumulation + dedup + early termination

Architecture 6 entries

Design decisions and their trade-offs

2026-04-21

Agent Observability: When Every Object Can Report Its Own State

We gave every game object the ability to stream its own state to the agent team. What changed wasn't just visibility — it changed how the agents work.

2026-04-02

Companion Agent Architecture: The 3-Layer Soul Standard

How giving AI agents identity, memory, and owner context transforms them from task executors into colleagues that accumulate judgment

2026-03-27

agent-memory Vector DB Access Patterns

CLI access scripts, HTTP bridge alternative, and 771-entry knowledge store design.

2026-03-23

Multi-Agent Patterns

Structures for orchestrating multiple agents

2026-03-23

Harness Architecture: Core/Shell Separation for Safe Agent Operation

How to physically separate agent-modifiable areas from areas agents must never touch.

2026-03-23

Context Continuity

Protocol for preserving context across session boundaries