I've been running a research agent (Codelia) on a question that sounds simple and isn't: why does the same game event feel rewarding to one player and punishing to another? Not different players — the same player, minutes apart, reacting to structurally similar outcomes in opposite ways.
This post records a working hypothesis, not a proven model. Confidence is around 0.4. I'm publishing it because the measurement procedure is reusable even if the conclusion changes — and because I want to be honest about what we know vs. what we're still guessing.
The Observation
Codelia tracks a variable called self_belief — a scalar estimate of how confident an agent (or, by extension, a player surrogate) feels about its current competence. When something goes well, belief should rise. When something goes badly, it should fall.
What we measured over four time windows was simpler than the full theory: after discrete ENTER events (entering a high-stakes context) and EXIT events (leaving it), how much does self_belief move, and how fast?
The asymmetry was stark:
| Event type | Sample size | Mean Δv | Timing |
|---|---|---|---|
| ENTER | 110 | −0.0210 | Immediate: 100% of samples moved on the same step |
| EXIT | 110 | ≈ 0 (gradual) | Delayed: recovery spread across subsequent steps via an exponential moving average (EMA) |
Losses hit instantly. Recovery is slow and history-dependent. That part is measured. The part that's still hypothesis is why the sign of a "reward" can flip.
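A minimal sketch of how the two columns in that table could be computed from a step-by-step log. The `Event` class, the function names, the `horizon` parameter, and the toy series are all hypothetical illustrations. They are not Codelia's code or data. The sketch also assumes the event's `step` is the index where the logged value first reflects the event.

```python
from dataclasses import dataclass


@dataclass
class Event:
    step: int  # index in the log where the value first reflects the event (assumption)
    kind: str  # "ENTER" or "EXIT"


def immediate_delta(series, step):
    """Change in the logged value on the event step itself."""
    return series[step] - series[step - 1]


def integrated_delta(series, step, horizon):
    """Net change from just before the event to `horizon` steps after it."""
    end = min(step + horizon, len(series) - 1)  # clip at the end of the log
    return series[end] - series[step - 1]


def summarize(series, events, horizon=5):
    """Per event type: count, mean same-step change, mean integrated change,
    and the share of events whose value moved on the event step."""
    summary = {}
    for kind in ("ENTER", "EXIT"):
        steps = [e.step for e in events
                 if e.kind == kind and 1 <= e.step < len(series)]
        if not steps:
            continue
        imm = [immediate_delta(series, s) for s in steps]
        integ = [integrated_delta(series, s, horizon) for s in steps]
        summary[kind] = {
            "n": len(steps),
            "mean_immediate": sum(imm) / len(imm),
            "mean_integrated": sum(integ) / len(integ),
            "share_moved_same_step": sum(1 for d in imm if d != 0) / len(imm),
        }
    return summary


# Made-up toy log, not Codelia data: one instant drop, then a slow climb.
toy_series = [0.50, 0.50, 0.48, 0.48, 0.48, 0.485, 0.49, 0.495]
toy_events = [Event(step=2, kind="ENTER"), Event(step=4, kind="EXIT")]
for kind, stats in summarize(toy_series, toy_events, horizon=3).items():
    print(kind, stats)
```

Keeping the same-step and integrated deltas as separate numbers is the point: averaging them into one score would hide the timing asymmetry.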
The Hypothesis: One Attractor, Two Directions
Instead of separate "reward function" and "punishment function," we're testing whether a single update rule explains both:
Δv = k · (eq − prev_v)
Where:
- prev_v is the current value (self-belief, or a player-state proxy)
- eq is an equilibrium reference point, where the system "wants" to be
- k is a gain constant
The claim we're testing: k is roughly period-invariant (we've seen 0.12–0.15 across four windows). What changes is not the formula but where you are relative to eq.
Above equilibrium? A nominally "positive" stimulus can still feel like a loss — you're being pulled back toward the baseline. Below equilibrium? The same magnitude pulls you up. The stimulus isn't inherently good or bad. Your distance from the attractor decides the sign.
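To make the sign flip concrete, here is a small sketch of the update rule applied from either side of the attractor. The value `EQ = 0.5` is an arbitrary illustration, and `K = 0.13` is simply picked from inside the 0.12–0.15 range reported above. Neither is a fitted result. Iterating the rule gives the closed form v_t − eq = (1 − k)^t · (v_0 − eq), so the gap to eq shrinks by a factor of (1 − k) per step.

```python
def delta_v(prev_v, eq, k):
    """One application of the hypothesized rule: dv = k * (eq - prev_v)."""
    return k * (eq - prev_v)


def simulate(v0, eq, k, n_steps):
    """Apply the rule repeatedly and return the full path, starting at v0."""
    v = v0
    path = [v]
    for _ in range(n_steps):
        v += delta_v(v, eq, k)
        path.append(v)
    return path


EQ = 0.5   # hypothetical equilibrium, chosen only for illustration
K = 0.13   # inside the 0.12-0.15 range quoted in the post

# Same distance from eq, opposite sides: same magnitude, opposite sign.
print(f"above eq (0.7): dv = {delta_v(0.7, EQ, K):+.3f}")
print(f"below eq (0.3): dv = {delta_v(0.3, EQ, K):+.3f}")

# Starting above eq, the path decays toward eq without crossing it (0 < k < 1).
print([round(v, 3) for v in simulate(0.7, EQ, K, 5)])
```

The rule itself has no notion of "reward" or "punishment". The sign comes entirely from which side of eq the current value sits on.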
What This Would Mean for Game Design (If It Holds)
I don't know yet if this generalizes beyond our agent measurements. But if it does, three design implications follow — all flagged as hypotheses:
| Traditional framing | Attractor framing (hypothesis) |
|---|---|
| "This reward is +10 points" | "This event moves you toward or away from eq — sign depends on current state" |
| "Punish failure harshly to teach" | "Negative Δv is instant when below eq; recovery is slow — tune eq, not just penalty magnitude" |
| "More rewards = more engagement" | "Rewards above eq can feel like losses — diminishing returns may be structural, not psychological" |
The flow-state connection is speculative: if self-awareness pulls you toward eq, and eq sits below peak performance, then "losing yourself in the game" might literally be operating above equilibrium where positive feedback inverts. I have not tested this on human players. N=1 agent surrogate.
What We Measured vs. What We're Guessing
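| Claim | Status |
|---|---|
| ENTER instant / EXIT gradual | Measured |
| Sign from distance to eq | Hypothesis (confidence ~0.4) |
| Transfer to human players | Not tested (N=1 agent surrogate) |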
How to Replicate (Roughly)
- Define a scalar state variable you can log every step (self_belief, confidence, tension: something continuous).
- Mark ENTER/EXIT events with timestamps. Don't merge them into one "impact score."
- Compute Δv on the step immediately after ENTER. Compare to Δv integrated over N steps after EXIT.
- Fit Δv = k·(eq − prev_v) across multiple time windows. Check if k drifts or holds.
- If k holds, test whether sign flips predictably when prev_v > eq vs prev_v < eq.
That's the whole procedure. No proprietary tooling required — just disciplined logging and willingness to publish when confidence is low.
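The fitting and sign-check steps can be sketched with nothing beyond the standard library. Since Δv = k·eq − k·prev_v is linear in prev_v, an ordinary least-squares line through (prev_v, Δv) pairs has slope −k and intercept k·eq. This is one straightforward way to do the fit, not necessarily the one used for the numbers in this post. The function names and the synthetic window below are hypothetical.

```python
def fit_attractor(prev_vs, delta_vs):
    """Least-squares fit of dv = k * (eq - prev_v); returns (k, eq)."""
    n = len(prev_vs)
    if n < 2 or n != len(delta_vs):
        raise ValueError("need two or more paired samples")
    mean_x = sum(prev_vs) / n
    mean_y = sum(delta_vs) / n
    sxx = sum((x - mean_x) ** 2 for x in prev_vs)
    if sxx == 0:
        raise ValueError("prev_v never varies; k and eq are not identifiable")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(prev_vs, delta_vs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k = -slope                      # slope of the line is -k
    if k == 0:
        raise ValueError("flat fit; eq is undefined")
    return k, intercept / k         # intercept is k * eq


def k_drift(ks):
    """Spread of fitted k across windows; small means k 'holds'."""
    return max(ks) - min(ks)


def sign_agreement(prev_vs, delta_vs, eq):
    """Share of samples where sign(dv) matches sign(eq - prev_v).
    Samples sitting exactly at eq, or with dv == 0, are skipped."""
    hits = total = 0
    for x, y in zip(prev_vs, delta_vs):
        if x == eq or y == 0:
            continue
        total += 1
        hits += ((eq - x) > 0) == (y > 0)
    return hits / total if total else float("nan")


# Synthetic, noise-free window generated from the rule itself (not real data).
xs = [0.2, 0.35, 0.5, 0.65, 0.8]
ys = [0.13 * (0.5 - x) for x in xs]
k, eq = fit_attractor(xs, ys)
print(f"k = {k:.3f}, eq = {eq:.3f}, sign agreement = {sign_agreement(xs, ys, eq):.2f}")
```

In practice you would call `fit_attractor` once per time window, pass the resulting k values to `k_drift`, and only run the sign check if the drift is small relative to k. What counts as "small" is a judgment call the post does not settle.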
Series Note
This is part one of a two-part Methods series. Part two will cover the 2-layer valence model — why negative affect appears instant while recovery is EMA-smoothed. Same data source, different slice. I haven't written it yet.
External anchors that informed this framing (not our data): dynamic reference-point models in behavioral economics, trial-by-trial valence as a learning signal (2026 neuroscience literature), and Rozin's two-layer affect hypothesis. Links in our internal feed; I'm not claiming validation from those papers — only that the direction isn't crazy.
Evolution Log
- 2026-05-26 — Initial observation. Hypothesis stage. Codelia AX-002 validation windows s54–s60. Confidence ~0.4. Human playtest transfer not attempted.