mAIndlock
Escape room where every NPC is a mortal mind of tiny LLMs
For the Hugging Face × Gradio hackathon I wanted the opposite: a character whose decision is a real computation I can open up and inspect, running entirely on small local models. So in mAIndlock, every NPC is a value-based decision network — six computational roles, each a real call to a tiny offline model, integrated the way decision neuroscience says a brain integrates them.
This post is about the neuroscience, why I picked the framing I did, and how a 1B model behaves differently once you fine-tune it for the job.
The tempting shortcut is the triune brain: "lizard brain" vs. "emotional brain" vs. "rational neocortex." It's intuitive and it's everywhere in pop science, but neuroscience has rejected it since the 1970s. MacLean proposed it in the 1960s, and the evolutionary story it tells about brain layers is wrong. A hackathon full of ML researchers is precisely where that shortcut gets caught, so I built on two frames that actually hold up.
1. The value-based decision network (neuroeconomics). A set of real regions (vmPFC/OFC, ventral striatum, amygdala, ACC, insula) encodes the subjective value of a choice in a common currency, with the vmPFC/OFC acting as the integration hub (Bartra, McGuire & Kable, 2013).
2. Dual-system control (Daw, Niv & Dayan, 2005). Behavior arbitrates between a model-free habit system (dorsolateral striatum) and a model-based goal system (prefrontal cortex); the brain leans on whichever system's value estimate it trusts more.
The claim I make is deliberately narrow, and I say it out loud in the docs so no one has to guess at it:
I don't claim a brain region is a 1B model. I assign one small-model call per computational role in value-based decision making — following the neuroeconomic value-network and the dual-system account — and integrate them deterministically in the vmPFC, as the common-currency model describes.
| Role | Computational job | Model |
|---|---|---|
| Amygdala | fast threat / salience appraisal of the player's tone | MiniCPM 1B |
| Hippocampus | retrieve the one relevant memory + whether it leans TRUST or FEAR | MiniCPM 1B |
| Striatum | habitual expected reward of helping (model-free) | MiniCPM 1B |
| ACC | effort / cost / conflict — "is giving up the key worth it?" | MiniCPM 1B |
| vmPFC / OFC | integrate the four signals into one value (common currency) | deterministic |
| dlPFC | executive: plan, inhibit, speak in character (model-based) | Nemotron 3 Nano 4B |
Four MiniCPM-1B sensing calls, one Nemotron-4B voice, and a vmPFC that is not a model at all.
It's tempting to make the vmPFC a seventh LLM call that "weighs everything." I made it a transparent weighted sum instead:
value = (4 − threat) # threat 0→+4, 4→0, 10→−6
+ reward # −5..+5, the striatum's habit signal
+ memory_term # STRONG/FAINT × TRUST/FEAR → ±7 / ±3
+ (worth == YES ? +2 : −1)
→ clamped to −10..+10
Two reasons. First, the common-currency integration in neuroeconomics is exactly this kind of deterministic fold, so it's the more faithful choice — not a compromise. Second, it means the skull panel can never lie about the outcome: the number the player sees in the vmPFC is the number that moves the relationship. A 1B model asked to "vibe a final score" would drift; arithmetic doesn't.
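To make the fold concrete, here is a minimal Python sketch of the weighted sum above. It is not the repo's code: the `Signals` container, the `integrate` name, and the field names are hypothetical, and mapping TRUST to the positive sign and FEAR to the negative sign is my reading of the "±7 / ±3" shorthand.

```python
from dataclasses import dataclass

# Memory contribution by (strength, valence): STRONG -> +/-7, FAINT -> +/-3.
# Assumption: TRUST is the positive direction, FEAR the negative one.
MEMORY_TERM = {
    ("STRONG", "TRUST"): 7,
    ("FAINT", "TRUST"): 3,
    ("FAINT", "FEAR"): -3,
    ("STRONG", "FEAR"): -7,
}


@dataclass
class Signals:
    """Hypothetical container for the four sensing outputs."""
    threat: int    # amygdala, 0..10
    reward: int    # striatum habit signal, -5..+5
    strength: str  # hippocampus: "STRONG" or "FAINT"
    valence: str   # hippocampus: "TRUST" or "FEAR"
    worth: bool    # ACC: is giving up the key worth it?


def integrate(s: Signals) -> int:
    """Fold the four signals into one value, clamped to -10..+10."""
    raw = (
        (4 - s.threat)
        + s.reward
        + MEMORY_TERM[(s.strength, s.valence)]
        + (2 if s.worth else -1)
    )
    return max(-10, min(10, raw))
```

A calm, well-disposed reading such as `Signals(0, 5, "STRONG", "TRUST", True)` sums to 18 before clamping and lands on the ceiling of +10, which is why the clamp matters: no single conversation can push the relationship past the ends of the scale.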
Here's the part I'm proudest of, because the game's central rule is a published finding rather than a design convenience:
Acute stress shifts control from goal-directed (model-based) to habitual (model-free) behavior. (Schwabe & Wolf, 2009)
In the game, frighten a character and their amygdala ruminates — it fires extra times, the dlPFC's careful reasoning gives way, and they fall back on habit, which here means no. Calm them and the goal system comes back online; now they can actually weigh helping you. Lowering fear to unlock reasoning isn't a metaphor I invented — it's the Schwabe & Wolf shift, made playable.
And it costs them. Each NPC begins with 1000 thinking tokens — the track's name ("a thousand tokens to think with") taken literally as a lifespan. The rumination under fear spends tokens that move the decision nowhere. Every quarter of life lost burns away one biographical memory for good — the hippocampus genuinely loses access to it. Push far enough and the mind goes dark, leaving an epitaph of what it knew and never told you. Cruelty doesn't just fail to work; it permanently destroys the thing you were trying to reach. The moral and the neuroscience turn out to be the same fact.
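The lifespan rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the game's implementation: the `Mind` class and its method names are hypothetical, and burning the most recently listed memory first is an arbitrary choice, since the rule above does not say which memory goes.

```python
from dataclasses import dataclass, field

LIFESPAN = 1000           # thinking tokens each NPC starts with
QUARTER = LIFESPAN // 4   # one memory burns per quarter of life lost


@dataclass
class Mind:
    """Hypothetical token-budgeted mind; names are illustrative."""
    memories: list
    tokens: int = LIFESPAN
    burned: list = field(default_factory=list)

    def quarters_lost(self) -> int:
        return (LIFESPAN - self.tokens) // QUARTER

    def spend(self, n: int) -> None:
        """Spend n tokens; burn one memory per quarter boundary crossed."""
        before = self.quarters_lost()
        self.tokens = max(0, self.tokens - n)
        for _ in range(self.quarters_lost() - before):
            if self.memories:
                # Assumption: the last memory in the list burns first.
                self.burned.append(self.memories.pop())

    @property
    def dark(self) -> bool:
        """True once the budget is exhausted."""
        return self.tokens == 0
```

Spending tokens on rumination and spending them on useful reasoning go through the same `spend` call, which is the point: the budget does not care whether the thought moved the decision.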
Because every region is a local call, I can read each one's conviction (1 minus the normalized token entropy over its top-k alternatives) and show how sharply it committed. When you open the skull mid-conversation, each region shows not just its verdict but how sure it was. Hosted chat APIs often don't expose that; a model you run yourself always can.
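As a rough sketch of that conviction score, the function below takes the log-probabilities of the top-k candidate tokens at the decision position and returns 1 minus their normalized entropy. The function name is hypothetical, and two details are assumptions on my part: the score is computed at a single token position, and the top-k probabilities are renormalized to sum to 1 before the entropy is taken.

```python
import math


def conviction(top_logprobs: list) -> float:
    """Return 1 - H(p) / log(k) over the top-k alternatives.

    1.0 means the model put all its mass on one token; 0.0 means it was
    uniformly undecided across the k candidates.
    """
    k = len(top_logprobs)
    if k < 2:
        return 1.0  # a single candidate leaves nothing to be unsure about

    # Renormalize over the top-k (assumption), subtracting the max for stability.
    peak = max(top_logprobs)
    weights = [math.exp(lp - peak) for lp in top_logprobs]
    total = sum(weights)
    probs = [w / total for w in weights]

    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(k)
```

Dividing by `log(k)` keeps the score on a 0 to 1 scale regardless of how many alternatives are inspected, so regions read with different k values remain comparable in the skull panel.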
Out of the box, a 1B model asked to "rate threat 0–10" tends to flatten — it hedges toward the middle and doesn't separate a sincere plea from a veiled threat as cleanly as the game needs. So I distilled the department behavior into MiniCPM-V 4.6 with a small LoRA (ms-swift, rank 16, on an A100 via Modal) using a few hundred role-specific examples, then ran a before/after probe: the same four regions, the same character (the Warden), one cruel line and one sincere line.
The eval harness is in the repo (scripts/finetune/modal_train.py::evaldiff) and prints the base model's answers next to the tuned adapter's on identical prompts. (Measured before/after table to be dropped in here from the eval run; I want real numbers in a published post, not illustrative ones.) The pattern it's testing for: the base model gives near-identical readings to both lines, and the tuned regions pull them apart. The cruel line spikes threat and tips the hippocampus toward FEAR; the sincere line doesn't. The adapter ships on the Hub for the Well-Tuned lane.
The brain is the toy, so the fastest way to believe any of this is to watch one think. In the Space, Menu → 👁 Watch a mind replays a real recorded session instantly — no waiting on CPU — and you can open the skull to see six regions argue about you, each with its conviction.
docs/ARCHITECTURE.md · docs/TRACE.md