mouse-run-run: TRANSFORMER agents (mouse-run-run-2-transformer)
Trained chaser/explorer agents for a modern PyTorch reproduction and architecture extension of the MARL experiment in Zhang et al. (2025), Inter-brain neural dynamics in biological and artificial intelligence systems, Nature 645, 991–1001.
Architecture: 2-layer causal transformer (no evolving state). Analysis (hidden) dimension 256; two independently parameterized actor-critic agents (chaser and explorer, no weight sharing).
This repo holds one full experiment: 20 trained agent pairs (tasks
social and non_social × seeds 0–9), each trained with PPO for 20,000
updates × 40 episodes × 100 steps. 20/20 units completed on the first attempt.
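The experiment grid and training budget described above can be tallied with a short sketch. It assumes that the budget means 40 episodes per PPO update and 100 steps per episode, and that seed directories are zero-padded to four digits, as in the seed_0000 path used later in this README. The function and constant names are illustrative and are not part of the repository.

```python
from itertools import product

TASKS = ("social", "non_social")
SEEDS = range(10)  # seeds 0-9

# Assumed reading of "20,000 updates x 40 episodes x 100 steps".
UPDATES = 20_000
EPISODES_PER_UPDATE = 40
STEPS_PER_EPISODE = 100


def unit_dirs():
    """Yield one directory prefix per (task, seed) training unit."""
    for task, seed in product(TASKS, SEEDS):
        yield f"{task}/seed_{seed:04d}/attempt_01"


units = list(unit_dirs())
steps_per_pair = UPDATES * EPISODES_PER_UPDATE * STEPS_PER_EPISODE
total_steps = steps_per_pair * len(units)

print(len(units))             # 20
print(units[0])               # social/seed_0000/attempt_01
print(f"{steps_per_pair:,}")  # 80,000,000
print(f"{total_steps:,}")     # 1,600,000,000
```

Under that reading, each agent pair sees 80 million environment steps, and the whole experiment covers 1.6 billion.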
Contents
- {social,non_social}/seed_XXXX/attempt_01/checkpoints/latest.safetensors: trained chaser+explorer weights (safetensors; config/metrics in metadata).
- paper_rollouts/*.safetensors: 25×500-step analysis rollouts per pair, with per-timestep hidden states, positions, actions, and social-event flags (schema v3, time-aligned; degenerate episodes flagged).
- paper_random_eval.jsonl: standardized random-opponent evaluation.
- derived/{tables,reports}/: canonical tables and the per-experiment report.
- cross_architecture_study/: the four-architecture comparison report, figures, and cross-architecture PLSC/CKA data.
- manifest.json, raw_records.jsonl, launch_gate.json, triton_equivalence.json: provenance.
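Because the checkpoints and rollouts are safetensors files with config/metrics stored in the metadata, their contents can be inspected without loading any tensor data. The following is a minimal standard-library sketch that relies only on the published safetensors layout (an 8-byte little-endian header length, then a JSON header whose optional `__metadata__` entry holds string metadata). The helper names are hypothetical, and no particular metadata keys or tensor names are assumed.

```python
import json
import struct


def read_safetensors_header(path):
    """Return (metadata, tensors) parsed from a safetensors file header.

    Only the header is read, so no tensor data is loaded into memory.
    """
    with open(path, "rb") as f:
        # First 8 bytes: header size as an unsigned little-endian 64-bit int.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is the optional free-form string-to-string map.
    metadata = header.pop("__metadata__", {})
    return metadata, header


def summarize(path):
    """Print the metadata keys and each tensor's dtype and shape."""
    metadata, tensors = read_safetensors_header(path)
    print("metadata keys:", sorted(metadata))
    for name, info in sorted(tensors.items()):
        print(name, info["dtype"], info["shape"])
```

Calling `summarize(path)` on a downloaded checkpoint or rollout file lists whatever metadata keys and tensors that file actually contains.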
Key results (this architecture)
Behavior (vs standardized random opponent, social task):
- chaser collisions / episode: 41.31 (non-social control: 3.39)
- chaser partner-in-vision: 0.96
Neural (social agents):
- collision decoding (balanced acc): 0.85
- partner escape / approach decoding: 0.76 / 0.76
- PLSC shared-dimension top correlation: 0.81
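For readers unfamiliar with the metric, balanced accuracy is the mean of per-class recall, which keeps a rare event such as a collision from being swamped by the majority class. The sketch below illustrates the standard definition on made-up toy labels; it is not the repository's decoding pipeline, and the function name is illustrative.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Toy labels (made up for illustration): 8 non-event steps, 2 event steps.
y_true = [0] * 8 + [1] * 2

# A decoder that always predicts "no event" has plain accuracy 0.8 ...
always_zero = [0] * 10
print(balanced_accuracy(y_true, always_zero))  # 0.5

# ... while one that finds half of the events scores above chance.
partial = [0] * 7 + [1] + [1, 0]
print(balanced_accuracy(y_true, partial))  # 0.6875
```

On this scale 0.5 is chance for a two-class problem, which is the baseline against which the decoding scores above should be read.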
See cross_architecture_study/report.md for the four-architecture comparison.
Headline: the architectures do not learn the same internal representations;
only the RNN develops genuine internal shared dynamics at low mutual vision.
Load a checkpoint
from huggingface_hub import hf_hub_download
from mouse_run_run.serialization import load_checkpoint
from mouse_run_run.policy import build_policy

# Download one trained pair (social task, seed 0) from the Hub.
path = hf_hub_download("JacobLinCool/mouse-run-run-2-transformer",
                       "social/seed_0000/attempt_01/checkpoints/latest.safetensors")

# The checkpoint holds both agents' weights plus the run's config and metrics.
config, metrics, chaser_state, explorer_state = load_checkpoint(path)

# Rebuild the chaser policy from the stored config and load its weights.
chaser = build_policy(config["architecture"], config["env"]["observation_size"],
                      hidden_size=config["hidden_size"])
chaser.load_state_dict(chaser_state)
Code: https://github.com/JacobLinCool/mouse-run-run (paper reproduction + cross-architecture study).