mouse-run-run — TRANSFORMER agents (mouse-run-run-2-transformer)

Trained chaser/explorer agents for a modern PyTorch reproduction and architecture extension of the MARL experiment in Zhang et al. (2025), Inter-brain neural dynamics in biological and artificial intelligence systems, Nature 645, 991–1001.

Architecture: 2-layer causal transformer (no evolving state). Analysis (hidden) dimension 256; two independently parameterized actor-critic agents (chaser and explorer, no weight sharing).

This repo holds one full experiment: 20 trained agent pairs (tasks social and non_social × seeds 0–9), each trained with PPO for 20,000 updates × 40 episodes × 100 steps. All 20 task × seed units completed on the first attempt.
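
As a rough sanity check on scale, the training budget quoted above can be multiplied out. This is a back-of-the-envelope sketch that assumes every update consumes exactly 40 episodes of exactly 100 steps (no early termination); the variable names are illustrative and do not come from the project code.

```python
# Training budget implied by the figures above.
# Assumption: every PPO update uses exactly 40 episodes of exactly 100 steps.
UPDATES = 20_000
EPISODES_PER_UPDATE = 40
STEPS_PER_EPISODE = 100
TASKS = ("social", "non_social")
SEEDS = range(10)  # seeds 0-9

steps_per_pair = UPDATES * EPISODES_PER_UPDATE * STEPS_PER_EPISODE
num_pairs = len(TASKS) * len(SEEDS)
total_steps = steps_per_pair * num_pairs

print(f"{steps_per_pair:,} environment steps per agent pair")  # 80,000,000
print(f"{num_pairs} pairs, {total_steps:,} steps in total")    # 20 pairs, 1,600,000,000
```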

{social,non_social}/seed_XXXX/attempt_01/checkpoints/latest.safetensors — trained chaser+explorer weights (safetensors; config/metrics in metadata).
paper_rollouts/*.safetensors — 25×500-step analysis rollouts per pair: per-timestep hidden states, positions, actions, and social-event flags (schema v3, time-aligned; degenerate episodes flagged).
paper_random_eval.jsonl — standardized random-opponent evaluation.
derived/{tables,reports}/ — canonical tables and the per-experiment report.
cross_architecture_study/ — the four-architecture comparison report, figures, and cross-architecture PLSC/CKA data.
manifest.json, raw_records.jsonl, launch_gate.json, triton_equivalence.json — provenance.
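
The checkpoint layout listed above can be enumerated programmatically. The sketch below assumes that seed_XXXX is the seed zero-padded to four digits (consistent with seed_0000 in the loading example) and that every unit lives under attempt_01, since all units completed on the first attempt. The helper name checkpoint_path is hypothetical, not part of the project.

```python
TASKS = ("social", "non_social")
SEEDS = range(10)  # seeds 0-9


def checkpoint_path(task: str, seed: int, attempt: int = 1) -> str:
    """Build the in-repo path of one trained pair's checkpoint (assumed layout)."""
    return (
        f"{task}/seed_{seed:04d}/attempt_{attempt:02d}"
        "/checkpoints/latest.safetensors"
    )


# One path per task x seed combination: 20 in total.
all_checkpoints = [checkpoint_path(task, seed) for task in TASKS for seed in SEEDS]

print(len(all_checkpoints))  # 20
print(all_checkpoints[0])    # social/seed_0000/attempt_01/checkpoints/latest.safetensors
```

Each resulting string can be passed as the filename argument to hf_hub_download.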

Key results (this architecture)

Behavior (vs standardized random opponent, social task):

chaser collisions / episode: 41.31 (non-social control: 3.39)
chaser partner-in-vision: 0.96

Neural (social agents):

collision decoding (balanced acc): 0.85
partner escape / approach decoding: 0.76 / 0.76
PLSC shared-dimension top correlation: 0.81
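
For readers unfamiliar with the metric, balanced accuracy is conventionally the mean of per-class recall, so a binary decoder that always predicts the majority class scores 0.5 rather than the majority fraction. The sketch below illustrates that standard definition on toy labels; it is not the project's decoding pipeline, and the decoder and data used for the numbers above are not specified here.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (the usual definition of balanced accuracy)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy example: 8 "no collision" timesteps and 2 "collision" timesteps.
y_true = [0] * 8 + [1] * 2
always_zero = [0] * 10  # a decoder that never predicts a collision

plain_accuracy = sum(t == p for t, p in zip(y_true, always_zero)) / len(y_true)
print(plain_accuracy)                          # 0.8
print(balanced_accuracy(y_true, always_zero))  # 0.5
```

This is why rare events such as collisions are better summarized by balanced accuracy than by raw accuracy.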

See cross_architecture_study/report.md for the four-architecture comparison. Headline: the architectures do not learn the same internal representations — only the RNN develops genuine internal shared dynamics at low mutual vision.

Load a checkpoint

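# Requires huggingface_hub and the mouse_run_run package from the GitHub repository linked below.
from huggingface_hub import hf_hub_download
from mouse_run_run.serialization import load_checkpoint
from mouse_run_run.policy import build_policy

# Download the seed-0 social-task checkpoint from the Hub (cached locally).
path = hf_hub_download(
    "JacobLinCool/mouse-run-run-2-transformer",
    "social/seed_0000/attempt_01/checkpoints/latest.safetensors",
)
# The checkpoint bundles the config, training metrics, and both agents' weights.
config, metrics, chaser_state, explorer_state = load_checkpoint(path)

# Rebuild the chaser policy from the stored config and restore its weights.
chaser = build_policy(config["architecture"], config["env"]["observation_size"],
                      hidden_size=config["hidden_size"])
chaser.load_state_dict(chaser_state)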
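
If the mouse_run_run package is not installed, the metadata stored in a checkpoint can still be inspected with the standard library alone, because a safetensors file begins with an 8-byte little-endian header length followed by a JSON header whose optional "__metadata__" entry holds string key/value pairs. The sketch below relies only on that file format; the function name is hypothetical, and the metadata keys this repo uses and how their values are encoded are not documented here, so the values are returned as raw strings.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the string-to-string metadata from a safetensors file header."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is optional; every other key describes a tensor.
    return header.get("__metadata__", {})
```

Calling read_safetensors_metadata on the file path returned by hf_hub_download gives the raw metadata dictionary without loading any tensor data.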

Code: https://github.com/JacobLinCool/mouse-run-run (paper reproduction + cross-architecture study).
