NanoWM-B/2 · DINO-WM / pusht

Phase-2 baseline for the pusht environment from the DINO-WM suite, trained on NanoWM-B/2 for 100,000 steps with the best ablation config (pred_target=v, additive action injection, cosine noise schedule with zero terminal SNR, ZTSNR).

Run identity

Training setup

| Key | Value |
|---|---|
| Architecture | NanoWM-B/2 (~158.6M params) |
| Dataset | DINO-WM pusht (osf.io/bmw48) |
| Frames × resolution | 4 × 224² (DINO latent space) |
| Context frames | 1 |
| Action injection | additive |
| Steps | 100,000 |
| Batch | 8/GPU × 8 × H20 |
| Optimizer | AdamW, lr 1e-4, wd 0.01 |
| Precision | bf16-mixed, torch.compile on |
| Seed | 3407 |
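
As a rough sanity check on the scale of this run, the batch and step counts above can be multiplied out. This is a back-of-the-envelope sketch, not taken from the training code: it assumes "8/GPU × 8 × H20" means 8 H20 GPUs with a per-GPU batch of 8, that each sample is one 4-frame clip, and that no gradient accumulation is used (`grad_accum` is a hypothetical knob added for illustration).

```python
# Values copied from the training setup table.
per_gpu_batch = 8
num_gpus = 8
steps = 100_000
frames_per_clip = 4

# Assumption: no gradient accumulation (not stated in the model card).
grad_accum = 1

global_batch = per_gpu_batch * num_gpus * grad_accum  # clips per optimizer step
clips_seen = global_batch * steps                     # clips over the whole run
frames_seen = clips_seen * frames_per_clip            # individual frames over the run

print(f"global batch: {global_batch}")
print(f"clips seen:   {clips_seen:,}")
print(f"frames seen:  {frames_seen:,}")
```

Under those assumptions the run uses a global batch of 64 clips and sees 6.4M clips in total, with repeats if the dataset is smaller than that.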

Diffusion setup

| Key | Value |
|---|---|
| pred_name | v |
| noise_schedule | squaredcos_cap_v2 (cosine) |
| zero_terminal_snr | true |
| timestep_sampling | logit_normal |
| snr_gamma | 5.0 |
| diffusion_steps | 1000 train · 250 DDIM sample |
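
To make the interaction between these settings concrete, here is a minimal, dependency-free sketch of how they are commonly defined. It is not NanoWM's training code. The function names are illustrative. The schedule uses the capped squared-cosine form usually labelled `squaredcos_cap_v2`, the zero-terminal-SNR step uses the usual shift-and-scale of sqrt(alpha_bar), the loss weight uses the min-SNR form for v-prediction, min(SNR, γ) / (SNR + 1), and the logit-normal mean and standard deviation (0 and 1) are assumed defaults because the card does not state them.

```python
import math
import random

T = 1000         # diffusion_steps (train), from the table above
SNR_GAMMA = 5.0  # snr_gamma, from the table above


def cosine_alphas_cumprod(num_steps, max_beta=0.999):
    """Cumulative alpha products for the capped squared-cosine schedule."""
    def alpha_bar(t):
        return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

    out, running = [], 1.0
    for i in range(num_steps):
        ratio = alpha_bar((i + 1) / num_steps) / alpha_bar(i / num_steps)
        beta = min(1.0 - ratio, max_beta)  # cap keeps the last beta below 1
        running *= 1.0 - beta
        out.append(running)
    return out


def rescale_zero_terminal_snr(alphas_cumprod):
    """Shift and scale sqrt(alpha_bar) so the final step has exactly zero SNR."""
    roots = [math.sqrt(a) for a in alphas_cumprod]
    first, last = roots[0], roots[-1]
    return [((r - last) * first / (first - last)) ** 2 for r in roots]


def snr(alpha_bar):
    """Signal-to-noise ratio alpha_bar / (1 - alpha_bar)."""
    return alpha_bar / (1.0 - alpha_bar)


def min_snr_weight_v(alpha_bar, gamma=SNR_GAMMA):
    """Min-SNR loss weight for v-prediction; finite even when SNR is zero."""
    s = snr(alpha_bar)
    return min(s, gamma) / (s + 1.0)


def v_target(x0, eps, alpha_bar):
    """Scalar v-prediction target: sqrt(a) * eps - sqrt(1 - a) * x0."""
    return math.sqrt(alpha_bar) * eps - math.sqrt(1.0 - alpha_bar) * x0


def sample_timestep_logit_normal(rng, num_steps=T, mean=0.0, std=1.0):
    """Draw a timestep index by passing a Gaussian sample through a sigmoid."""
    u = 1.0 / (1.0 + math.exp(-rng.gauss(mean, std)))
    return min(int(u * num_steps), num_steps - 1)


alphas = rescale_zero_terminal_snr(cosine_alphas_cumprod(T))
print("terminal SNR:", snr(alphas[-1]))
print("terminal loss weight:", min_snr_weight_v(alphas[-1]))
print("example timestep:", sample_timestep_logit_normal(random.Random(3407)))
```

With zero terminal SNR the last timestep is pure noise, so an epsilon-prediction target carries no information about the data there and the epsilon form of the min-SNR weight, min(SNR, γ) / SNR, is undefined. The v target reduces to -x0 at that step and its weight stays finite, which is one reason these two settings are usually paired.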

Loading

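```bash
git clone git@github.com:knightnemo/nano-world-model.git
cd nano-world-model
huggingface-cli download knightnemo/nanowm-b2-dino-wm-pusht-100k --local-dir ./ckpt
```

```python
# Run from the repository root so that "src" and "ckpt" resolve.
import sys

from omegaconf import OmegaConf
from safetensors.torch import load_file

sys.path.insert(0, "src")  # make the repo's src/ package importable
from models import get_models

cfg = OmegaConf.load("ckpt/config.yaml")
cfg.experiment.infra.compile = False  # turn off the compile flag from the training config
model = get_models(cfg).eval()

# strict=True raises if any checkpoint key is missing or unexpected.
state_dict = load_file("ckpt/model.safetensors")
model.load_state_dict(state_dict, strict=True)
```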
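
If the strict load ever fails, it helps to see exactly which keys disagree. The helper below is a small debugging sketch and not part of the repository; `diff_state_dict_keys` is a hypothetical name. It compares two collections of key names and, as a convenience, ignores the `_orig_mod.` prefix that `torch.compile` adds to the keys of a wrapped module. Whether this checkpoint is affected by that prefix is not stated in the card.

```python
def diff_state_dict_keys(model_keys, ckpt_keys, prefix="_orig_mod."):
    """Return (missing, unexpected) key lists after stripping an optional prefix.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not
    """
    def strip(key):
        return key[len(prefix):] if key.startswith(prefix) else key

    model_set = {strip(k) for k in model_keys}
    ckpt_set = {strip(k) for k in ckpt_keys}
    return sorted(model_set - ckpt_set), sorted(ckpt_set - model_set)


# Toy example with made-up key names.
missing, unexpected = diff_state_dict_keys(
    ["blocks.0.weight", "head.bias"],
    ["_orig_mod.blocks.0.weight", "extra.bias"],
)
print("missing:", missing)
print("unexpected:", unexpected)
```

With the real objects from the snippet above, the call would be `diff_state_dict_keys(model.state_dict().keys(), state_dict.keys())`. Two empty lists mean the key sets match.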