PINN-JEPA: Physics-Informed Encoder for 3D Human Motion (Step 1)
A physics-informed neural network (PINN) encoder for 3D skeletal human motion. The encoder is pretrained in a self-supervised way to produce motion representations whose kinematic structure (position → velocity → acceleration → jerk) and bone geometry stay physically consistent. This repository releases the Step 1 PINN pretraining stage (encoder body + pretraining head) and a non-latest checkpoint.
Status: research release. The published weights are not the latest internal checkpoint, and are intended for reproducibility and experimentation, not production.
Model overview
| Input | (B, T, J, 12): per-joint state [p(3), v(3), a(3), j(3)] |
| Skeleton | H36M 17-joint topology (J = 17) |
| Output | token features (B, T, J, D) + reconstructed state s_hat (B, T, J, 12) |
| Backbone | State embedding → (GraphMix spatial + TemporalBlock) × depth → LayerNorm |
| Default size | d_model = 256, depth = 6, d_state = 64 |
| Framework | PyTorch (custom modules, no transformers dependency) |
The encoder predicts a residual on position only; velocity, acceleration and jerk are derived analytically via central differences, which is what keeps the representation kinematically consistent rather than letting each channel drift independently.
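To make the central-difference derivation concrete, here is a minimal sketch that builds the 12-channel state from positions alone. The names `central_diff_sketch` and `build_state` are hypothetical, and the one-sided treatment of the first and last frame is an assumption; the repository's own `central_diff` in Utils.py may use a different signature and boundary rule.

```python
import torch

def central_diff_sketch(x: torch.Tensor, fps: float) -> torch.Tensor:
    """Time derivative of x with shape (B, T, ...) along dim 1.

    Interior frames use (x[t+1] - x[t-1]) / (2 * dt); the two boundary frames
    fall back to one-sided differences (an assumption of this sketch).
    """
    dt = 1.0 / fps
    d = torch.empty_like(x)
    d[:, 1:-1] = (x[:, 2:] - x[:, :-2]) / (2.0 * dt)
    d[:, 0] = (x[:, 1] - x[:, 0]) / dt
    d[:, -1] = (x[:, -1] - x[:, -2]) / dt
    return d

def build_state(p: torch.Tensor, fps: float) -> torch.Tensor:
    """Stack [p, v, a, j] from positions p of shape (B, T, J, 3) -> (B, T, J, 12)."""
    v = central_diff_sketch(p, fps)
    a = central_diff_sketch(v, fps)
    j = central_diff_sketch(a, fps)
    return torch.cat([p, v, a, j], dim=-1)

# Constant-acceleration motion on every axis: p(t) = 0.5 * t^2.
fps = 50.0
t = torch.arange(32, dtype=torch.float64) / fps
p = (0.5 * t**2).view(1, -1, 1, 1).repeat(1, 1, 17, 3)
state = build_state(p, fps)
```

Because every channel is computed from the same positions, velocity, acceleration and jerk cannot disagree with each other away from the clip boundaries. In this example the interior velocity equals t, the acceleration settles at 1, and the jerk at 0.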
Training objective (Step 1)
Self-supervised state reconstruction combined with physics-aware regularizers:
- State reconstruction on p / v / a / j (weighted)
- Bone-length consistency over skeleton edges
- Kinematic consistency (finite-difference agreement between channels)
- Jerk regularization for motion smoothness
See PINN_Lossfunction.py for exact terms and default weights.
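As a rough illustration of the three physics-aware terms, the sketch below shows one common way to write each of them. It is not the code in PINN_Lossfunction.py: the function names, the parent list (a widely used H36M-style 17-joint layout, assumed here) and the exact form of each penalty are assumptions, and no loss weights are shown because the defaults live in the repository.

```python
import torch

# Assumed H36M-style parent index per joint (root has parent -1); the
# repository's Utils.py defines the skeleton edges that are actually used.
PARENTS = [-1, 0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]
EDGES = [(child, parent) for child, parent in enumerate(PARENTS) if parent >= 0]

def bone_lengths(p, edges=EDGES):
    """p: (B, T, J, 3) -> per-frame bone lengths (B, T, E)."""
    child = torch.tensor([c for c, _ in edges])
    parent = torch.tensor([q for _, q in edges])
    return (p[:, :, child] - p[:, :, parent]).norm(dim=-1)

def bone_length_loss(p_hat, p_ref):
    """Mean absolute gap between predicted and reference bone lengths."""
    return (bone_lengths(p_hat) - bone_lengths(p_ref)).abs().mean()

def kinematic_loss(s_hat, fps):
    """Interior-frame gap between the v channel and the central difference of p."""
    p, v = s_hat[..., 0:3], s_hat[..., 3:6]
    v_fd = (p[:, 2:] - p[:, :-2]) * (fps / 2.0)
    return (v[:, 1:-1] - v_fd).pow(2).mean()

def jerk_loss(s_hat):
    """Mean squared jerk, penalising non-smooth motion."""
    return s_hat[..., 9:12].pow(2).mean()

# A rigid translation moves every joint but leaves all bone lengths unchanged.
p = torch.randn(2, 16, 17, 3)
translated_gap = bone_length_loss(p + 1.0, p)

# A motionless clip satisfies the kinematic and jerk terms exactly.
still = torch.cat([p[:, :1].repeat(1, 16, 1, 1), torch.zeros(2, 16, 17, 9)], dim=-1)
still_kinematic = kinematic_loss(still, fps=50.0)
still_jerk = jerk_loss(still)
```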
Repository contents
PINN_EncoderBody.py # backbone (StateEmbedding, GraphMix, TemporalBlock, EncoderBody)
PINN_PretrainModel.py # Step 1 model: encoder + residual-p head -> s_hat
PINN_Lossfunction.py # physics-aware pretraining losses
PINN_Training.py # train step, checkpoint save/load
PINN_ModelEvaluation_downstream.py # representation-quality eval (clustering)
PINN_ModelEvaluation_itself4.py # model self-evaluation
PINN_visualization_for_model3.py # 3D skeleton render / input-vs-output compare
Utils.py # skeleton edges, central_diff, masked_mean, etc.
config.json # architecture hyperparameters (edit to match the checkpoint)
export_weights.py # slim a training checkpoint -> release weights
inference_example.py # minimal load + forward example
Note on imports. Modules use flat imports (from Utils import ...). Keep all .py files at the repository root, or add the repo root to PYTHONPATH, before importing.
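If setting PYTHONPATH is inconvenient (for example in a notebook), the same effect can be had at runtime by putting the repository root on `sys.path`. A minimal sketch, where `add_repo_root` is a hypothetical helper and `"."` stands in for wherever the repository was cloned:

```python
import sys
from pathlib import Path

def add_repo_root(repo_root: str) -> str:
    """Put repo_root first on sys.path so the flat imports resolve."""
    root = str(Path(repo_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    return root

# "." is a placeholder for the directory that holds the repository's .py files.
root = add_repo_root(".")
```

Call this once, before the first import of any repository module.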
Usage
import json
import torch
from PINN_EncoderBody import EncoderBody
from PINN_PretrainModel import PINNPretrainModel
# Build the model from the shipped architecture settings.
with open("config.json") as f:
    cfg = json.load(f)
encoder = EncoderBody(**cfg["encoder"])
model = PINNPretrainModel(encoder=encoder, fps=cfg["fps"])
# Load the released weights and switch to inference mode.
state = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(state)
model.eval()
# x: (B, T, J=17, 12) = [p, v, a, j] per joint
x = torch.randn(1, 64, 17, 12)  # placeholder input; replace with real motion states
with torch.no_grad():
    out = model(x)
features = out["token_feat"] # (B, T, J, D) representation
s_hat = out["s_hat"] # (B, T, J, 12) reconstructed state
config.json ships with the architecture defaults. If the released checkpoint was
trained with different settings, edit config.json so the shapes match before loading.
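One way to find out whether config.json matches a checkpoint is to compare parameter shapes before calling `load_state_dict`. The helper below is a hypothetical sketch, not part of the repository, and uses small `nn.Linear` modules as stand-ins for the real model and checkpoint:

```python
import torch
from torch import nn

def shape_mismatches(model: nn.Module, state: dict) -> list:
    """List readable differences between a model and a checkpoint state dict."""
    expected = model.state_dict()
    problems = []
    for name, tensor in expected.items():
        if name not in state:
            problems.append(f"missing in checkpoint: {name}")
        elif tuple(state[name].shape) != tuple(tensor.shape):
            problems.append(
                f"{name}: model {tuple(tensor.shape)} "
                f"vs checkpoint {tuple(state[name].shape)}"
            )
    problems += [f"unexpected in checkpoint: {k}" for k in state if k not in expected]
    return problems

# Stand-ins: a "checkpoint" saved at width 256, then models built at 256 and 128.
checkpoint = nn.Linear(12, 256).state_dict()
ok = shape_mismatches(nn.Linear(12, 256), checkpoint)
bad = shape_mismatches(nn.Linear(12, 128), checkpoint)
```

An empty list means every parameter lines up. Otherwise the reported names indicate which setting (for instance the model width) to change in config.json.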
Intended use
- Self-supervised motion representation learning research
- Feature extraction for downstream pose/motion tasks
- Studying physics-informed regularization for skeletal motion
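For feature extraction, the per-token output of shape (B, T, J, D) usually has to be reduced to one vector per clip. The sketch below uses plain mean pooling over frames and joints, with an optional frame mask for padded clips. Both the pooling choice and the name `pool_clip_features` are assumptions made for illustration; the repository's evaluation scripts may pool differently.

```python
import torch

def pool_clip_features(token_feat, mask=None):
    """Average (B, T, J, D) token features over time and joints -> (B, D).

    mask: optional (B, T) tensor, 1 for valid frames and 0 for padding.
    """
    if mask is None:
        return token_feat.mean(dim=(1, 2))
    w = mask.to(token_feat.dtype)[:, :, None, None]        # (B, T, 1, 1)
    total = (token_feat * w).sum(dim=(1, 2))                # (B, D)
    # Number of valid (frame, joint) tokens per clip, at least one frame's worth.
    count = w.sum(dim=(1, 2)).clamp(min=1.0) * token_feat.shape[2]  # (B, 1)
    return total / count

# Random stand-in for out["token_feat"] with D = 256.
feats = torch.randn(4, 30, 17, 256)
emb = pool_clip_features(feats)
emb_masked = pool_clip_features(feats, mask=torch.ones(4, 30))
```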
Out of scope
- Not a clinical, diagnostic, biometric, or safety-critical tool
- Not trained or validated for person identification or surveillance
- Tuned for the H36M 17-joint topology; other skeletons need adaptation/retraining
Limitations
- Released weights are an older checkpoint and may underperform the internal latest version.
- Assumes a fixed 17-joint topology and a consistent (p, v, a, j) input layout.
- fps at inference should match the value used to build the (v, a, j) channels.
- Evaluation utilities depend on scikit-learn; UMAP is optional.
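A quick sanity check for the layout and fps assumptions is to recompute velocity from the position channel at the intended fps and compare it with the stored v channel. The function below is a hypothetical sketch (it is not shipped with the repository) and only looks at interior frames:

```python
import torch

def velocity_mismatch(x, fps):
    """Relative gap between the stored v channel and a central difference of p.

    x: (B, T, J, 12) laid out as [p, v, a, j]. Only interior frames are compared.
    """
    if x.ndim != 4 or x.shape[2] != 17 or x.shape[3] != 12:
        raise ValueError(f"expected (B, T, 17, 12), got {tuple(x.shape)}")
    p, v = x[..., 0:3], x[..., 3:6]
    v_fd = (p[:, 2:] - p[:, :-2]) * (fps / 2.0)
    gap = (v[:, 1:-1] - v_fd).norm()
    return (gap / v_fd.norm().clamp(min=1e-8)).item()

# Synthetic clip: linear motion sampled at 50 fps, so v is constant at 2.
t = torch.arange(20, dtype=torch.float64).view(1, 20, 1, 1) / 50.0
p = (2.0 * t).repeat(1, 1, 17, 3)
v = torch.full_like(p, 2.0)
x = torch.cat([p, v, torch.zeros_like(p), torch.zeros_like(p)], dim=-1)
right = velocity_mismatch(x, fps=50.0)   # fps matches how v was built
wrong = velocity_mismatch(x, fps=25.0)   # fps is off by a factor of two
```

A value near zero suggests the fps and channel order agree with the data. A large value points to a mismatched fps or a different channel layout.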
License
This release is distributed under the Academic Free License v3.0 (AFL-3.0).
- Source code (*.py): AFL-3.0; see LICENSE.
- Model weights (released checkpoint): AFL-3.0, with the disclaimer below.
The weights are provided "as is", for research and reproducibility, without warranty of
any kind. They are not the latest internal checkpoint and carry no fitness guarantee for
any particular use. See NOTICE for the scope split between code and weights.
Citation
@misc{pinn_jepa_pose,
title = {PINN-JEPA: Physics-Informed Encoder for 3D Human Motion},
author = {<Authors>},
year = {2026},
note = {Research code and weights, AFL-3.0},
howpublished = {\url{https://huggingface.co/<org>/<repo>}}
}