PolicyTrim

Boosting Intrinsic Policy Efficiency of Vision-Language-Action Models


Xianghui Wang*, Feng Chen*, Wenbo Zhang, Hua Yan, Zixuan Wang, Changsheng Li, Yinjie Lei

* Equal contribution · Project lead · Corresponding author

Model Card

This repository provides the released post-training actor checkpoints for PolicyTrim, a two-stage reinforcement learning framework for improving the intrinsic policy efficiency of Vision-Language-Action (VLA) models.

Most deployment-efficiency methods reduce the latency of each model forward pass. PolicyTrim instead reduces how many inference calls and physical actions are required to finish a task. It targets two policy-level bottlenecks:

  1. unreliable predictions near the tail of an action chunk;
  2. redundant physical execution steps and corrective actions.

PolicyTrim first extends the reliable executable action horizon, then applies a redundancy-aware step-saving objective with stability regularization. Across three benchmarks and three VLA model families, the paper reports:

  • 3x improvement in action chunk utilization;
  • 51.4% reduction in physical execution steps;
  • up to 5.83x end-to-end deployment speedup;
  • no compromise in task success rates.
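The relationship between the two bottlenecks and the number of inference calls can be illustrated with a small sketch. It assumes a simplified model in which each inference call produces one action chunk and the policy executes a fixed number of actions from it before replanning. The function name and every number below are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def inference_calls(total_steps: int, executed_per_call: int) -> int:
    """Number of model calls needed to cover `total_steps` physical steps
    when `executed_per_call` actions are executed from each predicted chunk.

    Simplified, hypothetical model: one chunk per call, fixed execution horizon.
    """
    if total_steps < 0 or executed_per_call <= 0:
        raise ValueError("total_steps must be >= 0 and executed_per_call > 0")
    # Integer ceiling division: a partially used final chunk still costs one call.
    return (total_steps + executed_per_call - 1) // executed_per_call


# Illustrative numbers only (not taken from the paper).
baseline = inference_calls(total_steps=300, executed_per_call=5)
trimmed = inference_calls(total_steps=150, executed_per_call=15)
print(f"baseline calls: {baseline}, after trimming: {trimmed}")
```

Under this model, a longer reliable execution horizon and a shorter trajectory both reduce the call count, independently of how fast a single forward pass runs.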

For the method, training code, configuration files, and evaluation scripts, see the PolicyTrim GitHub repository.

Figure: PolicyTrim overview

Resources

Download

Install or upgrade the huggingface_hub package, which provides the hf command-line tool:

pip install -U huggingface_hub

Download the complete repository:

hf download INCEPTIONwang/PolicyTrim \
  --local-dir ./PolicyTrim-checkpoints

The complete repository is large. To download only one checkpoint, specify its path. For example:

hf download INCEPTIONwang/PolicyTrim \
  libero_goal_grpo_openpi_pi05/checkpoints/global_step_500/actor/model_state_dict/full_weights.pt \
  --local-dir ./PolicyTrim-checkpoints

Python equivalent:

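from huggingface_hub import hf_hub_download

# Fetch a single checkpoint file; the return value is the local path to it.
checkpoint_path = hf_hub_download(
    repo_id="INCEPTIONwang/PolicyTrim",
    filename=(
        "libero_goal_grpo_openpi_pi05/checkpoints/global_step_500/"
        "actor/model_state_dict/full_weights.pt"
    ),
    # Mirror the CLI's --local-dir; omit this to use the default Hugging Face cache.
    local_dir="./PolicyTrim-checkpoints",
)
print(checkpoint_path)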
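To fetch everything under one actor directory rather than a single file, a pattern-filtered snapshot is one option. The sketch below is a minimal example, assuming that other runs follow the same `<run>/checkpoints/global_step_<N>/actor/` layout as the example path above; that layout is inferred from a single path and may not hold for every run. The helper names `actor_dir` and `download_actor` are hypothetical.

```python
from pathlib import PurePosixPath

REPO_ID = "INCEPTIONwang/PolicyTrim"


def actor_dir(run_name: str, global_step: int) -> str:
    """Build the repo-relative actor directory for one run and step.

    Assumes the layout seen in the example path; verify it in the repo's file list.
    """
    return str(
        PurePosixPath(run_name, "checkpoints", f"global_step_{global_step}", "actor")
    )


def download_actor(run_name: str, global_step: int,
                   local_dir: str = "./PolicyTrim-checkpoints") -> str:
    """Download only the files under one actor directory; returns the local root."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[f"{actor_dir(run_name, global_step)}/*"],
        local_dir=local_dir,
    )


if __name__ == "__main__":
    print(actor_dir("libero_goal_grpo_openpi_pi05", 500))
    # Uncomment to download (requires network access and disk space):
    # download_actor("libero_goal_grpo_openpi_pi05", 500)
```

Filtering with `allow_patterns` avoids pulling the full repository while keeping the directory structure that the training configuration may expect.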

Loading and Evaluation

Checkpoint restoration depends on the matching VLA backend and distributed training configuration. Follow the setup and evaluation instructions in the GitHub README, then point the corresponding PolicyTrim configuration to the downloaded checkpoint.
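Before wiring a checkpoint into a configuration, a quick sanity check of the downloaded file can catch truncated or mismatched downloads. The sketch below assumes that `full_weights.pt` is a flat PyTorch state dict mapping parameter names to tensors; this is an assumption based on the file name, not something documented here, and the helper names are hypothetical. It does not restore the model, which still requires the matching backend from the GitHub repository.

```python
def summarize_state_dict(state_dict) -> tuple[int, int]:
    """Return (number of entries, total element count) for a name -> tensor mapping."""
    total_elements = sum(tensor.numel() for tensor in state_dict.values())
    return len(state_dict), total_elements


def load_full_weights(path: str):
    """Load the checkpoint on CPU. Assumes a plain state dict of tensors."""
    # Imported lazily so the summary helper can be used without PyTorch installed.
    import torch

    # weights_only=True restricts unpickling to tensors and basic containers.
    return torch.load(path, map_location="cpu", weights_only=True)


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        # Usage: python check_weights.py /path/to/full_weights.pt
        entries, elements = summarize_state_dict(load_full_weights(sys.argv[1]))
        print(f"{entries} tensors, {elements:,} parameters")
```

If loading fails with `weights_only=True`, the file likely contains more than plain tensors, and the loading path in the PolicyTrim code should be used instead.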

License

The released materials are provided under the Apache License 2.0. Users are also responsible for complying with the licenses and terms of the corresponding base VLA models, datasets, and simulation environments.

Citation

If you find PolicyTrim useful, please cite:

@inproceedings{policytrim2026,
  title     = {PolicyTrim: Boosting Intrinsic Policy Efficiency of Vision-Language-Action Models},
  author    = {Xianghui Wang and Feng Chen and Wenbo Zhang and Hua Yan and Zixuan Wang and Changsheng Li and Yinjie Lei},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2026}
}