PyTorch RCE Vulnerability Actively Exploited: CVE-2026-24747

Machine learning pipelines increasingly power production-critical systems—from fraud detection and recommendation engines to autonomous systems and security analytics. But new research has uncovered a critical PyTorch vulnerability that turns a routine ML workflow into a remote code execution (RCE) vector.

Tracked as CVE-2026-24747, this flaw affects PyTorch’s checkpoint loading mechanism, allowing attackers to execute arbitrary code simply by convincing a user or system to load a malicious model file.

With a CVSS v3 score of 9.8, the vulnerability represents a near-worst-case scenario across confidentiality, integrity, and availability.

In this article, we’ll cover:

  • What CVE-2026-24747 is and why it’s so dangerous
  • How PyTorch checkpoint loading works—and fails
  • The technical root causes behind memory corruption
  • Realistic attack scenarios in ML environments
  • Concrete mitigation steps for security and ML teams

What Is CVE-2026-24747?

Vulnerability Summary

CVE-2026-24747 is a critical memory corruption vulnerability in PyTorch that can lead to remote code execution when loading malicious checkpoint files (.pth).

The issue resides in PyTorch’s weights_only unpickler, a feature designed to safely deserialize model weights by limiting Python pickle operations.

Despite these safeguards, researchers found that the unpickler fails to properly validate pickle opcodes and storage metadata, enabling attackers to bypass protections entirely.
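
To see what the weights_only safeguard normally catches, consider a minimal sketch (the Evil class and its payload are illustrative, not taken from the advisory). A classic pickle attack smuggles a callable through __reduce__, and the restricted unpickler rejects it because the callable is not on its allowlist. CVE-2026-24747 matters precisely because it sidesteps this allowlist, attacking the opcode and storage-metadata handling underneath it:

    import io
    import pickle

    import torch

    class Evil:
        # Classic pickle trick: __reduce__ tells the unpickler to call an
        # arbitrary function (here just print) during deserialization.
        def __reduce__(self):
            return (print, ("arbitrary code ran",))

    buf = io.BytesIO()
    torch.save(Evil(), buf)
    buf.seek(0)

    try:
        torch.load(buf, weights_only=True)
    except pickle.UnpicklingError as exc:
        # The allowlist blocks this naive payload; the CVE bypasses the
        # checks at a lower level instead.
        print(f"Blocked as expected: {exc}")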


Affected PyTorch Versions

The vulnerability impacts:

  • PyTorch 2.9.1 and earlier

The issue has been patched in PyTorch 2.10.0 and later, where additional validation and safety checks have been implemented.


Why This Vulnerability Is So Severe

CVSS Breakdown

  • Severity: Critical
  • CVSS v3 Score: 9.8
  • Attack Vector: Network
  • Attack Complexity: Low
  • Privileges Required: None
  • User Interaction: Required

The only prerequisite is that a victim loads a malicious checkpoint file—a common and often automated operation in ML workflows.

Once triggered, the malicious payload executes with the same privileges as the PyTorch process, which in many environments means full access to:

  • GPU resources
  • Training data
  • Model artifacts
  • Cloud credentials
  • Production infrastructure

How PyTorch Checkpoint Loading Works

The Role of Pickle in PyTorch

PyTorch uses Python’s pickle serialization format to store and load model checkpoints. These checkpoints often include:

  • Model weights
  • Optimizer states
  • Training metadata

Because pickle can execute arbitrary code during deserialization, it is unsafe for untrusted input. PyTorch introduced the weights_only=True option to limit deserialization to what should be “safe” operations.
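
A minimal, ordinary round-trip looks like this (the model and file name are illustrative):

    import torch
    import torch.nn as nn

    model = nn.Linear(4, 2)

    # torch.save pickles the whole object graph: tensors, dicts, metadata.
    torch.save({"model_state": model.state_dict(), "epoch": 3}, "ckpt.pth")

    # weights_only=True was meant to make loading untrusted files safe by
    # restricting the unpickler to tensors and a few primitive types.
    state = torch.load("ckpt.pth", weights_only=True)
    model.load_state_dict(state["model_state"])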

Unfortunately, this assumption proved flawed.


Root Cause: Where the Security Model Breaks

The vulnerability arises from inadequate validation in two key areas of the unpickling process.

1. Unsafe Pickle Opcode Handling

Attackers can abuse the SETITEM and SETITEMS pickle opcodes by applying them to non-dictionary objects.

This leads to:

  • Heap memory corruption
  • Unexpected memory writes
  • Overwriting of internal object structures
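
To make those opcodes concrete, Python's standard pickletools module can disassemble the pickle stream of an ordinary dictionary. SETITEM and SETITEMS store key/value pairs into the object on the unpickler's stack, and the vulnerable code path trusts that this object is actually a dictionary:

    import pickle
    import pickletools

    # Disassemble the byte stream for a normal two-item dict.
    payload = pickle.dumps({"weight": 1, "bias": 2})
    pickletools.dis(payload)
    # The output includes EMPTY_DICT ... MARK ... SETITEMS ... STOP.
    # SETITEMS pops key/value pairs off the stack and stores them into the
    # object below the MARK; a crafted stream can arrange for that object
    # not to be a dict at all, producing the corruption described above.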

2. Storage Metadata Mismatches

Malicious checkpoint files can declare:

  • A storage element count that does not match the actual data size

This discrepancy allows:

  • Writes beyond allocated memory boundaries
  • Corruption of adjacent memory regions

Together, these issues create a reliable pathway to arbitrary code execution during deserialization.
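
For intuition on the storage side: a modern .pth file is a zip archive containing a pickled object graph (data.pkl) plus one raw byte blob per tensor storage. A coarse sanity check, sketched below with a hypothetical file name, is to list what the archive actually contains; it does not replace patching, but it makes gross size mismatches visible:

    import zipfile

    # Typical members are "<name>/data.pkl" plus "<name>/data/0",
    # "<name>/data/1", ..., where each data/N holds the raw bytes for
    # one tensor storage.
    with zipfile.ZipFile("model.pth") as archive:
        for info in archive.infolist():
            print(f"{info.filename}: {info.file_size} bytes")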


Exploitation Flow: Step by Step

  1. Attacker crafts a malicious .pth checkpoint
  2. The file is distributed via:
    • Public model repositories
    • Shared research artifacts
    • Compromised internal pipelines
  3. Victim loads the model using: torch.load("model.pth", weights_only=True)
  4. The unpickler processes the malicious payload
  5. Memory corruption occurs
  6. Attacker-controlled code executes inside the PyTorch process

No sandbox escape.
No exploit chain.
Just a poisoned model file.


Real-World Attack Scenarios

Supply Chain Attacks on ML Models

This vulnerability is especially dangerous in ecosystems where teams routinely download pre-trained models from:

  • Open-source repositories
  • Research communities
  • Third-party vendors

A single compromised model can infect:

  • CI/CD pipelines
  • Training clusters
  • Production inference services

Multi-Tenant and Cloud ML Environments

In shared GPU clusters or cloud ML platforms, RCE can enable attackers to:

  • Access other tenants’ data
  • Steal proprietary models
  • Pivot into broader cloud infrastructure

Why This Is an ML Security Wake-Up Call

Unlike traditional application vulnerabilities, ML workflows blur the line between data and code.

Model files are often treated as:

  • Static artifacts
  • Trusted research outputs

CVE-2026-24747 demonstrates that model loading is code execution—and must be treated with the same rigor as running third-party software.


Detection and Incident Response Challenges

Why This Is Hard to Catch

  • Malicious checkpoints look like legitimate model files
  • Exploitation occurs during normal load operations
  • No obvious indicators at the network or OS level

Without file integrity controls or strict provenance tracking, an attack can go unnoticed until well after the initial compromise.


Mitigation and Best Practices

1. Upgrade PyTorch Immediately

Primary mitigation:

  • Upgrade to PyTorch 2.10.0 or later

This version includes:

  • Proper pickle opcode validation
  • Storage metadata integrity checks
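
A quick guard for CI or service startup, sketched with the third-party packaging library (the version threshold mirrors the advisory above):

    import torch
    from packaging import version

    # Fail fast if the running PyTorch build predates the fix.
    if version.parse(torch.__version__) < version.parse("2.10.0"):
        raise RuntimeError(
            f"PyTorch {torch.__version__} is affected by CVE-2026-24747; "
            "upgrade to 2.10.0 or later before loading external checkpoints."
        )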

2. Treat Model Files as Untrusted Input

  • Never load checkpoints from unknown sources
  • Validate model provenance and integrity
  • Use cryptographic signing or hashing (a minimal hashing sketch follows this list)
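
For example, a minimal integrity gate before any torch.load call; the KNOWN_GOOD mapping is a placeholder for whatever signed manifest or trusted registry your pipeline uses:

    import hashlib

    def sha256_of(path: str) -> str:
        # Stream the file so multi-gigabyte checkpoints need not fit in RAM.
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # Hypothetical allowlist; populate it from a signed manifest in practice.
    KNOWN_GOOD = {"model.pth": "<published sha256 digest>"}

    path = "model.pth"
    if sha256_of(path) != KNOWN_GOOD.get(path):
        raise RuntimeError(f"Integrity check failed for {path}; refusing to load.")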

3. Harden Model Distribution Pipelines

  • Restrict who can publish models
  • Enforce approval workflows
  • Scan model artifacts before promotion to production

4. Apply Network and Runtime Controls

  • Block outbound connections from training jobs where possible
  • Run PyTorch processes with least privilege
  • Use container isolation and runtime monitoring (see the sketch after this list)
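
One small building block, assuming it is wrapped in real container or seccomp isolation: load untrusted checkpoints in a short-lived worker process rather than in the long-running service. A process boundary alone does not stop a determined code-execution payload, but it contains crashes and lets the worker run with reduced privileges:

    import subprocess
    import sys

    # Minimal loader script executed in a throwaway interpreter.
    LOADER = (
        "import sys, torch\n"
        "torch.load(sys.argv[1], weights_only=True)\n"
        "print('loaded ok')\n"
    )

    result = subprocess.run(
        [sys.executable, "-c", LOADER, "model.pth"],
        capture_output=True,
        text=True,
        timeout=120,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"Checkpoint load failed in worker: {result.stderr.strip()}"
        )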

Compliance and Risk Implications

For regulated industries, this vulnerability introduces:

  • Supply chain risk
  • Data breach exposure
  • Risk of model theft
  • Audit and compliance failures

An exploited ML pipeline can undermine both security posture and intellectual property protections.


Frequently Asked Questions (FAQs)

What is CVE-2026-24747?

A critical PyTorch vulnerability that enables remote code execution via malicious checkpoint files.

Which PyTorch versions are affected?

All versions up to and including 2.9.1.

Is weights_only=True safe?

Not in vulnerable versions. The restriction can be bypassed using crafted pickle payloads.

Is this attack remote?

Yes. Attackers can distribute malicious models over the network via repositories or shared pipelines.

How do I mitigate this risk?

Upgrade to PyTorch 2.10.0+, restrict model sources, and secure model distribution workflows.


Conclusion

CVE-2026-24747 is one of the most serious PyTorch vulnerabilities to date, exposing a fundamental weakness in how machine learning systems handle untrusted model artifacts.

The combination of:

  • Low attack complexity
  • Network-based distribution
  • Full remote code execution

makes this vulnerability especially dangerous for research, enterprise, and production ML environments.

Security teams and ML engineers must work together to ensure:

  • PyTorch deployments are fully patched
  • Model files are treated as executable input
  • ML pipelines follow zero-trust principles

In modern AI systems, models are code—and code must be secured.
