Machine learning pipelines increasingly power production-critical systems—from fraud detection and recommendation engines to autonomous systems and security analytics. But new research has uncovered a critical PyTorch vulnerability that turns a routine ML workflow into a remote code execution (RCE) vector.
Tracked as CVE-2026-24747, this flaw affects PyTorch’s checkpoint loading mechanism, allowing attackers to execute arbitrary code simply by convincing a user or system to load a malicious model file.
With a CVSS v3 score of 9.8, the vulnerability represents a near-worst-case scenario across confidentiality, integrity, and availability.
In this article, we’ll cover:
- What CVE-2026-24747 is and why it’s so dangerous
- How PyTorch checkpoint loading works—and fails
- The technical root causes behind memory corruption
- Realistic attack scenarios in ML environments
- Concrete mitigation steps for security and ML teams
What Is CVE-2026-24747?
Vulnerability Summary
CVE-2026-24747 is a critical memory corruption vulnerability in PyTorch that can lead to remote code execution when loading malicious checkpoint files (.pth).
The issue resides in PyTorch’s weights_only unpickler, a feature designed to safely deserialize model weights by limiting Python pickle operations.
Despite these safeguards, researchers found that the unpickler fails to properly validate pickle opcodes and storage metadata, enabling attackers to bypass protections entirely.
Affected PyTorch Versions
The vulnerability impacts PyTorch 2.9.1 and earlier.
The issue has been patched in PyTorch 2.10.0 and later, where additional validation and safety checks have been implemented.
Why This Vulnerability Is So Severe
CVSS Breakdown
- Severity: Critical
- CVSS v3 Score: 9.8
- Attack Vector: Network
- Attack Complexity: Low
- Privileges Required: None
- User Interaction: Required
The only prerequisite is that a victim loads a malicious checkpoint file—a common and often automated operation in ML workflows.
Once triggered, the malicious payload executes with the same privileges as the PyTorch process, which in many environments means full access to:
- GPU resources
- Training data
- Model artifacts
- Cloud credentials
- Production infrastructure
How PyTorch Checkpoint Loading Works
The Role of Pickle in PyTorch
PyTorch uses Python’s pickle serialization format to store and load model checkpoints. These checkpoints often include:
- Model weights
- Optimizer states
- Training metadata
Because pickle is inherently unsafe, PyTorch introduced the weights_only=True option to limit deserialization to what should be “safe” operations.
Unfortunately, this assumption proved flawed.
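To ground the discussion, here is a minimal sketch of the routine save/load cycle described above; the model, optimizer, and file name are illustrative stand-ins, not taken from the advisory:

import torch
import torch.nn as nn

# A toy model; its state_dict is what gets pickled into the checkpoint.
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Checkpoints commonly bundle weights, optimizer state, and metadata.
checkpoint = {
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
    "epoch": 10,
}
torch.save(checkpoint, "model.pth")

# weights_only=True restricts the unpickler to tensors, primitive types,
# and plain containers -- the safeguard this vulnerability bypasses.
restored = torch.load("model.pth", weights_only=True)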
Root Cause: Where the Security Model Breaks
The vulnerability arises from inadequate validation in two key areas of the unpickling process.
1. Unsafe Pickle Opcode Handling
Attackers can abuse the SETITEM and SETITEMS pickle opcodes by applying them to non-dictionary objects; a benign illustration of these opcodes follows the list below.
This leads to:
- Heap memory corruption
- Unexpected memory writes
- Overwriting of internal object structures
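For context on what these opcodes normally do, the snippet below uses the standard library's pickletools to disassemble an ordinary dictionary pickle; the exploit details themselves are not reproduced here:

import pickle
import pickletools

# In a well-formed stream, SETITEMS fills the dict created by EMPTY_DICT.
payload = pickle.dumps({"weights": [1.0, 2.0], "epoch": 3}, protocol=2)
pickletools.dis(payload)
# The disassembly shows EMPTY_DICT, MARK, the key/value pushes, then
# SETITEMS. The flaw described above: a crafted stream can aim
# SETITEM(S) at a stack object that is not a dict, so the native
# unpickler corrupts memory instead of raising a clean error.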
2. Storage Metadata Mismatches
Malicious checkpoint files can declare a storage element count that does not match the actual size of the serialized data.
This discrepancy allows:
- Writes beyond allocated memory boundaries
- Corruption of adjacent memory regions
Together, these issues create a reliable pathway to arbitrary code execution during deserialization.
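To see where that metadata lives, the sketch below lists a checkpoint's on-disk layout, assuming the zip-based format PyTorch has used since version 1.6 (the file name carries over from the earlier example):

import zipfile

# A modern .pth checkpoint is a zip archive: a data.pkl pickle stream
# plus one raw blob per tensor storage. The pickle stream declares each
# storage's dtype and element count; per the advisory, a crafted file
# can declare a count that disagrees with the blob's real byte size,
# driving writes past the allocated buffer.
with zipfile.ZipFile("model.pth") as zf:
    for info in zf.infolist():
        print(f"{info.filename}: {info.file_size} bytes")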
Exploitation Flow: Step by Step
1. Attacker crafts a malicious .pth checkpoint
2. The file is distributed via:
- Public model repositories
- Shared research artifacts
- Compromised internal pipelines
3. Victim loads the model using torch.load("model.pth", weights_only=True)
4. The unpickler processes the malicious payload
5. Memory corruption occurs
6. Attacker-controlled code executes inside the PyTorch process
No sandbox escape.
No exploit chain.
Just a poisoned model file.
Real-World Attack Scenarios
Supply Chain Attacks on ML Models
This vulnerability is especially dangerous in ecosystems where teams routinely download pre-trained models from:
- Open-source repositories
- Research communities
- Third-party vendors
A single compromised model can infect:
- CI/CD pipelines
- Training clusters
- Production inference services
Multi-Tenant and Cloud ML Environments
In shared GPU clusters or cloud ML platforms, RCE can enable attackers to:
- Access other tenants’ data
- Steal proprietary models
- Pivot into broader cloud infrastructure
Why This Is an ML Security Wake-Up Call
Unlike traditional application vulnerabilities, ML workflows blur the line between data and code.
Model files are often treated as:
- Static artifacts
- Trusted research outputs
CVE-2026-24747 demonstrates that model loading is code execution—and must be treated with the same rigor as running third-party software.
Detection and Incident Response Challenges
Why This Is Hard to Catch
- Malicious checkpoints look like legitimate model files
- Exploitation occurs during normal load operations
- No obvious indicators at the network or OS level
Without file integrity controls or strict provenance tracking, attacks can go unnoticed until after compromise.
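One lightweight triage aid, offered as a heuristic sketch rather than a detection guarantee, is to examine a checkpoint's pickle opcode mix before loading it (the zip layout and the data.pkl entry name are assumptions about the standard format):

import pickletools
import zipfile
from collections import Counter

# Count the opcodes in the checkpoint's embedded pickle stream without
# executing it. A plain state_dict yields a predictable mix; a stream
# with an unusual opcode profile merits manual review before loading.
def opcode_histogram(path: str) -> Counter:
    with zipfile.ZipFile(path) as zf:
        pkl = next(n for n in zf.namelist() if n.endswith("data.pkl"))
        return Counter(op.name for op, _, _ in pickletools.genops(zf.read(pkl)))

print(opcode_histogram("model.pth").most_common(10))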
Mitigation and Best Practices
1. Upgrade PyTorch Immediately
The primary mitigation is to upgrade to PyTorch 2.10.0 or later; a runtime version guard is sketched after the list below.
This version includes:
- Proper pickle opcode validation
- Storage metadata integrity checks
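A minimal runtime guard, assuming 2.10.0 as the first patched release, might look like this:

import torch

# Fail fast if this environment still runs a vulnerable PyTorch build.
# Local builds may carry suffixes such as "+cu121", so only the leading
# version components are compared.
major, minor = (int(part) for part in torch.__version__.split(".")[:2])
if (major, minor) < (2, 10):
    raise RuntimeError(
        f"PyTorch {torch.__version__} is vulnerable to CVE-2026-24747; "
        "upgrade to 2.10.0 or later before loading any checkpoints"
    )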
2. Treat Model Files as Untrusted Input
- Never load checkpoints from unknown sources
- Validate model provenance and integrity
- Use cryptographic signing or hashing (a minimal sketch follows)
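The integrity check below is a sketch; EXPECTED_SHA256 is a hypothetical pinned digest that would come from your own model registry or release process:

import hashlib

# Hypothetical pinned digest recorded when the model was published.
EXPECTED_SHA256 = "<digest recorded at publication time>"

def sha256_of(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

if sha256_of("model.pth") != EXPECTED_SHA256:
    raise RuntimeError("Checkpoint failed integrity check; refusing to load")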
3. Harden Model Distribution Pipelines
- Restrict who can publish models
- Enforce approval workflows
- Scan model artifacts before promotion to production
4. Apply Network and Runtime Controls
- Block outbound connections from training jobs where possible
- Run PyTorch processes with least privilege
- Use container isolation and runtime monitoring
Compliance and Risk Implications
For regulated industries, this vulnerability introduces:
- Supply chain risk
- Data breach exposure
- Model theft
- Audit and compliance failures
An exploited ML pipeline can undermine both security posture and intellectual property protections.
Frequently Asked Questions (FAQs)
What is CVE-2026-24747?
A critical PyTorch vulnerability that enables remote code execution via malicious checkpoint files.
Which PyTorch versions are affected?
All versions up to and including 2.9.1.
Is weights_only=True safe?
Not in vulnerable versions. The restriction can be bypassed using crafted pickle payloads.
Is this attack remote?
Yes. Attackers can distribute malicious models over the network via repositories or shared pipelines.
How do I mitigate this risk?
Upgrade to PyTorch 2.10.0+, restrict model sources, and secure model distribution workflows.
Conclusion
CVE-2026-24747 is one of the most serious PyTorch vulnerabilities to date, exposing a fundamental weakness in how machine learning systems handle untrusted model artifacts.
The combination of:
- Low attack complexity
- Network-based distribution
- Full remote code execution
makes this vulnerability especially dangerous for research, enterprise, and production ML environments.
Security teams and ML engineers must work together to ensure:
- PyTorch deployments are fully patched
- Model files are treated as executable input
- ML pipelines follow zero-trust principles
In modern AI systems, models are code—and code must be secured.