Critical Anthropic Sandbox Bug Allowed Quiet Code Theft

AI firm Anthropic is facing intense scrutiny from the cybersecurity community after quietly patching a second major security bypass in its Claude Code network sandbox without notifying its users.

Security researchers argue that the company’s decision to withhold public advisories or CVEs (Common Vulnerabilities and Exposures) pushes the heavy burden of data security directly onto unsuspecting developers. While Anthropic frequently boasts that its internal bug-hunting AI tools can uncover hundreds of vulnerabilities, the self-styled ethical frontier LLM maker appears remarkably quiet when it comes to acknowledging architectural flaws in its own commercial products.

How the “Null-Byte” Sandbox Bypass Works

The flaw was discovered and reported by Aonan Guan, the lead of cloud and AI security at Wyze Labs. It marks the second major sandbox vulnerability Guan has unearthed in Anthropic’s terminal-based coding assistant over a six-month period.

The security flaw targets the network sandbox, which acts as a security cage intended to prevent AI-generated or executed code from interacting with unapproved external servers.

[Attacker Hostname] ---> attacker-host.com\x00.google.com

1. Claude Sandbox Proxy checks suffix (*.google.com)   ---> [APPROVED]
2. Operating System resolves string (stops at \x00)     ---> [DIALS attacker-host.com]

This structural mismatch meant a malicious actor could completely bypass a company’s outbound network restrictions, establishing a silent backend channel to an attacker-controlled server.

The Ultimate Nightmare: Chaining with Prompt Injection

According to Guan, this sandbox vulnerability becomes weaponized when paired with indirect prompt injection—a vector he previously exposed in a research paper titled Comment and Control.

Because Claude Code actively reads external files to assist developers, an attacker does not need direct access to a local machine to trigger an exploit. Instead, they can hide malicious, invisible instructions in everyday developer environments:

A comment on a public GitHub issue
A project README file
An online documentation or product page

When Claude Code parses the compromised page, the hidden injection forces the tool to run attacker-controlled code inside the sandbox. Because the network sandbox was broken, the rogue code could silently exfiltrate highly sensitive data, including:

Local cloud provider metadata and internal enterprise APIs
Development machine environment variables and source code
The actual GitHub authentication tokens Claude uses to access private repositories

Impact Scope: The vulnerability was live for roughly 5.5 months, affecting every single version of Claude Code released from v2.0.24 through v2.1.89—spanning approximately 130 published builds.

A “Painful” History of Silent Patches

Guan officially reported the zero-day exploit to Anthropic via the HackerOne bug bounty platform on April 3. The response from the AI vendor was swift but dismissive.

The following day, Anthropic closed the ticket as a “duplicate of an internal finding,” stating they had already caught and mitigated the bug on their end. When Guan pressed for public transparency, the firm stated it had “not yet decided” if a formal CVE would be assigned, refusing to provide a release timeline.

Issue	Vulnerability Type	Public CVE Assigned?	User Notification Method
Finding 1	Empty domain array allowed all traffic	Yes (`CVE-2025-66479`)	None. Assigned to an obscure backend runtime library.
Finding 2	SOCKS5 Hostname Null-Byte Injection	No	None. Handled as a silent code commit.

To date, Anthropic has provided no public advisory, no security note in their changelogs, and zero direct user outreach via email or dashboard warnings. Organizations running vulnerable deployments have no explicit way of knowing their perimeter was broken.

The False Sense of AI Security

This pattern of secrecy highlights a broader, troubling trend among frontier AI vendors. Guan, alongside researchers from Johns Hopkins University, recently demonstrated how they successfully hijacked three popular GitHub-integrated AI agents. While tech giants like Anthropic, Google, and Microsoft paid out modest bug bounties, none assigned CVEs or issued public warnings.

This lack of transparency distorts enterprise risk assessments, leaving companies to trust boundaries that don’t actually exist. As Guan notes regarding the danger of unmapped flaws in AI safety guardrails:

“Shipping a sandbox with a hole is worse than not shipping one. The user with no sandbox knows they have no boundary. The user with a broken sandbox thinks they do.”

Remediation Note: If you deploy Claude Code within your engineering pipelines, manually check your environment (claude --version). Security teams must ensure all clients are upgraded to v2.1.90 or later to close the null-byte injection vector, and aggressively audit outbound network traffic logs for unauthorized SOCKS5 data transfers.

This Crucial Anthropic Bug Lets Criminals Steal Code: Inside the Silent Claude Code Sandbox Bypass

How the “Null-Byte” Sandbox Bypass Works

The Ultimate Nightmare: Chaining with Prompt Injection

A “Painful” History of Silent Patches

The False Sense of AI Security

Leave a Reply Cancel reply