AI-powered agents are increasingly relied upon to execute tasks like code analysis, file management, and automating workflows.
However, a newly highlighted vulnerability, argument injection, shows how attackers can abuse these very capabilities to achieve remote code execution (RCE), even when certain safeguards are in place.
| CVE ID | Product | Vulnerability |
|---|---|---|
| CVE-2025-54795 | Claude Code | Command injection in CLI agents |
How AI Agents Become Dangerous
AI agents often use pre-approved system tools, such as find, grep, and git, for efficiency and reliability.
These tools are considered “safe” and can run with minimal oversight. Developers avoid reinventing the wheel and gain performance and stability by leveraging thoroughly tested utilities.
But this design introduces a critical risk: argument injection. If user-controlled input is passed directly as a command argument without strict validation, attackers can inject malicious flags or options, turning “safe” commands into a vehicle for exploitation.
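To make the failure mode concrete, here is a minimal Python sketch of the vulnerable pattern. The wrapper, allowlist, and payload path are hypothetical, not code from any audited system:

```python
import subprocess

ALLOWED = {"rg", "grep", "find"}   # the agent's "safe" tool list (illustrative)

def run_tool(tool: str, *user_args: str) -> str:
    """Naive wrapper: the binary is allowlisted, but user-supplied
    arguments are passed through verbatim."""
    if tool not in ALLOWED:
        raise PermissionError(f"{tool} is not allowlisted")
    # shell=False (the default) blocks shell metacharacters, but it does
    # nothing about option injection: anything starting with "-" is still
    # interpreted by the tool itself.
    result = subprocess.run([tool, *user_args], capture_output=True, text=True)
    return result.stdout

# Intended use: search for a string.
#   run_tool("rg", "TODO", "src/")
# Injected use: the "arguments" tell ripgrep to run bash against a
# hypothetical payload file as a preprocessor, executing it.
#   run_tool("rg", "--pre", "bash", "TODO", "/tmp/payload.sh")
```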
In several production agent systems under review, attackers achieved RCE through clever prompt injections:
- Go Test Exploit: Attackers used go test's -exec flag to make the agent launch curl and bash, even though those programs were not on the “safe” list. Because go test itself was allowed, smuggling the launcher in through one of its own flags bypassed the restrictions completely (see the sketch after this list).
- Complex Chaining: Another system allowed git show and ripgrep (rg). Attackers chained these tools, directing git show to drop a payload file to disk and then using rg --pre bash to execute it, all from a single prompt.
- Facade Pattern Bypasses: Some agents use facade handlers that validate input before execution, but if these facades append raw user input without strict separators, attackers can slip in executable code, for example with fd -x=python3.
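For concreteness, the first two vectors reduce to argument vectors like those below. The paths, payload names, and the use of git show's --output diff option are illustrative assumptions, not details taken from the disclosed reports:

```python
# Hypothetical argv vectors; file names and repository contents are made up.

# Go Test vector: -exec names the program used to run the compiled test
# binary, so an allowlisted "go test" becomes a generic launcher.
go_vector = ["go", "test", "-exec", "/tmp/launcher.sh", "./..."]

# Chaining vector, step 1: git show's --output diff option writes
# attacker-controlled blob content to a file on disk.
drop_payload = ["git", "show", "--output=/tmp/payload.sh", "HEAD:notes.txt"]

# Chaining vector, step 2: rg's --pre flag runs "bash /tmp/payload.sh"
# as a preprocessor before searching the file, executing the payload.
run_payload = ["rg", "--pre", "bash", "anything", "/tmp/payload.sh"]
```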
These attacks demonstrate that even sophisticated safety models, such as requiring human approval for risky commands, can fall short if argument injection isn’t tightly controlled.
The vulnerability is especially dangerous because it only takes a single cleverly-designed prompt to achieve RCE.
Experts recommend isolating AI agent operations from the main system using sandboxes, such as containers (Docker), WebAssembly, or platform-specific sandbox tools. This contains a compromised agent and keeps the damage from reaching the host.
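As a rough illustration of the container approach, an agent's tool executions can be proxied through a disposable, locked-down container. The image name, mounts, and resource limits below are assumptions to adapt, not a recommended baseline:

```python
import subprocess

def run_in_sandbox(argv: list[str], workdir: str) -> subprocess.CompletedProcess:
    """Run one agent tool command inside a throwaway, locked-down container."""
    docker_cmd = [
        "docker", "run", "--rm",
        "--network", "none",            # no outbound network for exfiltration
        "--read-only",                  # immutable root filesystem
        "--cap-drop", "ALL",            # drop all Linux capabilities
        "--pids-limit", "128",          # curb fork bombs
        "--memory", "512m",
        "-v", f"{workdir}:/work:ro",    # project mounted read-only
        "-w", "/work",
        "agent-tools:latest",           # hypothetical image with rg, git, find
    ] + argv
    return subprocess.run(docker_cmd, capture_output=True, text=True, timeout=60)

# Example: run_in_sandbox(["rg", "--", "TODO", "."], "/home/dev/project")
```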
If sandboxing isn’t possible, developers should:
- Implement facades that always separate user input with "--" so it can never be parsed as a flag (e.g., cmd = ["rg", "--", user_input]); a combined sketch follows this list.
- Disable shell execution entirely by invoking tools with argument lists and shell=False, never with command strings.
- Maintain the strictest possible allowlists, and review every entry against resources like GTFOBins and LOLBAS for known abuse primitives.
- Audit command execution paths for argument injection opportunities.
- Log all executions and flag suspicious sequences for review.
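Put together, a hardened facade for the non-sandboxed case might look roughly like this. The allowlist contents, timeout, and logger name are illustrative choices, not requirements:

```python
import logging
import shlex
import subprocess

ALLOWLIST = {"rg", "fd", "grep"}          # keep as small as the agent can tolerate
log = logging.getLogger("agent.exec")     # hypothetical logger name

def safe_run(binary: str, *user_args: str) -> str:
    """Facade that treats user-supplied arguments strictly as data, never as flags."""
    if binary not in ALLOWLIST:
        raise PermissionError(f"{binary} is not allowlisted")
    # "--" ends option parsing: everything after it is a positional argument.
    argv = [binary, "--", *user_args]
    log.info("exec %s", shlex.join(argv))  # audit trail for later review
    # A list argv with shell left at its default (False) keeps metacharacters inert.
    result = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    return result.stdout

# safe_run("rg", "--pre", "bash", ".", "/tmp/payload.sh")
# -> ripgrep now treats "--pre" as a literal search pattern, not an option.
```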
Users should be wary of granting broad system access to AI agents, especially in environments with sensitive data.
Using containers or sandboxes, even on local machines, can limit the fallout from a successful attack.
Security engineers are encouraged to hunt for these flaws by examining agent command lists, reviewing code, and fuzzing for unexpected flag handling.
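One lightweight way to hunt for the flaw is a test that feeds flag-shaped inputs through the agent's command builder and asserts they can never land in option position. build_argv below is a stand-in for whatever builder the agent actually uses:

```python
# Flag-shaped payloads modeled on the abuse patterns described above.
SUSPICIOUS_INPUTS = ["--pre=/tmp/x", "-exec", "-x=python3", "--output=/tmp/x"]

def build_argv(binary: str, user_input: str) -> list[str]:
    """Stand-in for the agent's real command builder."""
    return [binary, "--", user_input]

def test_flag_like_input_is_neutralized():
    for payload in SUSPICIOUS_INPUTS:
        argv = build_argv("rg", payload)
        # Every user-controlled element must appear after the "--" separator.
        assert argv.index("--") < argv.index(payload), payload
```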
As AI agents become more capable, securing their command execution paths is critical. Argument injection, a classic bug in a modern context, shows that even with next-generation technology, the basics of input validation and isolation matter as much as ever.