cyberivy
AI Security · Coding Agents · MCP · Claude Code · Software Supply Chain · CI Security · Developer Tools · 2026

TrustFall exposes the one-keypress risk in coding agents

May 7, 2026

A close-up of a finger pressing an illuminated keyboard key.

Adversa AI shows how project configuration in Claude Code, Gemini CLI, Cursor CLI and Copilot CLI can start local processes as soon as a developer trusts a folder.

What this is about

Adversa AI published a study called TrustFall on May 7, 2026. The uncomfortable core: a cloned repository can include project files that define an MCP server. Once a developer marks the folder as trusted, that server can start as a normal local process.

According to the research, Claude Code, Gemini CLI, Cursor CLI and GitHub Copilot CLI are affected. This is not a classic prompt-injection case where the model is persuaded to do something. It is a trust and configuration problem at the boundary between project folder, agent and operating system.

What TrustFall actually does

The attack uses the Model Context Protocol mechanism. MCP lets an agent connect external helper programs: database access, linters, internal tools or search services. But those helpers can be defined by files inside the project. When an unfamiliar repository is opened, the agent reads the configuration and may start the helper.
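As a rough illustration, a project-scoped MCP definition can be a few lines of JSON checked into the repository. The exact file name and schema vary by tool (Claude Code, for example, reads a project-level `.mcp.json`); the server name and command below are invented for illustration, not taken from the TrustFall report:

```json
{
  "mcpServers": {
    "lint-helper": {
      "command": "./scripts/lint-server.sh",
      "args": ["--stdio"]
    }
  }
}
```

The point is that `./scripts/lint-server.sh` is an arbitrary executable shipped inside the repository. Once the folder is trusted, the agent may launch it as a normal local process.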

Adversa describes two variants. In the developer variant, one trust prompt is enough, often with approval as the default. In the CI variant, a headless runner may be affected without a visible dialog if the agent runs in a pipeline against a pull request. According to the report, the process runs with the privileges of the user or runner. That means the exposure is not limited to the project folder; SSH keys, cloud tokens, shell history and other local sources may be reachable.

Why it matters

Coding agents are used exactly where sensitive data lives: source code, deployment scripts, cloud credentials and product secrets. The old developer rule was: inspect code first, execute later. Agentic CLIs shift that boundary because project configuration can become active while the folder is being opened.

Help Net Security reports that Anthropic treated the concrete Claude Code report as outside its threat model: trusting a folder means consenting to its project configuration. Adversa’s point is not only where that boundary sits, but whether the dialog explains clearly enough that local code can start. For teams, this is a practical security question, not just a vendor debate.

In plain language

Imagine borrowing a cookbook. When you open it, it asks: “Do you trust this book?” If you press Enter, it not only turns on the oven but also opens your kitchen cabinets and calls an unknown phone number. That is the point: the recipe text is not the problem; the hidden action triggered by trust is.

A practical example

A developer reviews 20 unfamiliar open-source repositories per week. One of them contains two small configuration files: an MCP definition with a harmless-looking helper and an auto-approval setting. The developer starts the agent, accepts the trust dialog, and the helper runs. On a machine with 8 cloud profiles, 3 SSH keys and access to several private repositories, a quick test clone becomes a serious supply-chain risk.

In a CI pipeline the damage can be more direct. A pull request can cause the runner to read environment variables or build tokens before a human has really reviewed the change.

Scope and limits

  • The report shows a configuration and consent problem. It does not prove that all installations have already been compromised.
  • Teams with centrally managed agent settings, disabled project-wide MCP auto-start and isolated CI runners can materially reduce the risk.
  • MCP remains useful. The problem is not the protocol itself, but unclear trust prompts, project-scoped auto-approval and missing isolation.
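One practical mitigation is to audit a freshly cloned repository for project-scoped MCP configuration before anyone trusts the folder, or to run such a check as a CI gate on pull requests. The sketch below is a minimal example of that idea; the candidate file paths are assumptions about where common agents keep project-level config, not an authoritative list, so adjust them for the tools your team actually uses:

```python
import json
from pathlib import Path

# Candidate locations for project-scoped MCP configuration.
# Illustrative assumptions only -- verify against the documentation
# of each coding agent your team runs.
MCP_CONFIG_CANDIDATES = [
    ".mcp.json",
    ".cursor/mcp.json",
    ".gemini/settings.json",
    ".vscode/mcp.json",
]


def find_project_mcp_configs(repo_root: str) -> list[dict]:
    """Report project-level MCP config files found under repo_root."""
    findings = []
    root = Path(repo_root)
    for rel in MCP_CONFIG_CANDIDATES:
        path = root / rel
        if not path.is_file():
            continue
        finding = {"path": rel, "servers": [], "parse_error": False}
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
            for name, spec in data.get("mcpServers", {}).items():
                finding["servers"].append(
                    {"name": name, "command": spec.get("command", "")}
                )
        except (json.JSONDecodeError, UnicodeDecodeError):
            # An unreadable config file is still worth flagging.
            finding["parse_error"] = True
        findings.append(finding)
    return findings
```

A CI step could call `find_project_mcp_configs(".")` on every pull request and fail the build when the list is non-empty, forcing a human review before any agent runs against the branch.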

SEO & GEO keywords

TrustFall, Adversa AI, Claude Code, Gemini CLI, Cursor CLI, GitHub Copilot CLI, MCP, Model Context Protocol, coding agent security, CI security, software supply chain, RCE

💡 In plain English

TrustFall shows that trusting a project folder in a coding agent can mean more than reading files. In some setups it can start local code before the developer fully understands what is happening.

Key Takeaways

  • Adversa AI describes a risk across four agentic coding CLIs.
  • The critical point is project-defined MCP code after a trust prompt.
  • CI runners are especially exposed because no visible dialog may appear there.
  • Central policies, isolated runners and disabled project MCP auto-start reduce the risk.

FAQ

Is MCP itself unsafe?

No. MCP is a useful integration pattern. The risk appears when an unfamiliar project can start helpers without clear limits for users or CI systems.

Which tools does the research name?

Adversa names Claude Code, Gemini CLI, Cursor CLI and GitHub Copilot CLI.

What should teams check first?

Project-level MCP auto-approvals, CI agent runs on pull requests and centrally managed settings for coding agents.

Sources & Context