Public evidence map

Project-file-driven command execution and secret exposure

Public examples: Check Point research on Claude Code and Codex CLI CVEs (CVE-2025-59536, CVE-2026-21852, CVE-2025-61260).

What happened: Repository-controlled configuration files caused AI coding tools to execute commands or expose credentials without explicit user awareness. In several cases the behavior was described as “documented” yet was practically invisible to users who had not read deep configuration references.

CVE-2025-61260 (Codex CLI command injection) carries a CVSS score of 9.8. A repository-controlled file was sufficient to trigger shell command execution with no additional user interaction required.

Why it matters: Trusted repository files can trigger command paths users do not expect. When a project can place or modify MCP server definitions, hook scripts, or agent instruction files, those files become an attacker-controlled execution surface.

CodeGate capability families:

Cross-tool discovery (Layer 1)
Command-surface detection: COMMAND_EXEC, GIT_HOOK findings (Layer 2)
Environment override detection: ENV_OVERRIDE findings (Layer 2)
Wrapper recheck: codegate run blocks launch if any config file changed since the scan
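
The wrapper recheck idea can be illustrated with a minimal sketch. This is not CodeGate's implementation: it assumes the scan records a SHA-256 digest per config file and that any added, removed, or modified file should block launch. The function names (`snapshot`, `changed_since_scan`, `may_launch`) are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Map each existing config file path to the SHA-256 of its contents."""
    digests = {}
    for path in paths:
        p = Path(path)
        if p.is_file():
            digests[str(p)] = hashlib.sha256(p.read_bytes()).hexdigest()
    return digests


def changed_since_scan(scan_snapshot, paths):
    """Return sorted paths that were added, removed, or modified since the scan."""
    current = snapshot(paths)
    all_paths = set(scan_snapshot) | set(current)
    return sorted(p for p in all_paths if scan_snapshot.get(p) != current.get(p))


def may_launch(scan_snapshot, paths):
    """Allow launch only when no tracked config file differs from the scan."""
    return not changed_since_scan(scan_snapshot, paths)
```

A wrapper built this way would take the snapshot at scan time and call `may_launch` immediately before starting the tool, which narrows the window in which a repository file can change unnoticed.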

Consent bypass and unsafe auto-approval

MCP poisoning and cross-tool toxic flows

Public examples: Invariant Labs tool-poisoning research; toxic-flow analyses across multi-agent setups.

What happened: Tool descriptions returned by MCP servers were found to contain hidden instructions that manipulated downstream agent behavior. In multi-agent setups, a compromised upstream tool could influence the behavior of other agents in the pipeline without the user being aware.

Why it matters: Tool descriptions and upstream metadata can manipulate downstream agent behavior. The attack surface is not just what the user types; it includes every tool description the agent reads.

CodeGate capability families:

Deep scan (Layer 3, opt-in): fetches remote tool descriptions for analysis
Tool-description analysis: detects hidden instructions in MCP tool metadata
TOXIC_FLOW findings: flags cross-tool manipulation patterns
Rug-pull tracking: NEW_SERVER and CONFIG_CHANGE findings detect server changes between scans
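
Rug-pull tracking comes down to comparing server definitions between scans. The sketch below is an illustration only: the snapshot shape (a dict from server name to its config dict) and the `diff_servers` helper are assumptions, and only the finding names NEW_SERVER and CONFIG_CHANGE come from the list above.

```python
def diff_servers(previous, current):
    """Compare two {server_name: config_dict} snapshots and return findings.

    Each finding is a (finding_type, server_name) tuple. The snapshot
    format is hypothetical, chosen only for illustration.
    """
    findings = []
    for name in sorted(current):
        if name not in previous:
            # Server did not exist at the time of the previous scan.
            findings.append(("NEW_SERVER", name))
        elif previous[name] != current[name]:
            # Server existed, but its definition changed between scans.
            findings.append(("CONFIG_CHANGE", name))
    return findings
```

Comparing whole definitions rather than selected fields is deliberately conservative: any change to a previously reviewed server, such as a new command argument or URL, is surfaced for re-review.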

Malicious skill and rule content in public ecosystems

Public examples: Snyk ToxicSkills campaign and related disclosures.

What happened: Publicly available skill and rule markdown files were found to contain high-impact payloads embedded in normal-looking instruction text. Users who installed these skills exposed their agents to adversarial behavioral instructions without any visible warning.

Why it matters: Instruction files can hide high-impact payloads in normal markdown. Content that appears to be documentation to a human reader functions as an adversarial prompt when consumed by an agent.

CodeGate capability families:

Rule/skill maliciousness detection: RULE_INJECTION findings (Layer 2)
Unicode analysis: detects hidden characters used for visual spoofing (bidirectional override, zero-width joiners)
Local text analysis (Layer 3, opt-in): text-only instruction-file analysis via supported meta-agent
Suspicious pattern heuristics across discovered markdown surfaces
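
The Unicode analysis item can be sketched as a scan for bidirectional control and zero-width code points. This is a minimal illustration rather than CodeGate's detector. The code point sets are a common starting selection, and `find_hidden_characters` is a hypothetical name.

```python
import unicodedata

# Bidirectional embedding, override, and isolate controls (U+202A..U+202E, U+2066..U+2069).
BIDI_CONTROLS = set(range(0x202A, 0x202F)) | set(range(0x2066, 0x206A))
# Zero-width space, non-joiner, joiner, word joiner, and BOM / zero-width no-break space.
ZERO_WIDTH = {0x200B, 0x200C, 0x200D, 0x2060, 0xFEFF}


def find_hidden_characters(text):
    """Return (line, column, code point, Unicode name) for each hidden character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            cp = ord(ch)
            if cp in BIDI_CONTROLS or cp in ZERO_WIDTH:
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, f"U+{cp:04X}", name))
    return hits
```

Zero-width joiners also occur legitimately, for example in emoji sequences and some scripts, so a hit is better treated as a prompt for review than as proof of malice.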

Compromised marketplace and extension integrity

Public examples: Open VSX advisories; JFrog research on Amazon Q extension compromise.

What happened: Extensions in established marketplaces were found to have been modified or replaced with versions containing malicious behavior. The trust signal users rely on (“this is in the official marketplace”) was not sufficient to guarantee integrity.

Why it matters: Supply-chain trust can fail even in established ecosystems. Installation from a known source does not guarantee the installed artifact is safe.

CodeGate capability families:

Plugin and extension provenance checks (Layer 1/2)
Signature and attestation policy controls
Transparency checks for extension manifests
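
One simple form of integrity check is comparing an installed artifact against a digest pinned at review time. The sketch below assumes that model. It does not describe how CodeGate implements provenance, signature, or attestation checks, and `matches_pinned_digest` is a hypothetical helper.

```python
import hashlib
import hmac


def matches_pinned_digest(artifact_bytes, pinned_sha256_hex):
    """Return True when the artifact's SHA-256 equals the pinned hex digest."""
    actual = hashlib.sha256(artifact_bytes).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, pinned_sha256_hex.lower())
```

A pinned digest only shows that the artifact has not changed since it was pinned. It says nothing about whether the pinned version was safe, which is why it complements signature and attestation policy rather than replacing it.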
