Detection layers
Layers 1 and 2 — Offline-first and deterministic
Layers 1 and 2 run on every scan. They make no network requests. Results are deterministic given the same input files and rule configuration.
COMMAND_EXEC, ENV_OVERRIDE, CONSENT_BYPASS, RULE_INJECTION, IDE_SETTINGS, SYMLINK_ESCAPE, GIT_HOOK, NEW_SERVER, and CONFIG_CHANGE. Unicode analysis runs across rule-file and tool-description content to detect visually hidden characters.
Because Layers 1 and 2 are offline-first, they have no dependency on network availability, no exposure from fetching remote content, and no interaction with external services.
Layer 3 — Opt-in, consent-driven, increases exposure
Layer 3 only runs when you pass
--deep. Each eligible remote resource requires explicit per-resource consent before CodeGate fetches or analyzes it. Skipped consent is recorded in the output for auditability.- Discovery of eligible external resources from known config paths.
- Discovery of eligible local instruction files from the selected markdown/text scan surface.
- Remote tool-description fetching and analysis (MCP server endpoints).
- Meta-agent analysis of local instruction files (
AGENTS.md,CODEX.md, discovered rule/skill markdown) via a supported local tool (currently Claude Code for text-only analysis). TOXIC_FLOWfindings for cross-tool manipulation patterns.PARSE_ERRORfindings when Layer 3 content cannot be parsed.
Layer 4 — Remediation (opt-in)
Remediation is best-effort and reversible. Backup sessions are written to
.codegate-backup/. Use codegate undo to restore the most recent session. Review all changes before committing them.--remediate: guided file remediation with prompts.--fix-safe: auto-applies unambiguous critical fixes without prompting.--dry-run: shows proposed changes without writing anything.--patch: generates a patch file for external review workflows.
--fix-safe, users should review the changes before proceeding. Remediation is best-effort: not every finding has a safe automatic fix, and the scanner cannot verify whether a fix is correct in all project contexts.
What CodeGate does not claim
CodeGate is designed to improve visibility and decision quality. It is not designed to function as an absolute safety net.
- Not a guarantee of safety. A clean scan result means CodeGate found no known patterns matching its current rule set. It does not mean the project is safe.
- Not a replacement for secure engineering judgment, review, and hardening. Human review of configuration files, MCP server definitions, hooks, and instruction content is still necessary.
- Not perfect. CodeGate can produce false positives (flagging benign content) and false negatives (missing malicious content). Both are expected and documented.
- Not a promise that every malicious pattern will be detected. New attack techniques can appear before signatures and heuristics are updated. Rule packs can be extended with
rule_pack_paths, but coverage is never complete.
False positives and false negatives
Both are inherent to pattern-based detection. False positives occur when CodeGate flags content that is not actually malicious. Common causes include legitimate uses of shell commands in hooks, MCP servers from known safe providers that are not in theknown_safe_mcp_servers list, and rule/skill markdown that uses assertive language for valid reasons.
Suppress known-safe findings using suppress_findings (by finding ID or fingerprint) or suppression_rules (by rule ID, file path, severity, category, CWE, or fingerprint with AND semantics).
False negatives occur when CodeGate does not flag content that is malicious. Causes include novel attack patterns not yet in the rule set, obfuscation techniques that evade static analysis, and content that only becomes risky at runtime or through multi-step agent reasoning.
Layer 3 deep scan improves coverage for remote and dynamic surfaces, but introduces its own limitations.
Guiding principles
From why-codegate.md:- Inspect before trust. Scan first; decide whether to trust and run after reviewing findings.
- Prefer explicit user consent over silent execution. Layer 3 resources require per-resource consent.
codegate runrequires confirmation for warning-level findings unless explicitly bypassed. - Keep high-risk operations explainable and reviewable. Every finding includes a category, severity, and location. Remediation is backed by a reversible backup.
- Treat documented risk as real risk when users are likely to miss it. “This is documented behavior” is not sufficient justification when execution surfaces are broad and defaults are permissive.
- Preserve operator control with backups, undo, and policy thresholds. Config keys like
severity_threshold,trusted_directories,blocked_commands, andsuppression_rulesput policy decisions in operator hands.