Enterprise AI Security Checklist for 2026 — 8 Gates You Need
We hardened our AI platform across 16 sprints and found 2 critical RCE vulnerabilities that would have allowed arbitrary code execution on our production server. AI systems introduce attack vectors that traditional security tooling doesn’t catch: eval() in LLM output handlers, unsandboxed exec() in code generation pipelines, and shell injection through agent tool calls. Here are the 8 security gates we deployed — and how to wire them into your CI/CD pipeline.
Most enterprise AI security guides focus on prompt injection and data poisoning. Those are real risks. But the vulnerabilities we found were more mundane and more dangerous: a python3 -c call that let the AI agent execute arbitrary Python on the host, and an exec() call in the code generation pipeline that ran LLM output without sandboxing. Both were CRITICAL severity — full remote code execution with the privileges of the service account.
The 8 Security Gates
These gates run in our GitLab CI pipeline on every commit. A failure in any gate blocks the merge request. The total pipeline runs in approximately 3 minutes across 14 jobs and 5 stages.
Gate 1: Bandit SAST
What it catches: Python security anti-patterns — hardcoded passwords, insecure hash functions, SQL injection, dangerous imports, assert statements in production code.
Configuration: Blocks on HIGH severity. MEDIUM findings are logged but don’t block. We maintain a false-positive filter in CI to suppress known-good patterns (e.g., hashlib.md5 used for non-security checksums).
```yaml
# .gitlab-ci.yml — Bandit SAST gate
bandit_sast:
  stage: security
  script:
    - pip install bandit
    - bandit -r platform/ projects/ -f json -o bandit-report.json --severity-level high --confidence-level medium
    - python ci/filter_bandit_fps.py bandit-report.json
  allow_failure: false
  artifacts:
    reports:
      sast: bandit-report.json
```
Real Finding
Bandit flagged subprocess.call(cmd, shell=True) in a deployment script. The cmd variable included user-provided branch names without sanitization. An attacker could craft a branch name like main; rm -rf / to execute arbitrary commands during deployment.
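The remediation pattern (a sketch of the principle, not our exact deployment script) is to pass attacker-influenced values as elements of an argument list, never as part of a shell string:

```python
import subprocess

malicious = "main; rm -rf /"  # attacker-controlled branch name
# Vulnerable: subprocess.call(f"git checkout {malicious}", shell=True)
# Safe: with an argument list and shell=False (the default), the whole
# string is a single argv element — the `;` never reaches a shell.
result = subprocess.run(["echo", malicious], capture_output=True, text=True)
print(result.stdout.strip())  # the literal string, nothing executed
```

Here `echo` stands in for `git checkout` so the example is safe to run; the argv-list rule is identical.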
Gate 2: detect-secrets
What it catches: API keys, tokens, passwords, private keys accidentally committed to the repository. Scans all staged files for patterns matching 20+ secret types (AWS, GCP, Slack, Stripe, generic high-entropy strings).
```yaml
# .gitlab-ci.yml — Secrets scanning gate
detect_secrets:
  stage: security
  script:
    - pip install detect-secrets
    - detect-secrets scan --all-files --exclude-files '\.lock$' --baseline .secrets.baseline
    - detect-secrets audit --report --json .secrets.baseline
  allow_failure: false
```
We maintain a .secrets.baseline file that marks known false positives (e.g., example API keys in documentation, test fixtures with dummy tokens). Every new detection requires explicit triage: either fix it or add it to the baseline with a justification comment.
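Creating and refreshing the baseline is a local two-step workflow (standard detect-secrets CLI usage; adapt flags to your repo):

```shell
# Generate (or regenerate) the baseline from the current tree
detect-secrets scan --all-files > .secrets.baseline

# Interactively triage each finding: real secret vs. false positive
detect-secrets audit .secrets.baseline
```

The audited baseline is then committed, so CI only alerts on detections that are not already triaged.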
Gate 3: Semgrep (Auto + Custom Rules)
What it catches: Language-specific vulnerabilities, insecure patterns, and custom rules tailored to our codebase. Semgrep runs twice: once with the p/python auto ruleset, once with our custom rules.
Custom rules we wrote:
- `no-eval-exec`: Blocks any use of `eval()`, `exec()`, or `compile()` on non-literal strings
- `no-pickle-loads`: Blocks `pickle.loads()` and `pickle.load()` (deserialization RCE vector)
- `no-shell-true`: Blocks `subprocess.*(shell=True)` except in explicitly allowlisted files
- `websocket-origin-check`: Requires origin validation in WebSocket handlers
```yaml
# .semgrep/custom-rules.yml
rules:
  - id: no-eval-exec
    patterns:
      - pattern-either:
          - pattern: eval(...)
          - pattern: exec(...)
          - pattern: compile(..., ..., "exec")
    message: "eval/exec/compile detected — RCE risk. Use ast.literal_eval() or structured parsing."
    severity: ERROR
    languages: [python]
```
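The two-pass setup described above can be wired into CI roughly like this (an assumed job shape, mirroring our other gates, not a verbatim excerpt from our pipeline):

```yaml
# .gitlab-ci.yml — Semgrep gate: auto ruleset, then custom rules
semgrep:
  stage: security
  script:
    - pip install semgrep
    - semgrep scan --config p/python --error .
    - semgrep scan --config .semgrep/custom-rules.yml --error .
  allow_failure: false
```

The `--error` flag makes semgrep exit nonzero when findings exist, which is what fails the job.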
Gate 4: eval()/exec()/pickle Elimination
Why a dedicated gate? This is the #1 attack vector in AI systems. LLMs generate code. If that code passes through eval() or exec() before validation, the LLM’s output becomes executable code on your server. This is not a theoretical risk — we found it in production.
The gate is a simple grep + allowlist check that runs independently of Semgrep (defense in depth):
```python
# ci/check_dangerous_functions.py
DANGEROUS = ['eval(', 'exec(', 'pickle.loads(', 'pickle.load(',
             '__import__(', 'compile(']
ALLOWLIST = ['platform/tests/', 'docs/examples/']

for file in changed_files:
    if any(file.startswith(a) for a in ALLOWLIST):
        continue
    content = open(file).read()
    for func in DANGEROUS:
        if func in content:
            # Skip occurrences inside comments or string literals
            if not is_in_comment_or_string(content, func):
                fail(f"BLOCKED: {func} in {file}")
```
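The script relies on helpers like `is_in_comment_or_string`, which are not shown. One plausible implementation (an assumption, not our production code) uses the `tokenize` module: NAME tokens never occur inside comments or string literals, so matching on them filters those mentions out automatically:

```python
import io
import tokenize

def real_occurrences(source: str, func: str) -> list[int]:
    """Line numbers where `func` (e.g. 'eval(') appears as actual code.
    Mentions inside comments and string literals are skipped because
    they tokenize as COMMENT/STRING, not NAME. Dotted names like
    'pickle.loads(' are matched on their last component."""
    name = func.rstrip("(").split(".")[-1]
    lines = []
    # Assumes `source` is syntactically tokenizable Python
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == name:
            lines.append(tok.start[0])
    return lines
```

A matched last component is deliberately conservative: `anything.loads(` trips the gate, which for a blocklist is the right failure mode.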
Gate 5: Shell Injection Prevention
What it enforces: All subprocess calls must use shell=False (the default) with argument lists, not shell strings. This prevents command injection through unsanitized inputs.
| Pattern | Status | Risk |
|---|---|---|
| `subprocess.run(["git", "pull"], shell=False)` | ALLOWED | None — arguments are list elements |
| `subprocess.run(f"git pull {branch}", shell=True)` | BLOCKED | Shell injection via branch name |
| `os.system(cmd)` | BLOCKED | Always uses a shell, no argument separation |
| `os.popen(cmd)` | BLOCKED | Shell execution, effectively deprecated |
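When a shell genuinely is unavoidable (pipes, redirection) in an allowlisted file, a reasonable escape hatch, sketched here with `echo` standing in for a real command, is `shlex.quote` on every interpolated value:

```python
import shlex
import subprocess

branch = "main; rm -rf /"  # attacker-controlled
# shlex.quote wraps the value in single quotes, so the shell treats it
# as one literal word rather than a command separator.
cmd = f"echo {shlex.quote(branch)}"
out = subprocess.run(cmd, shell=True, capture_output=True, text=True)
print(out.stdout.strip())  # the literal branch string, nothing executed
```

Argument lists remain the default rule; quoting is the fallback, not the norm.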
Gate 6: WebSocket Origin Validation
What it prevents: Cross-site WebSocket hijacking (CSWSH). Without origin validation, any website can open a WebSocket connection to your AI platform and issue commands as the authenticated user.
Our NEXUS Bridge server (458 RPC methods, port 9800) validates the Origin header on every WebSocket upgrade request:
```python
# bridge/server.py — Origin validation
ALLOWED_ORIGINS = [
    "https://nexus.zeltrex.com",
    "https://tab.zeltrex.com",
    "https://phone.zeltrex.com",
    "http://localhost:3333",  # Development only
]

async def on_connect(self, websocket):
    # Browsers always send Origin; non-browser clients may omit it,
    # so an empty Origin cannot come from a cross-site page.
    origin = websocket.request_headers.get("Origin", "")
    if origin and origin not in ALLOWED_ORIGINS:
        logger.warning(f"Rejected WebSocket from origin: {origin}")
        await websocket.close(4003, "Origin not allowed")
        return
```
Gate 7: PBKDF2 Authentication
What it replaces: Plaintext password storage and comparison. Our initial prototype stored API credentials as plaintext in config files. The security audit caught this and we migrated to PBKDF2-HMAC-SHA256 with 600,000 iterations.
Where it applies: Bridge server authentication, admin panel access, inter-service tokens. All secrets are now stored in GCP Secret Manager (44 keys across 7 entity bundles) with local fallback to encrypted files.
```python
# platform/auth/password.py
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes | None = None) -> tuple[str, str]:
    salt = salt or os.urandom(32)
    key = hashlib.pbkdf2_hmac('sha256', password.encode(), salt,
                              iterations=600_000)
    return salt.hex(), key.hex()

def verify_password(password: str, salt_hex: str, key_hex: str) -> bool:
    _, computed = hash_password(password, bytes.fromhex(salt_hex))
    # Constant-time comparison prevents timing attacks on the check
    return hmac.compare_digest(computed, key_hex)
```
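A quick property check of the scheme, restated self-contained with the same parameters: PBKDF2 is deterministic for a fixed salt, and a different password yields a different key.

```python
import hashlib
import hmac
import os

salt = os.urandom(32)
key = hashlib.pbkdf2_hmac('sha256', b"correct horse", salt, 600_000)
again = hashlib.pbkdf2_hmac('sha256', b"correct horse", salt, 600_000)
other = hashlib.pbkdf2_hmac('sha256', b"wrong", salt, 600_000)
# Same password + same salt → same key; different password → different key
print(hmac.compare_digest(key, again), hmac.compare_digest(key, other))
```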
Gate 8: Constitutional AI Checker (Night Shift Merge Gate)
What it does: Before any Night Shift output is merged into the codebase, a constitutional checker validates the output against safety rules. This is the last line of defense against an autonomous AI agent producing malicious or dangerous code.
The constitutional checker enforces:
- No dangerous function calls (`eval`, `exec`, `os.system`, `subprocess` with `shell=True`)
- No network calls to unknown hosts (allowlist of approved domains)
- No file system operations outside the project directory
- No credential access patterns (reading from `.secrets/`, environment variables matching secret patterns)
- Quality score ≥ 4.0/10 from the hybrid assessor
Why Constitutional AI for Code?
Traditional SAST tools scan code written by humans. Constitutional checkers scan code written by AI. The difference matters: an AI agent generating code might produce syntactically correct, functionally useful code that also includes a backdoor, exfiltration channel, or privilege escalation. The constitutional checker specifically targets AI-generated attack patterns that SAST tools don’t model.
The 2 Critical Vulnerabilities We Found
CVE-Class 1: RCE via python3 -c in Agent Executor
Night Shift’s task executor had a code path that constructed Python commands dynamically and executed them via subprocess.run(f"python3 -c '{code}'", shell=True). The code variable came from the LLM’s output — meaning the AI agent could execute arbitrary Python on the host server.
Impact: CRITICAL. Full RCE with the service account’s privileges (which included read/write to the entire codebase and access to GCP Secret Manager).
Fix: Replaced `python3 -c` with a structured execution sandbox that:
- Parses the LLM output as an AST (Abstract Syntax Tree) before execution
- Blocks any AST nodes containing `Import`, a `Call` to a dangerous function, or `Attribute` access on `os`/`subprocess`/`sys`
- Runs in a subprocess with reduced privileges (no network, no filesystem access outside a temp directory)
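A minimal sketch of that AST gate, using illustrative names (`BLOCKED_CALLS`, `is_safe`) rather than the production sandbox:

```python
import ast

BLOCKED_CALLS = {"eval", "exec", "compile", "__import__"}
BLOCKED_MODULES = {"os", "subprocess", "sys"}

def is_safe(code: str) -> bool:
    """Reject source whose AST contains imports, calls to blocked
    builtins, or attribute access on os/subprocess/sys."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            return False
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BLOCKED_CALLS):
            return False
        if (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id in BLOCKED_MODULES):
            return False
    return True

print(is_safe("total = sum(range(10))"))  # True
print(is_safe("import os"))               # False
print(is_safe("subprocess.run(['ls'])"))  # False
```

Static AST filtering is one layer only; the reduced-privilege subprocess is what catches whatever the filter misses.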
CVE-Class 2: Unsandboxed exec() in Code Generation Pipeline
The code generation pipeline used exec() to validate that LLM-generated code was syntactically correct. The validation step accidentally executed the code rather than just parsing it:
```python
# BEFORE (vulnerable)
def validate_code(code_str: str) -> bool:
    try:
        exec(code_str)  # This RUNS the code, not just validates it
        return True
    except SyntaxError:
        return False
```

```python
# AFTER (fixed)
import ast

def validate_code(code_str: str) -> bool:
    try:
        ast.parse(code_str)  # This only PARSES, never executes
        return True
    except SyntaxError:
        return False
```
Impact: CRITICAL. Any code generated by the LLM was executed during the “validation” step, before any quality or safety checks ran.
The Pattern
Both vulnerabilities share the same root cause: treating LLM output as trusted input. In traditional software, you sanitize user input. In AI systems, you must also sanitize AI output. Every code path that processes LLM-generated content must assume that content is adversarial — even if the LLM is your own agent running on your own infrastructure.
CI/CD Integration
All 8 gates run as part of our GitLab CI pipeline. The pipeline has 5 stages and 14 jobs, completing in approximately 3 minutes:
| Stage | Jobs | Gates | Blocks MR? |
|---|---|---|---|
| 1. Lint | ruff, mypy | — | Yes |
| 2. Security | bandit, detect-secrets, semgrep (x2), dangerous-funcs, shell-check | Gates 1–5 | Yes |
| 3. Test | pytest (platform), pytest (bridge), vitest (webapp) | — | Yes |
| 4. Scan | trivy (container), gitleaks (git history) | — | Yes (HIGH+) |
| 5. Deploy | deploy-staging, deploy-production | — | N/A |
The security stage runs in parallel: all 6 security jobs execute simultaneously, and the pipeline fails fast if any gate fails. This keeps the security overhead under 90 seconds.
Wiring It Into Your Pipeline
To adopt this security gate pattern in your own CI/CD:
- Start with Gates 1–2 (Bandit + detect-secrets). These catch the most common issues with minimal configuration.
- Add Gate 4 (eval/exec elimination) immediately if you process LLM output. This is your highest-impact gate for AI systems.
- Add Gate 3 (Semgrep) once you have custom patterns specific to your codebase. The auto rulesets are useful but noisy; custom rules are where the real value is.
- Add Gates 5–8 based on your architecture. WebSocket origin validation applies if you have WebSocket services. PBKDF2 applies if you store credentials. Constitutional AI applies if you have autonomous agents.
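A minimal starting point for step 1 (job names and scan paths are illustrative; adapt them to your layout):

```yaml
# .gitlab-ci.yml — minimal security stage covering Gates 1–2
stages: [security]

bandit_sast:
  stage: security
  script:
    - pip install bandit
    - bandit -r src/ --severity-level high --confidence-level medium

detect_secrets:
  stage: security
  script:
    - pip install detect-secrets
    - detect-secrets scan --all-files --baseline .secrets.baseline
```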
False Positive Management
The #1 reason security gates get disabled is false positive fatigue. Manage it proactively:
- Maintain an explicit allowlist/baseline file for each gate
- Require justification comments for every suppression
- Review suppressions quarterly — remove any that are no longer needed
- Track false positive rate as a metric (ours: 12% for Bandit, 8% for detect-secrets, 3% for custom Semgrep)
The Complete Checklist
Use this as a starting point for your own AI security audit. Items marked with * are AI-specific (not covered by traditional application security).
| # | Check | Tool | Severity |
|---|---|---|---|
| 1 | SAST scan passes with 0 HIGH findings | Bandit / Semgrep | HIGH |
| 2 | No secrets in repository (current + history) | detect-secrets + gitleaks | CRITICAL |
| 3 | No eval()/exec()/pickle on non-literal input * | Semgrep custom + grep | CRITICAL |
| 4 | All subprocess calls use shell=False | Semgrep custom | HIGH |
| 5 | WebSocket/API origin validation enabled | Manual review + test | HIGH |
| 6 | Passwords hashed with PBKDF2/bcrypt/Argon2 | Manual review | HIGH |
| 7 | LLM output treated as untrusted input * | Constitutional checker | CRITICAL |
| 8 | AI agent actions sandboxed (no host access) * | Process isolation | CRITICAL |
| 9 | Container images scanned for known CVEs | Trivy | MEDIUM |
| 10 | API keys rotated and stored in secret manager | GCP SM / AWS SM / Vault | HIGH |
| 11 | Prompt injection defenses in place * | Input validation + system prompts | MEDIUM |
| 12 | Audit log for all AI agent actions * | Structured logging | MEDIUM |
Findings Summary
Across 16 sprints of security hardening:
| Severity | Found | Fixed | Status |
|---|---|---|---|
| CRITICAL | 2 | 2 | Resolved |
| HIGH | 3 | 3 | Resolved |
| MEDIUM | 5 | 5 | Resolved |
| LOW | 8 | 6 | 2 accepted risk |
The 2 accepted-risk LOW findings are: (1) use of MD5 for non-security file checksums (deduplication, not integrity), and (2) HTTP (not HTTPS) for localhost-only inter-service communication. Both were triaged and documented with explicit risk acceptance.
The most dangerous assumption in AI security is that your own AI agent is trustworthy. It isn’t. Treat every output — code, commands, file operations — as potentially adversarial, even from your own models running on your own infrastructure.
What’s Next
Our security roadmap for Q2 2026:
- Prompt injection fuzzing: Automated test suite that attempts 200+ prompt injection patterns against every RPC endpoint
- Agent action rate limiting: Per-minute caps on file writes, network calls, and subprocess spawns by autonomous agents
- Output content filtering: Scan AI-generated content for PII leakage, credential patterns, and internal IP addresses before delivery to end users
- SBOM for AI dependencies: Track which models, versions, and configurations are used in production with automated drift detection
Get a Security Audit for Your AI Platform
We run the same 8-gate security analysis on client AI deployments. 2-week engagement, full report with prioritized findings and CI/CD integration templates.
Related Articles
- How Night Shift Runs 300+ Tasks Autonomously — architecture deep dive including constitutional merge gates
- Why Our AI Agent’s Quality Dropped 31% — root cause analysis with quality assessment calibration
- From 0 to 3,000 Tests — building quality and security into AI-generated code
- Autonomous AI Systems: The LivingCorp Paradigm — the operational model that requires these security gates
- NEXUS Platform — the AI-native platform with 5-tier security built in