Enterprise AI Security Checklist for 2026 — 8 Gates You Need
We hardened our AI platform across 16 sprints and found 2 critical RCE vulnerabilities that would have allowed arbitrary code execution on our production server. AI systems introduce attack vectors that traditional security tooling doesn’t catch: eval() in LLM output handlers, unsandboxed exec() in code generation pipelines, and shell injection through agent tool calls. Here are the 8 security gates we deployed — and how to wire them into your CI/CD pipeline.
Most enterprise AI security guides focus on prompt injection and data poisoning. Those are real risks. But the vulnerabilities we found were more mundane and more dangerous: a python3 -c call that let the AI agent execute arbitrary Python on the host, and an exec() call in the code generation pipeline that ran LLM output without sandboxing. Both were CRITICAL severity — full remote code execution with the privileges of the service account.
The 8 Security Gates
These gates run in our GitLab CI pipeline on every commit. A failure in any gate blocks the merge request. The total pipeline runs in approximately 3 minutes across 14 jobs and 5 stages.
Gate 1: Bandit SAST
What it catches: Python security anti-patterns — hardcoded passwords, insecure hash functions, SQL injection, dangerous imports, assert statements in production code.
Configuration: Blocks on HIGH severity. MEDIUM findings are logged but don’t block. We maintain a false-positive filter in CI to suppress known-good patterns (e.g., hashlib.md5 used for non-security checksums).
```yaml
# .gitlab-ci.yml — Bandit SAST gate
bandit_sast:
  stage: security
  script:
    - pip install bandit
    - bandit -r platform/ projects/ -f json -o bandit-report.json --severity-level high --confidence-level medium
    - python ci/filter_bandit_fps.py bandit-report.json
  allow_failure: false
  artifacts:
    reports:
      sast: bandit-report.json
```
Real Finding
Bandit flagged subprocess.call(cmd, shell=True) in a deployment script. The cmd variable included user-provided branch names without sanitization. An attacker could craft a branch name like main; rm -rf / to execute arbitrary commands during deployment.
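The remediation pattern (a sketch of the principle, not our exact deployment script) is to pass attacker-influenced values as elements of an argument list, never as part of a shell string:

```python
import subprocess

malicious = "main; rm -rf /"  # attacker-controlled branch name
# Vulnerable: subprocess.call(f"git checkout {malicious}", shell=True)
# Safe: with an argument list and shell=False (the default), the whole
# string is a single argv element — the `;` never reaches a shell.
result = subprocess.run(["echo", malicious], capture_output=True, text=True)
print(result.stdout.strip())  # the literal string, nothing executed
```

Here `echo` stands in for `git checkout` so the example is safe to run; the argv-list rule is identical.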
Gate 2: detect-secrets
What it catches: API keys, tokens, passwords, private keys accidentally committed to the repository. Scans all staged files for patterns matching 20+ secret types (AWS, GCP, Slack, Stripe, generic high-entropy strings).
```yaml
# .gitlab-ci.yml — Secrets scanning gate
detect_secrets:
  stage: security
  script:
    - pip install detect-secrets
    - detect-secrets scan --all-files --exclude-files '\.lock$' --baseline .secrets.baseline
    - detect-secrets audit --report --json .secrets.baseline
  allow_failure: false
```
We maintain a .secrets.baseline file that marks known false positives (e.g., example API keys in documentation, test fixtures with dummy tokens). Every new detection requires explicit triage: either fix it or add it to the baseline with a justification comment.
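Creating and refreshing the baseline is a local two-step workflow (standard detect-secrets CLI usage; adapt flags to your repo):

```shell
# Generate (or regenerate) the baseline from the current tree
detect-secrets scan --all-files > .secrets.baseline

# Interactively triage each finding: real secret vs. false positive
detect-secrets audit .secrets.baseline
```

The audited baseline is then committed, so CI only alerts on detections that are not already triaged.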
Gate 3: Semgrep (Auto + Custom Rules)
What it catches: Language-specific vulnerabilities, insecure patterns, and custom rules tailored to our codebase. Semgrep runs twice: once with the p/python auto ruleset, once with our custom rules.
Custom rules we wrote:
- `no-eval-exec`: Blocks any use of `eval()`, `exec()`, or `compile()` on non-literal strings
- `no-pickle-loads`: Blocks `pickle.loads()` and `pickle.load()` (deserialization RCE vector)
- `no-shell-true`: Blocks `subprocess.*(shell=True)` except in explicitly allowlisted files
- `websocket-origin-check`: Requires origin validation in WebSocket handlers
```yaml
# .semgrep/custom-rules.yml
rules:
  - id: no-eval-exec
    patterns:
      - pattern-either:
          - pattern: eval(...)
          - pattern: exec(...)
          - pattern: compile(..., ..., "exec")
    message: "eval/exec/compile detected — RCE risk. Use ast.literal_eval() or structured parsing."
    severity: ERROR
    languages: [python]
```
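The two-pass setup described above can be wired into CI roughly like this (an assumed job shape, mirroring our other gates, not a verbatim excerpt from our pipeline):

```yaml
# .gitlab-ci.yml — Semgrep gate: auto ruleset, then custom rules
semgrep:
  stage: security
  script:
    - pip install semgrep
    - semgrep scan --config p/python --error .
    - semgrep scan --config .semgrep/custom-rules.yml --error .
  allow_failure: false
```

The `--error` flag makes semgrep exit nonzero when findings exist, which is what fails the job.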
Gate 4: eval()/exec()/pickle Elimination
Why a dedicated gate? This is the #1 attack vector in AI systems. LLMs generate code. If that code passes through eval() or exec() before validation, the LLM’s output becomes executable code on your server. This is not a theoretical risk — we found it in production.
The gate is a simple grep + allowlist check that runs independently of Semgrep (defense in depth):
```python
# ci/check_dangerous_functions.py
DANGEROUS = ['eval(', 'exec(', 'pickle.loads(', 'pickle.load(',
             '__import__(', 'compile(']
ALLOWLIST = ['platform/tests/', 'docs/examples/']

for file in changed_files:
    if any(file.startswith(a) for a in ALLOWLIST):
        continue
    content = open(file).read()
    for func in DANGEROUS:
        if func in content:
            # Skip occurrences inside comments or string literals
            if not is_in_comment_or_string(content, func):
                fail(f"BLOCKED: {func} in {file}")
```
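The script relies on helpers like `is_in_comment_or_string`, which are not shown. One plausible implementation (an assumption, not our production code) uses the `tokenize` module: NAME tokens never occur inside comments or string literals, so matching on them filters those mentions out automatically:

```python
import io
import tokenize

def real_occurrences(source: str, func: str) -> list[int]:
    """Line numbers where `func` (e.g. 'eval(') appears as actual code.
    Mentions inside comments and string literals are skipped because
    they tokenize as COMMENT/STRING, not NAME. Dotted names like
    'pickle.loads(' are matched on their last component."""
    name = func.rstrip("(").split(".")[-1]
    lines = []
    # Assumes `source` is syntactically tokenizable Python
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == name:
            lines.append(tok.start[0])
    return lines
```

A matched last component is deliberately conservative: `anything.loads(` trips the gate, which for a blocklist is the right failure mode.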
Gate 5: Shell Injection Prevention
What it enforces: All subprocess calls must use shell=False (the default) with argument lists, not shell strings. This prevents command injection through unsanitized inputs.
| Pattern | Status | Risk |
|---|---|---|
| `subprocess.run(["git", "pull"], shell=False)` | ALLOWED | None — arguments are list elements |
| `subprocess.run(f"git pull {branch}", shell=True)` | BLOCKED | Shell injection via branch name |
| `os.system(cmd)` | BLOCKED | Always uses a shell, no argument separation |
| `os.popen(cmd)` | BLOCKED | Shell execution, effectively deprecated |
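When a shell genuinely is unavoidable (pipes, redirection) in an allowlisted file, a reasonable escape hatch, sketched here with `echo` standing in for a real command, is `shlex.quote` on every interpolated value:

```python
import shlex
import subprocess

branch = "main; rm -rf /"  # attacker-controlled
# shlex.quote wraps the value in single quotes, so the shell treats it
# as one literal word rather than a command separator.
cmd = f"echo {shlex.quote(branch)}"
out = subprocess.run(cmd, shell=True, capture_output=True, text=True)
print(out.stdout.strip())  # the literal branch string, nothing executed
```

Argument lists remain the default rule; quoting is the fallback, not the norm.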
Gate 6: WebSocket Origin Validation
What it prevents: Cross-site WebSocket hijacking (CSWSH). Without origin validation, any website can open a WebSocket connection to your AI platform and issue commands as the authenticated user.
Our NEXUS Bridge server (458 RPC methods, port 9800) validates the Origin header on every WebSocket upgrade request:
```python
# bridge/server.py — Origin validation
ALLOWED_ORIGINS = [
    "https://nexus.zeltrex.com",
    "https://tab.zeltrex.com",
    "https://phone.zeltrex.com",
    "http://localhost:3333",  # Development only
]

async def on_connect(self, websocket):
    # Browsers always send Origin; non-browser clients may omit it,
    # so an empty Origin cannot come from a cross-site page.
    origin = websocket.request_headers.get("Origin", "")
    if origin and origin not in ALLOWED_ORIGINS:
        logger.warning(f"Rejected WebSocket from origin: {origin}")
        await websocket.close(4003, "Origin not allowed")
        return
```
Gate 7: PBKDF2 Authentication
What it replaces: Plaintext password storage and comparison. Our initial prototype stored API credentials as plaintext in config files. The security audit caught this and we migrated to PBKDF2-HMAC-SHA256 with 600,000 iterations.
Where it applies: Bridge server authentication, admin panel access, inter-service tokens. All secrets are now stored in GCP Secret Manager (44 keys across 7 entity bundles) with local fallback to encrypted files.
```python
# platform/auth/password.py
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes | None = None) -> tuple[str, str]:
    salt = salt or os.urandom(32)
    key = hashlib.pbkdf2_hmac('sha256', password.encode(), salt,
                              iterations=600_000)
    return salt.hex(), key.hex()

def verify_password(password: str, salt_hex: str, key_hex: str) -> bool:
    _, computed = hash_password(password, bytes.fromhex(salt_hex))
    # Constant-time comparison prevents timing attacks on the check
    return hmac.compare_digest(computed, key_hex)
```
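A quick property check of the scheme, restated self-contained with the same parameters: PBKDF2 is deterministic for a fixed salt, and a different password yields a different key.

```python
import hashlib
import hmac
import os

salt = os.urandom(32)
key = hashlib.pbkdf2_hmac('sha256', b"correct horse", salt, 600_000)
again = hashlib.pbkdf2_hmac('sha256', b"correct horse", salt, 600_000)
other = hashlib.pbkdf2_hmac('sha256', b"wrong", salt, 600_000)
# Same password + same salt → same key; different password → different key
print(hmac.compare_digest(key, again), hmac.compare_digest(key, other))
```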
Gate 8: Constitutional AI Checker (Night Shift Merge Gate)
What it does: Before any Night Shift output is merged into the codebase, a constitutional checker validates the output against safety rules. This is the last line of defense against an autonomous AI agent producing malicious or dangerous code.
The constitutional checker enforces:
- No dangerous function calls (`eval`, `exec`, `os.system`, `subprocess` with `shell=True`)
- No network calls to unknown hosts (allowlist of approved domains)
- No file system operations outside the project directory
- No credential access patterns (reading from `.secrets/`, environment variables matching secret patterns)
- Quality score ≥ 4.0/10 from the hybrid assessor
Why Constitutional AI for Code?
Traditional SAST tools scan code written by humans. Constitutional checkers scan code written by AI. The difference matters: an AI agent generating code might produce syntactically correct, functionally useful code that also includes a backdoor, exfiltration channel, or privilege escalation. The constitutional checker specifically targets AI-generated attack patterns that SAST tools don’t model.
The 2 Critical Vulnerabilities We Found
CVE-Class 1: RCE via python3 -c in Agent Executor
Night Shift’s task executor had a code path that constructed Python commands dynamically and executed them via subprocess.run(f"python3 -c '{code}'", shell=True). The code variable came from the LLM’s output — meaning the AI agent could execute arbitrary Python on the host server.
Impact: CRITICAL. Full RCE with the service account’s privileges (which included read/write to the entire codebase and access to GCP Secret Manager).
Fix: Replaced `python3 -c` with a structured execution sandbox that:
- Parses the LLM output as an AST (Abstract Syntax Tree) before execution
- Blocks any AST nodes containing `Import`, a `Call` to a dangerous function, or `Attribute` access on `os`/`subprocess`/`sys`
- Runs in a subprocess with reduced privileges (no network, no filesystem access outside a temp directory)
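A minimal sketch of that AST gate, using illustrative names (`BLOCKED_CALLS`, `is_safe`) rather than the production sandbox:

```python
import ast

BLOCKED_CALLS = {"eval", "exec", "compile", "__import__"}
BLOCKED_MODULES = {"os", "subprocess", "sys"}

def is_safe(code: str) -> bool:
    """Reject source whose AST contains imports, calls to blocked
    builtins, or attribute access on os/subprocess/sys."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            return False
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BLOCKED_CALLS):
            return False
        if (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id in BLOCKED_MODULES):
            return False
    return True

print(is_safe("total = sum(range(10))"))  # True
print(is_safe("import os"))               # False
print(is_safe("subprocess.run(['ls'])"))  # False
```

Static AST filtering is one layer only; the reduced-privilege subprocess is what catches whatever the filter misses.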
CVE-Class 2: Unsandboxed exec() in Code Generation Pipeline
The code generation pipeline used exec() to validate that LLM-generated code was syntactically correct. The validation step accidentally executed the code rather than just parsing it:
```python
# BEFORE (vulnerable)
def validate_code(code_str: str) -> bool:
    try:
        exec(code_str)  # This RUNS the code, not just validates it
        return True
    except SyntaxError:
        return False
```

```python
# AFTER (fixed)
import ast

def validate_code(code_str: str) -> bool:
    try:
        ast.parse(code_str)  # This only PARSES, never executes
        return True
    except SyntaxError:
        return False
```
Impact: CRITICAL. Any code generated by the LLM was executed during the “validation” step, before any quality or safety checks ran.
The Pattern
Both vulnerabilities share the same root cause: treating LLM output as trusted input. In traditional software, you sanitize user input. In AI systems, you must also sanitize AI output. Every code path that processes LLM-generated content must assume that content is adversarial — even if the LLM is your own agent running on your own infrastructure.
CI/CD Integration
All 8 gates run as part of our GitLab CI pipeline. The pipeline has 5 stages and 14 jobs, completing in approximately 3 minutes:
| Stage | Jobs | Gates | Blocks MR? |
|---|---|---|---|
| 1. Lint | ruff, mypy | — | Yes |
| 2. Security | bandit, detect-secrets, semgrep (x2), dangerous-funcs, shell-check | Gates 1–5 | Yes |
| 3. Test | pytest (platform), pytest (bridge), vitest (webapp) | — | Yes |
| 4. Scan | trivy (container), gitleaks (git history) | — | Yes (HIGH+) |
| 5. Deploy | deploy-staging, deploy-production | — | N/A |
The security stage runs in parallel: all 6 security jobs execute simultaneously, and the pipeline fails fast if any gate fails. This keeps the security overhead under 90 seconds.
Wiring It Into Your Pipeline
To adopt this security gate pattern in your own CI/CD:
- Start with Gates 1–2 (Bandit + detect-secrets). These catch the most common issues with minimal configuration.
- Add Gate 4 (eval/exec elimination) immediately if you process LLM output. This is your highest-impact gate for AI systems.
- Add Gate 3 (Semgrep) once you have custom patterns specific to your codebase. The auto rulesets are useful but noisy; custom rules are where the real value is.
- Add Gates 5–8 based on your architecture. WebSocket origin validation applies if you have WebSocket services. PBKDF2 applies if you store credentials. Constitutional AI applies if you have autonomous agents.
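A minimal starting point for step 1 (job names and scan paths are illustrative; adapt them to your layout):

```yaml
# .gitlab-ci.yml — minimal security stage covering Gates 1–2
stages: [security]

bandit_sast:
  stage: security
  script:
    - pip install bandit
    - bandit -r src/ --severity-level high --confidence-level medium

detect_secrets:
  stage: security
  script:
    - pip install detect-secrets
    - detect-secrets scan --all-files --baseline .secrets.baseline
```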
False Positive Management
The #1 reason security gates get disabled is false positive fatigue. Manage it proactively:
- Maintain an explicit allowlist/baseline file for each gate
- Require justification comments for every suppression
- Review suppressions quarterly — remove any that are no longer needed
- Track false positive rate as a metric (ours: 12% for Bandit, 8% for detect-secrets, 3% for custom Semgrep)
The Complete Checklist
Use this as a starting point for your own AI security audit. Items marked with * are AI-specific (not covered by traditional application security).
| # | Check | Tool | Severity |
|---|---|---|---|
| 1 | SAST scan passes with 0 HIGH findings | Bandit / Semgrep | HIGH |
| 2 | No secrets in repository (current + history) | detect-secrets + gitleaks | CRITICAL |
| 3 | No eval()/exec()/pickle on non-literal input * | Semgrep custom + grep | CRITICAL |
| 4 | All subprocess calls use shell=False | Semgrep custom | HIGH |
| 5 | WebSocket/API origin validation enabled | Manual review + test | HIGH |
| 6 | Passwords hashed with PBKDF2/bcrypt/Argon2 | Manual review | HIGH |
| 7 | LLM output treated as untrusted input * | Constitutional checker | CRITICAL |
| 8 | AI agent actions sandboxed (no host access) * | Process isolation | CRITICAL |
| 9 | Container images scanned for known CVEs | Trivy | MEDIUM |
| 10 | API keys rotated and stored in secret manager | GCP SM / AWS SM / Vault | HIGH |
| 11 | Prompt injection defenses in place * | Input validation + system prompts | MEDIUM |
| 12 | Audit log for all AI agent actions * | Structured logging | MEDIUM |
Findings Summary
Across 16 sprints of security hardening:
| Severity | Found | Fixed | Status |
|---|---|---|---|
| CRITICAL | 2 | 2 | Resolved |
| HIGH | 3 | 3 | Resolved |
| MEDIUM | 5 | 5 | Resolved |
| LOW | 8 | 6 | 2 accepted risk |
The 2 accepted-risk LOW findings are: (1) use of MD5 for non-security file checksums (deduplication, not integrity), and (2) HTTP (not HTTPS) for localhost-only inter-service communication. Both were triaged and documented with explicit risk acceptance.
The most dangerous assumption in AI security is that your own AI agent is trustworthy. It isn’t. Treat every output — code, commands, file operations — as potentially adversarial, even from your own models running on your own infrastructure.
What’s Next
Our security roadmap for Q2 2026:
- Prompt injection fuzzing: Automated test suite that attempts 200+ prompt injection patterns against every RPC endpoint
- Agent action rate limiting: Per-minute caps on file writes, network calls, and subprocess spawns by autonomous agents
- Output content filtering: Scan AI-generated content for PII leakage, credential patterns, and internal IP addresses before delivery to end users
- SBOM for AI dependencies: Track which models, versions, and configurations are used in production with automated drift detection
Get a Security Audit for Your AI Platform
We run the same 8-gate security analysis on client AI deployments. 2-week engagement, full report with prioritized findings and CI/CD integration templates.
Related Articles
- How Night Shift Runs 300+ Tasks Autonomously — architecture deep dive including constitutional merge gates
- Why Our AI Agent’s Quality Dropped 31% — root cause analysis with quality assessment calibration
- From 0 to 3,000 Tests — building quality and security into AI-generated code
- Autonomous AI Systems: The LivingCorp Paradigm — the operational model that requires these security gates
- NEXUS Platform — the AI-native platform with 5-tier security built in