Bandit: How OpenStack's AST-Powered Security Scanner Catches Python Vulnerabilities Before Deployment

Hook

A single hardcoded password in a Python codebase took down a major cloud provider for 6 hours in 2014. OpenStack's Security Group built Bandit to ensure it would never happen again—by analyzing your code's structure, not just its text.

Context

Before AST-based security tools, Python developers relied on grep patterns and regular expressions to find security issues—an approach that fails spectacularly with Python's dynamic nature. You could search for "password =" but miss getattr(config, 'pass' + 'word'). You could flag eval() calls but not understand whether the input was sanitized. OpenStack, managing millions of lines of Python across hundreds of repositories powering critical cloud infrastructure, needed something smarter.

The OpenStack Security Group created Bandit in 2014 as a response to recurring security incidents in their massive Python ecosystem. Traditional linters like Pylint caught bugs, but finding security vulnerabilities requires domain-specific knowledge: understanding that pickle.loads() on untrusted data enables remote code execution, or that random.random() shouldn't generate authentication tokens. Bandit was designed to embed this security expertise into an automated tool that understood Python's syntactic structure, not just its raw text. It became a widely used security scanner for Python projects, eventually moving to the Python Code Quality Authority (PyCQA) organization, where it's actively maintained today.

Technical Insight

Bandit's power comes from analyzing Python's Abstract Syntax Tree rather than raw text. When you run Bandit, it uses Python's built-in ast module to parse source files into a tree that represents the code's syntactic structure. Each node in this tree (imports, function calls, assignments) gets examined by specialized plugins that understand security implications.
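To make the difference from text matching concrete, here is a minimal sketch that uses only the standard ast module (it is not Bandit's code). The helper name find_calls and the sample source are made up for illustration, and ast.unparse assumes Python 3.9 or newer. A text search for "subprocess.call" would match three lines of the sample; the AST contains exactly one call.

```python
import ast

# Illustrative source: one real call, one commented-out call, one string.
SOURCE = '''
import subprocess
subprocess.call("ls " + path, shell=True)
# subprocess.call("in a comment", shell=True)
note = "subprocess.call(x, shell=True)"
'''


def find_calls(source):
    """Return (line, callee) for every call expression in the source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            # ast.unparse turns the callee node back into source text.
            found.append((node.lineno, ast.unparse(node.func)))
    return found


calls = find_calls(SOURCE)
print(calls)
```

Comments never reach the tree, and the string on the last line is just a constant value, so neither is reported as a call.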

Here's a concrete example of what Bandit catches that regex-based tools miss:

# Code that looks innocent to a regex scanner
import subprocess

def deploy_service(service_name):
    # Vulnerable: shell injection via user input
    subprocess.call(f"systemctl restart {service_name}", shell=True)
    
    # Also vulnerable: different syntax, same problem
    cmd = "systemctl restart " + service_name
    subprocess.call(cmd, shell=True)
    
    # Safe: proper argument passing
    subprocess.call(["systemctl", "restart", service_name])

Bandit's subprocess_popen_with_shell_equals_true plugin examines the AST nodes for subprocess calls, identifies when shell=True is set, and checks if the command includes string interpolation or concatenation with variables. It reports this as a high-severity issue because any user-controlled service_name could inject commands like "nginx; rm -rf /".
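The idea can be approximated in a few lines with the standard ast module. This is a simplified sketch under stated assumptions, not Bandit's implementation: the function name shell_true_findings and the LOW/HIGH labels are illustrative, only subprocess.call and subprocess.Popen written with their full dotted names are recognized, and ast.unparse assumes Python 3.9 or newer.

```python
import ast

SHELL_FUNCS = {"subprocess.call", "subprocess.Popen"}


def shell_true_findings(source):
    """Return sorted (line, rating) pairs for subprocess calls using shell=True."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        if ast.unparse(node.func) not in SHELL_FUNCS:
            continue
        uses_shell = any(
            kw.arg == "shell"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not uses_shell or not node.args:
            continue
        first = node.args[0]
        # A plain string literal cannot carry injected input; an f-string
        # (ast.JoinedStr) or concatenation (ast.BinOp) might.
        literal = isinstance(first, ast.Constant) and isinstance(first.value, str)
        findings.append((node.lineno, "LOW" if literal else "HIGH"))
    return sorted(findings)


SAMPLE = '''
import subprocess
subprocess.call(f"systemctl restart {name}", shell=True)
subprocess.call("systemctl restart " + name, shell=True)
subprocess.call("systemctl restart nginx", shell=True)
subprocess.call(["systemctl", "restart", name])
'''

print(shell_true_findings(SAMPLE))
```

The f-string and the concatenation are rated HIGH, the fixed literal is rated LOW, and the list form without shell=True is not reported at all.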

The plugin architecture is elegant and extensible. Each security check is a Python function decorated with metadata:

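import ast

import bandit
from bandit.core import test_properties as test

# Simplified sketch of a B602-style check, not Bandit's actual source.
@test.checks('Call')
@test.test_id('B602')
def subprocess_popen_with_shell_equals_true(context):
    if context.call_function_name_qual in [
        'subprocess.Popen', 'subprocess.call'
    ]:
        if context.check_call_arg_value('shell', 'True'):
            # A plain string literal as the command is lower risk than an
            # f-string or a concatenation built from variables.
            args = context.node.args
            literal = bool(args) and isinstance(args[0], ast.Constant)
            return bandit.Issue(
                severity=bandit.LOW if literal else bandit.HIGH,
                confidence=bandit.HIGH,
                text="subprocess call with shell=True identified"
            )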

The context object provides high-level methods to interrogate the AST node—what function is being called, what arguments are passed, whether variables are involved. This abstraction means plugin authors don't need to manually traverse AST nodes; they describe security patterns declaratively.

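Bandit rates every finding on two independent axes: severity (LOW, MEDIUM, HIGH) and confidence (LOW, MEDIUM, HIGH). Severity estimates how damaging the issue would be if it were real. The assert_used check (B101) is LOW severity: assert statements are stripped when Python runs with optimizations enabled, so a safety check written as an assert can silently disappear. A subprocess call with shell=True and a non-literal command is HIGH severity. Confidence reflects how sure Bandit is that the finding is a real problem rather than a false positive. Because Bandit does not track where a value comes from, it cannot tell whether an argument is actually user-controlled. Heuristic checks such as hardcoded password detection, which guess from variable names, therefore carry lower confidence than checks that match an exact call.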
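Treating the two ratings as independent thresholds can be sketched as follows. This is an illustration of the filtering idea only: the RANK table, the at_or_above helper, and the sample findings are hypothetical and are not Bandit data structures or real Bandit output.

```python
RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}

# Hypothetical findings for illustration; not real Bandit output.
FINDINGS = [
    {"check": "assert in library code", "severity": "LOW", "confidence": "HIGH"},
    {"check": "shell=True with f-string", "severity": "HIGH", "confidence": "HIGH"},
    {"check": "possible hardcoded password", "severity": "LOW", "confidence": "MEDIUM"},
]


def at_or_above(findings, min_severity="LOW", min_confidence="LOW"):
    """Keep findings that meet both the severity and the confidence threshold."""
    return [
        f
        for f in findings
        if RANK[f["severity"]] >= RANK[min_severity]
        and RANK[f["confidence"]] >= RANK[min_confidence]
    ]


reportable = at_or_above(FINDINGS, min_severity="HIGH")
print([f["check"] for f in reportable])
```

Raising either threshold shrinks the report, which is the usual way to start strict on a noisy codebase and widen coverage later.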

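The configuration system lets you tune which checks run for your project. A YAML configuration file, passed with -c (for example, bandit -c bandit.yaml -r .), can skip specific tests, restrict the run to a chosen set of tests, or exclude directories. Severity and confidence thresholds are set on the command line rather than in this file: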

exclude_dirs:
  - /test
  - /venv
skips:
  - B101  # Skip assert_used check for projects using pytest
tests:
  - B201  # Only run the flask_debug_true check

Integration with CI/CD is straightforward. Bandit exits with status code 1 when it finds issues above your configured threshold, failing builds automatically:

# Fail build on any HIGH severity issues (-ll would also include MEDIUM)
bandit -r . -lll

# Generate JSON report for dashboard tools
bandit -r . -f json -o bandit-report.json

The JSON output is particularly valuable for security dashboards and tracking remediation over time. Each issue includes the filename, line number, severity, confidence, and a CWE (Common Weakness Enumeration) identifier linking to detailed vulnerability documentation.
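A dashboard script might consume that report along these lines. This is a minimal sketch: the field names (results, filename, line_number, issue_severity, issue_confidence, test_id) are assumed to match the JSON produced by recent Bandit releases, so check them against a report from your own version. The sample report and the summarize helper are made up for illustration.

```python
import json
from collections import Counter

# Illustrative stand-in for the contents of bandit-report.json.
SAMPLE_REPORT = json.dumps({
    "results": [
        {"filename": "app/deploy.py", "line_number": 12, "test_id": "B602",
         "issue_severity": "HIGH", "issue_confidence": "HIGH"},
        {"filename": "app/util.py", "line_number": 40, "test_id": "B101",
         "issue_severity": "LOW", "issue_confidence": "HIGH"},
        {"filename": "app/util.py", "line_number": 57, "test_id": "B101",
         "issue_severity": "LOW", "issue_confidence": "HIGH"},
    ]
})


def summarize(report_text):
    """Count findings per severity level in a Bandit-style JSON report."""
    report = json.loads(report_text)
    return Counter(item["issue_severity"] for item in report.get("results", []))


summary = summarize(SAMPLE_REPORT)
print(dict(summary))
```

In practice the text would come from reading the report file, and the counts could be stored per build to track remediation over time.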

Gotcha

The elephant in the room: this repository is archived. Bandit moved from OpenStack to PyCQA in 2018, and this openstack-archive/bandit repository hasn't received updates since then. Using this version means missing years of security improvements, new vulnerability checks, and support for current Python versions. Install the maintained PyCQA/bandit instead, from PyPI (pip install bandit) or GitHub.

Even the actively maintained version has limitations inherent to its style of static analysis. Bandit does not perform taint tracking or data-flow analysis: it inspects each file's AST independently, so if you sanitize user input in one module and use it in another, Bandit may still flag the usage as vulnerable. It struggles with dynamic Python features like getattr(), __import__(), or decorator-modified functions, where the actual behavior isn't clear from the AST alone. False positives are common enough that teams often spend time tuning exclusions rather than fixing issues. Bandit also does not use type hints, so it cannot determine whether a variable contains user input or a safe constant. Finally, Bandit only scans the Python files you point it at; security issues in dependencies, configuration files, or infrastructure code require different tools.
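The trouble with dynamic features is easy to reproduce with the standard ast module. The sketch below is not Bandit's code, and callee_name is a hypothetical helper. It shows that a direct call exposes a dotted name that a check can match against, while a getattr-based call to the same function exposes nothing to match.

```python
import ast


def callee_name(call):
    """Return the dotted callee name, or None when it is computed at runtime."""
    parts = []
    node = call.func
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None  # e.g. the callee is itself the result of another call


direct = ast.parse("subprocess.call(cmd, shell=True)").body[0].value
dynamic = ast.parse('getattr(subprocess, "ca" + "ll")(cmd, shell=True)').body[0].value

print(callee_name(direct))
print(callee_name(dynamic))
```

Both expressions run the same function at runtime, but only the first has a name in the tree, so a name-based check sees only the first.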

Verdict

Use if: You're building any Python application that handles user input, processes sensitive data, or runs in production. Bandit should be mandatory in CI/CD pipelines alongside unit tests: it's free insurance against common security mistakes like SQL injection, command injection, and cryptographic weaknesses. It's especially critical for web applications (Django, Flask, FastAPI), APIs, and data processing pipelines where vulnerabilities have direct business impact. Use it even if you have senior developers; security expertise doesn't eliminate human error during late-night debugging sessions.

Skip if: You're writing throwaway scripts, pure data science notebooks with no external input, or internal tooling that never processes untrusted data. Also skip if you need comprehensive security testing: Bandit is one layer in defense-in-depth, not a complete solution. It won't catch authorization logic bugs, runtime configuration errors, or dependency vulnerabilities (use Safety or Snyk for those). And absolutely skip the archived OpenStack repository; install from PyCQA instead.
