CloudMapper: Analyzing AWS Environments Without Burning Through Your API Rate Limits
Hook
Most AWS security tools hammer your API endpoints every time you run an analysis. CloudMapper collects once, then lets you interrogate your infrastructure offline as many times as you want—no credentials required after initial collection.
Context
AWS environments grow organically. A startup begins with a single VPC and a handful of EC2 instances, then suddenly you're managing dozens of security groups, tangled IAM policies, and resources spread across multiple regions. Understanding what you actually have deployed becomes nearly impossible through the AWS Console alone.
Traditional cloud security tools solve this by continuously querying AWS APIs, but this approach has costs: API rate limiting, the need for persistent credentials, and difficulty sharing analysis results with team members who lack AWS access. CloudMapper, originally developed by Duo Security (now Cisco), takes a different approach: snapshot your AWS environment once, then analyze it exhaustively offline. This separation of concerns means security auditors can work without AWS credentials, developers can experiment with analysis scripts without triggering throttling errors, and compliance teams can archive point-in-time infrastructure states for later review.
Technical Insight
CloudMapper's architecture revolves around two distinct phases. The collection phase uses boto3 to run AWS describe and list operations across your accounts and regions, then writes each response to disk as JSON files under a local account-data directory. The analysis phase operates entirely on these cached files and never touches AWS APIs. That split enables a workflow that is fundamentally different from real-time monitoring tools.
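The two-phase split can be sketched in a few lines. This is a minimal illustration, not CloudMapper's actual code: the `collect` and `load` helpers, the file naming, and the injected fetcher callables are all assumptions made for the example. In real use each fetcher would wrap a boto3 describe or list call.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterable


def collect(fetchers: Dict[str, Callable[[str], dict]],
            regions: Iterable[str],
            out_dir: Path) -> int:
    """Collection phase: call each fetcher once per region, cache as JSON.

    `fetchers` maps a file stem (hypothetical naming) to a callable that
    takes a region name and returns an API-style response dict.
    Returns the number of files written.
    """
    written = 0
    for region in regions:
        region_dir = out_dir / region
        region_dir.mkdir(parents=True, exist_ok=True)
        for name, fetch in fetchers.items():
            response = fetch(region)
            # default=str keeps datetime values in responses serializable.
            payload = json.dumps(response, default=str, indent=2)
            (region_dir / f"{name}.json").write_text(payload)
            written += 1
    return written


def load(out_dir: Path, region: str, name: str) -> dict:
    """Analysis phase: read only from the local cache, never the network."""
    return json.loads((out_dir / region / f"{name}.json").read_text())
```

Because `load` depends only on the files on disk, anything built on top of it can be rerun as often as needed, by anyone who has the directory, without credentials.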
The collection process is straightforward but comprehensive. After configuring your AWS accounts in config.json, CloudMapper uses the standard AWS credential chain (environment variables, credential files, or IAM roles) to enumerate resources:
# Configure your AWS accounts
python cloudmapper.py configure add-account --config-file config.json \
    --name production --id 123456789012 --default true
# Collect data (this hits AWS APIs)
python cloudmapper.py collect --account production
# Data is now stored locally under account-data/production/, one JSON file per region and API call
# All subsequent analysis runs offline
python cloudmapper.py report --account production
python cloudmapper.py public --account production
python cloudmapper.py iam_report --account production
The collected JSON structure mirrors AWS API responses, which makes the data immediately familiar to anyone who's worked with boto3. For instance, EC2 instance data maintains the same nested structure you'd get from describe_instances(). This design choice means you can write custom analysis scripts using standard JSON parsing libraries without learning a proprietary schema.
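As an illustration of that point, here is a small custom analysis written against the describe_instances() response shape. It is a sketch under assumptions: the `public_instances` helper is hypothetical, and the sample data is inlined so the snippet runs on its own. A real script would json.load() the cached file from your account-data directory instead.

```python
from typing import Iterator, Tuple


def public_instances(describe_instances_response: dict) -> Iterator[Tuple[str, str]]:
    """Yield (instance_id, public_ip) for every instance with a public IP."""
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            public_ip = instance.get("PublicIpAddress")
            if public_ip:
                yield instance["InstanceId"], public_ip


# Inline sample in the same nested shape boto3 returns.
sample = {
    "Reservations": [
        {"Instances": [
            {"InstanceId": "i-0aaa", "PublicIpAddress": "203.0.113.10"},
            {"InstanceId": "i-0bbb"},  # private-only instance
        ]}
    ]
}

print(list(public_instances(sample)))  # [('i-0aaa', '203.0.113.10')]
```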
The network visualization component showcases the offline-first philosophy. Rather than querying AWS in real time to build a topology graph, CloudMapper processes the cached JSON to construct an interactive graph rendered in the browser with Cytoscape.js. The prepare command transforms raw AWS data into a format suitable for web rendering:
# CloudMapper internally does something conceptually similar to this.
# The nested account_data shape is simplified for illustration.
def build_network_graph(account_data):
    """Turn cached VPC and subnet data into graph nodes and edges."""
    nodes = []
    edges = []
    # Process VPCs as top-level containers
    for vpc in account_data['vpcs']:
        nodes.append({
            'id': vpc['VpcId'],
            'type': 'vpc',
            'cidr': vpc['CidrBlock']
        })
        # Map subnets to VPCs
        for subnet in vpc['subnets']:
            nodes.append({
                'id': subnet['SubnetId'],
                'type': 'subnet',
                'az': subnet['AvailabilityZone']
            })
            # Each edge links a subnet back to its parent VPC
            edges.append({
                'source': vpc['VpcId'],
                'target': subnet['SubnetId']
            })
    return {'nodes': nodes, 'edges': edges}
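The sketch above nests subnets under each VPC for readability. Raw API responses are flat: describe_vpcs() and describe_subnets() return separate lists, and each subnet points at its VPC through VpcId. A variant that joins the two, again as an illustration rather than CloudMapper's own code (the `build_graph_from_responses` name is hypothetical), might look like this:

```python
def build_graph_from_responses(vpcs_response: dict, subnets_response: dict) -> dict:
    """Join flat describe_vpcs / describe_subnets responses into a graph."""
    nodes = []
    edges = []
    known_vpcs = set()

    for vpc in vpcs_response.get("Vpcs", []):
        known_vpcs.add(vpc["VpcId"])
        nodes.append({"id": vpc["VpcId"], "type": "vpc", "cidr": vpc["CidrBlock"]})

    for subnet in subnets_response.get("Subnets", []):
        nodes.append({
            "id": subnet["SubnetId"],
            "type": "subnet",
            "az": subnet["AvailabilityZone"],
        })
        # Only draw an edge when the parent VPC is present in the snapshot.
        if subnet["VpcId"] in known_vpcs:
            edges.append({"source": subnet["VpcId"], "target": subnet["SubnetId"]})

    return {"nodes": nodes, "edges": edges}
```

Skipping edges to unknown VPCs keeps the graph valid when a snapshot is partial, for example when one API call failed during collection.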
The IAM analysis commands demonstrate CloudMapper's security-focused capabilities. The find_admins command walks through IAM policies (both inline and managed) to identify principals with administrative access, while stats tallies resource counts per account so you can see how an environment is changing. These analyses become particularly powerful when combined with the offline model: you can run experimental queries, refine your detection logic, and re-run analyses instantly without worrying about rate limits.
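To make the idea of an admin check concrete, here is a deliberately simplified sketch. It is not the logic find_admins uses; the `grants_full_admin` helper is hypothetical and only flags the most obvious case, an Allow statement with `"*"` for both Action and Resource. It ignores conditions, NotAction, and privilege-escalation paths such as broad iam: permissions.

```python
def _as_list(value):
    """IAM policy fields may be a single string or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def grants_full_admin(policy_document: dict) -> bool:
    """Return True if any Allow statement grants Action "*" on Resource "*"."""
    for statement in _as_list(policy_document.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        if "*" in actions and "*" in resources:
            return True
    return False


admin_like = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
}
read_only = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "*"}],
}
```

Running this kind of function over cached policy documents is cheap, so detection logic can be tightened one iteration at a time.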
CloudMapper also supports a plugin architecture for custom analysis commands. By dropping Python files into the private_commands directory, teams can encode organization-specific compliance checks. For example, you might write a command that verifies all production EC2 instances have required tags, or that S3 buckets follow naming conventions. Since these scripts operate on cached data, you can iterate on them rapidly during development.
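A required-tags check of the kind described here could look roughly like the following. Treat the module layout as an assumption: the `__description__` string and `run(arguments)` entry point follow the convention recalled from CloudMapper's built-in commands and should be confirmed against the repository. The `--instances-file` flag, `REQUIRED_TAGS`, and `missing_tags` are hypothetical names invented for this sketch.

```python
import argparse
import json

# Assumed convention for command modules; verify against the repository.
__description__ = "Report EC2 instances that are missing required tags"

REQUIRED_TAGS = {"Owner", "Environment"}  # hypothetical organization policy


def missing_tags(describe_instances_response, required=REQUIRED_TAGS):
    """Map instance ID to the sorted list of required tag keys it lacks."""
    findings = {}
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            present = {tag["Key"] for tag in instance.get("Tags", [])}
            absent = set(required) - present
            if absent:
                findings[instance["InstanceId"]] = sorted(absent)
    return findings


def run(arguments):
    """Entry point: parse args, load a cached JSON file, print findings."""
    parser = argparse.ArgumentParser(description=__description__)
    parser.add_argument("--instances-file", required=True,
                        help="Path to a cached describe_instances JSON file")
    args = parser.parse_args(arguments)
    with open(args.instances_file) as handle:
        findings = missing_tags(json.load(handle))
    for instance_id, absent in sorted(findings.items()):
        print(f"{instance_id}: missing {', '.join(absent)}")
    return findings
```

Keeping the check in a pure function (`missing_tags`) separate from argument parsing makes it easy to unit test against a saved snapshot.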
The HTML report generation is particularly valuable for compliance workflows. The report command produces a comprehensive static website documenting security groups, public resources, IAM issues, and more. Because it's static HTML, you can archive these reports with timestamps, email them to stakeholders without AWS access, or check them into version control as infrastructure documentation. This addresses a real pain point: demonstrating compliance often requires generating artifacts that auditors can review independently.
Gotcha
CloudMapper's snapshot-based approach is both its greatest strength and its fundamental limitation. The tool operates on point-in-time data, which means it cannot detect changes happening in real-time. If someone spins up a publicly accessible RDS instance five minutes after you run collect, CloudMapper won't know about it until you collect again. This makes it unsuitable as a continuous monitoring solution or for alerting on security events as they occur. You're essentially trading real-time awareness for analytical flexibility.
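If you collect on a schedule, comparing consecutive snapshots narrows the gap somewhat, though it still only catches changes at collection time. The sketch below assumes two cached responses in the describe_db_instances() shape; the `newly_public_rds` helper is hypothetical and not part of CloudMapper.

```python
def newly_public_rds(old_snapshot: dict, new_snapshot: dict) -> list:
    """Return RDS identifiers that are public now but were not public before.

    Covers both instances created since the old snapshot and existing
    instances whose PubliclyAccessible flag was switched on.
    """
    def public_ids(snapshot: dict) -> set:
        return {
            db["DBInstanceIdentifier"]
            for db in snapshot.get("DBInstances", [])
            if db.get("PubliclyAccessible")
        }

    return sorted(public_ids(new_snapshot) - public_ids(old_snapshot))


monday = {"DBInstances": [
    {"DBInstanceIdentifier": "orders", "PubliclyAccessible": False},
]}
tuesday = {"DBInstances": [
    {"DBInstanceIdentifier": "orders", "PubliclyAccessible": True},
    {"DBInstanceIdentifier": "scratch", "PubliclyAccessible": True},
    {"DBInstanceIdentifier": "billing", "PubliclyAccessible": False},
]}

print(newly_public_rds(monday, tuesday))  # ['orders', 'scratch']
```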
The installation experience can be frustrating, particularly on macOS. CloudMapper depends on pyjq (Python bindings for jq), which requires compiling C extensions. This means installing autoconf, automake, and libtool via Homebrew before pip install succeeds. On Linux systems, you'll need the equivalent development packages. The documentation mentions these requirements, but developers accustomed to pure-Python tools may be surprised by the compilation step. Additionally, the network visualization relies on a local web server (the webserver command), which can complicate deployment in environments where running arbitrary web services violates security policy. For very large AWS environments with thousands of resources, the Cytoscape.js-based visualization can become sluggish or even crash browser tabs, though the HTML reports remain functional.
Verdict
Use CloudMapper if you need periodic security audits of AWS environments, want to generate point-in-time compliance reports, or have team members who need to analyze infrastructure without AWS credentials. It excels when you're investigating security issues and need to run multiple experimental queries against the same data snapshot, or when you're documenting infrastructure state for auditors. The offline-first design makes it ideal for consulting scenarios where you collect client data once and analyze it back at your office. Skip it if you need real-time monitoring and alerting—use AWS Config, GuardDuty, or Security Hub instead. Also skip it if your organization prohibits local data storage of AWS metadata (some compliance frameworks require analysis to happen entirely within the cloud provider's boundary), or if you're managing such massive multi-account environments that the collection process itself becomes a maintenance burden. And critically: before investing effort, verify you're using the official duo-labs/cloudmapper repository rather than this 12-star fork, as the upstream version receives active maintenance and has a larger community for support.