cidrmerge: The 50-Line Python Script That Cleans Up Network ACLs
Hook
A single misconfigured firewall rule with overlapping CIDR blocks can multiply your attack surface—and most network admins won't even notice until an audit flags hundreds of redundant ACL entries.
Context
Network configurations accumulate cruft over time. As teams add firewall rules, route tables, and access control lists, CIDR ranges overlap and duplicate. You might allowlist 10.0.0.0/24, then later add 10.0.1.0/24, and eventually someone adds 10.0.0.0/22, which encompasses both. Now you're maintaining three rules where one would suffice. This isn't just aesthetic: some cloud providers charge per firewall rule, audits become painful, and rule evaluation can slow down as lists grow.
Traditional solutions involve manual calculation or heavyweight libraries. Network engineers often resort to spreadsheets or online calculators, copying and pasting blocks one at a time. Developers reach for comprehensive Python libraries like netaddr, but then need to write wrapper scripts for basic consolidation tasks. cidrmerge fills the gap as a purpose-built Unix-style tool: read CIDR blocks from stdin, output the minimal equivalent set to stdout. No configuration files, no GUI, no ceremony: just efficient subnet aggregation that slots into existing automation pipelines.
Technical Insight
At its core, cidrmerge implements subnet aggregation through binary prefix comparison. The algorithm converts each CIDR block to its binary network representation, sorts them numerically, then walks through the sorted list identifying mergeable ranges. Two CIDR blocks merge if one fully contains the other, or if they are same-size siblings: the two halves of a single block whose prefix is one bit shorter. Adjacency alone is not enough, because the combined range must also start on the larger block's boundary.
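The sort-and-scan approach described above can be sketched in a few lines with the standard ipaddress module. This is an illustration under assumptions, not cidrmerge's actual source: the function name `merge_cidrs` is hypothetical, and the stack-based sibling merge is one of several ways to implement the walk.

```python
import ipaddress

def merge_cidrs(blocks):
    """Return the minimal list of CIDR strings covering the same addresses.

    Hypothetical helper for illustration; not taken from cidrmerge itself.
    """
    # Networks sort by address, then by mask, so a covering block
    # always appears before the blocks it contains.
    nets = sorted(ipaddress.ip_network(b) for b in blocks)
    merged = []
    for net in nets:
        # Containment (and exact duplicates): already covered, drop it.
        if merged and net.subnet_of(merged[-1]):
            continue
        merged.append(net)
        # Sibling merge: two same-size halves of one parent collapse into
        # the parent, which may in turn pair up with the block before it.
        while len(merged) >= 2:
            a, b = merged[-2], merged[-1]
            if a.prefixlen == b.prefixlen and a.supernet() == b.supernet():
                merged[-2:] = [a.supernet()]
            else:
                break
    return [str(n) for n in merged]

print(merge_cidrs(["10.0.0.0/25", "10.0.0.128/25", "10.0.1.0/24"]))
# ['10.0.0.0/23']
```

The standard library's `ipaddress.collapse_addresses` performs the same aggregation, which makes it a convenient cross-check for a hand-written version like this one.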
Here's a practical example of how cidrmerge handles a realistic scenario:
# Create a messy list of overlapping ranges
cat << EOF | python -m cidrmerge
192.168.1.0/24
192.168.2.0/24
192.168.0.0/22
10.0.0.0/16
10.0.1.0/24
172.16.5.32/27
172.16.5.64/27
EOF
# Output:
# 10.0.0.0/16
# 172.16.5.32/27
# 172.16.5.64/27
# 192.168.0.0/22
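Notice how 192.168.1.0/24 and 192.168.2.0/24 disappeared: they're subsumed by 192.168.0.0/22, which covers 192.168.0.0 through 192.168.3.255. Similarly, 10.0.1.0/24 vanished into the larger 10.0.0.0/16 block. The /27 ranges stayed separate even though they are contiguous (172.16.5.32/27 ends at .63 and 172.16.5.64/27 starts at .64). A /26 must begin on a multiple of 64, so the combined span .32 through .95 is not a valid CIDR block: the two /27s belong to different parents, 172.16.5.0/26 and 172.16.5.64/26.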
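A quick check with Python's standard ipaddress module shows why the two /27 blocks in the example stay separate while a differently aligned pair would merge. The third block, 172.16.5.96/27, is not part of the example above and is added here only for contrast.

```python
import ipaddress

a = ipaddress.ip_network("172.16.5.32/27")
b = ipaddress.ip_network("172.16.5.64/27")

# The blocks are contiguous: a ends exactly where b begins.
print(a.broadcast_address, b.network_address)  # 172.16.5.63 172.16.5.64

# But their /26 parents differ, so no single prefix covers exactly both.
print(a.supernet(), b.supernet())  # 172.16.5.0/26 172.16.5.64/26
unaligned = [str(n) for n in ipaddress.collapse_addresses([a, b])]
print(unaligned)  # ['172.16.5.32/27', '172.16.5.64/27']

# Contrast (extra block, for illustration): .64/27 and .96/27 are siblings.
c = ipaddress.ip_network("172.16.5.96/27")
aligned = [str(n) for n in ipaddress.collapse_addresses([b, c])]
print(aligned)  # ['172.16.5.64/26']
```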
The tool's strict validation is architecturally significant. Many CIDR tools accept "192.168.1.5/24" and silently normalize it to "192.168.1.0/24". cidrmerge rejects this outright, requiring that network addresses have all host bits zeroed according to the subnet mask. This design choice prevents a common source of configuration errors where someone copies a host IP from a ping command and adds a /24 suffix, thinking they've specified the correct network. In automation contexts, this fail-fast behavior catches upstream data quality issues before they propagate into production network configs.
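The difference between the two behaviors is easy to see with the standard ipaddress module, whose `ip_network` function is strict by default and normalizes only when `strict=False` is passed. The sketch below shows the general technique; whether cidrmerge performs its check exactly this way is an assumption.

```python
import ipaddress

host_with_prefix = "192.168.1.5/24"  # a host address with a /24 suffix

# Strict parsing (the default) refuses input whose host bits are set.
try:
    ipaddress.ip_network(host_with_prefix)
    rejected = False
except ValueError as exc:
    rejected = True
    print(f"rejected: {exc}")

# Lenient parsing silently masks the host bits away.
normalized = ipaddress.ip_network(host_with_prefix, strict=False)
print(normalized)  # 192.168.1.0/24
```

The lenient result looks harmless here, but the caller never learns that the input was a host address, which is exactly the failure mode the strict check guards against.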
The implementation strategy prioritizes correctness over performance. Rather than implementing complex interval trees or tries, cidrmerge uses Python's built-in ipaddress module for parsing and validation, then relies on sorting and linear scanning. For most real-world use cases—processing hundreds or even thousands of ACL entries—this completes in milliseconds. The simplicity means the codebase remains auditable, with logic that network engineers can verify without deep computer science knowledge.
Integration into DevOps workflows is straightforward. Consider a Terraform workflow that generates security group rules from multiple sources:
#!/bin/bash
# Collect CIDR blocks from various sources
# ([]? skips resources that have no cidr_blocks attribute instead of erroring)
jq -r '.resources[].instances[].attributes.cidr_blocks[]?' terraform-state.json > /tmp/cidrs.txt
cat approved-ranges.txt vendor-ips.txt >> /tmp/cidrs.txt
# Deduplicate and merge
sort -u /tmp/cidrs.txt | python -m cidrmerge > consolidated-ranges.txt
# Generate Terraform variable values
# (a .tfvars file holds assignments; the variable block itself belongs in a .tf file)
echo 'allowed_ranges = [' > ranges.auto.tfvars
awk '{print "  \"" $1 "\","}' consolidated-ranges.txt >> ranges.auto.tfvars
echo ']' >> ranges.auto.tfvars
This pipeline prevents the common anti-pattern of security groups with 50+ redundant rules when 10 would suffice. It also makes drift detection easier—when you can represent allowlisted ranges in their canonical minimal form, comparing current state to desired state becomes a simple text diff rather than a complex subset calculation.
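To illustrate the drift-detection point, here is a small sketch that reduces two lists to canonical form and compares them as text. It uses the standard library's `ipaddress.collapse_addresses` as a stand-in for cidrmerge, and the helper name `canonical` and the sample ranges are made up for the example.

```python
import difflib
import ipaddress

def canonical(blocks):
    """Collapse blocks to their minimal form, one CIDR string per entry."""
    nets = [ipaddress.ip_network(b) for b in blocks]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

desired = canonical(["10.0.0.0/24", "10.0.1.0/24"])
same_coverage = canonical(["10.0.0.0/23"])
with_extra = canonical(["10.0.0.0/23", "10.0.9.0/24"])

# Differently written but equivalent lists compare equal once canonical.
print(desired == same_coverage)  # True

# Real drift shows up as an ordinary line-level diff.
drift = list(difflib.unified_diff(desired, with_extra, "desired", "current", lineterm=""))
print("\n".join(drift))
```

Without the canonical step, the first comparison would report a difference even though both lists allow exactly the same addresses.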
The tool's stdin/stdout design enables composition with other Unix utilities. Combining it with grep for filtering, comm for set operations, or diff for change detection creates powerful network configuration workflows without writing complex scripts. This architectural philosophy—doing one thing well and integrating through standard streams—makes cidrmerge more versatile than monolithic network management suites.
Gotcha
The IPv4-only limitation is the most immediate constraint. As organizations adopt IPv6, often running dual-stack configurations, you'll need separate tooling for IPv6 ranges. There's no technical reason the same aggregation algorithm couldn't handle IPv6 (it's the same binary prefix logic with longer addresses), but cidrmerge simply doesn't implement it. If your infrastructure spans both IP versions, you'll pipe IPv4 ranges through cidrmerge and route IPv6 ranges through aggregate6, or write custom scripts with the ipaddress library.
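For the custom-script route, a minimal sketch with the ipaddress library might split the input by IP version and collapse each family on its own, since `ipaddress.collapse_addresses` raises a TypeError when given mixed versions. The function name `collapse_by_version` is hypothetical.

```python
import ipaddress

def collapse_by_version(blocks):
    """Return (ipv4_list, ipv6_list), each collapsed to its minimal form."""
    v4, v6 = [], []
    for text in blocks:
        net = ipaddress.ip_network(text)
        (v4 if net.version == 4 else v6).append(net)
    # Each family is merged separately; mixing versions would raise TypeError.
    return (
        [str(n) for n in ipaddress.collapse_addresses(v4)],
        [str(n) for n in ipaddress.collapse_addresses(v6)],
    )

ipv4, ipv6 = collapse_by_version([
    "10.0.0.0/24", "10.0.1.0/24",
    "2001:db8::/33", "2001:db8:8000::/33",
])
print(ipv4)  # ['10.0.0.0/23']
print(ipv6)  # ['2001:db8::/32']
```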
The strict validation, while valuable for catching errors, becomes frustrating when working with dirty data sources. If you're consuming CIDR lists from external vendors or legacy systems that occasionally include host addresses with CIDR notation, cidrmerge will halt on the first invalid entry. You can't tell it to skip malformed lines or auto-correct them—it's all or nothing. Workarounds involve pre-processing with sed or awk to normalize inputs, but this adds complexity and the risk of silently accepting incorrect ranges. A "--strict" flag with a permissive default mode would better serve diverse use cases.
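One way to keep such pre-processing from hiding problems is to normalize leniently but record every change and every skipped line, so a human can review what was altered before the result is fed to a strict tool. The sketch below is one possible shape for that step; `normalize_lines` is a hypothetical helper, not part of cidrmerge.

```python
import ipaddress

def normalize_lines(lines):
    """Return (clean CIDR strings, notes describing each fix or skip)."""
    clean, notes = [], []
    for lineno, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # ignore blank lines
        try:
            # strict=False masks off host bits instead of raising.
            net = ipaddress.ip_network(text, strict=False)
        except ValueError:
            notes.append(f"line {lineno}: skipped unparseable entry {text!r}")
            continue
        if str(net) != text:
            notes.append(f"line {lineno}: normalized {text} -> {net}")
        clean.append(str(net))
    return clean, notes

clean, notes = normalize_lines(["192.168.1.5/24\n", "not-a-cidr\n", "\n", "10.0.0.0/8\n"])
print(clean)  # ['192.168.1.0/24', '10.0.0.0/8']
for note in notes:
    print(note)
```

Note that this sketch also accepts IPv6 entries and bare addresses (which become /32 or /128 networks and are logged as normalized); filter those out first if the downstream tool handles IPv4 CIDR blocks only.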
There's also no option to preserve specific subnet boundaries for policy reasons. Sometimes you want to merge adjacent ranges except when they cross organizational boundaries (dev vs. prod networks) or geographic regions. cidrmerge gives you the mathematical minimum, but real-world network policies often require maintaining certain splits even when aggregation would be technically possible. You'd need to partition your input, run cidrmerge on each partition separately, then combine outputs—manageable but not elegant.
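The partition-then-merge workaround can be sketched as follows, assuming each input range carries a label such as "dev" or "prod". The labels, the `merge_by_partition` name, and the use of `ipaddress.collapse_addresses` in place of cidrmerge are all illustrative assumptions.

```python
import ipaddress
from collections import defaultdict

def merge_by_partition(tagged_blocks):
    """Merge CIDR blocks only within each label.

    tagged_blocks is an iterable of (label, cidr) pairs.
    """
    groups = defaultdict(list)
    for label, cidr in tagged_blocks:
        groups[label].append(ipaddress.ip_network(cidr))
    return {
        label: [str(n) for n in ipaddress.collapse_addresses(nets)]
        for label, nets in groups.items()
    }

result = merge_by_partition([
    ("dev", "10.0.0.0/24"),
    ("prod", "10.0.1.0/24"),
    ("dev", "10.0.2.0/24"),
    ("dev", "10.0.3.0/24"),
])
print(result)
# {'dev': ['10.0.0.0/24', '10.0.2.0/23'], 'prod': ['10.0.1.0/24']}
```

A single global merge of the same four ranges would produce 10.0.0.0/22 and erase the dev/prod split, which is the outcome the partitioning avoids.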
Verdict
Use if: You need reliable IPv4 CIDR consolidation in shell scripts, CI/CD pipelines, or network automation where strict validation catches configuration mistakes before they reach production. Its simplicity is a feature when you want predictable behavior without configuration complexity, and the Unix-style interface integrates cleanly into existing toolchains. Perfect for cleaning up firewall rules, optimizing cloud security groups, or maintaining canonical representations of IP allowlists.
Skip if: You're working with IPv6, need flexible error handling for dirty data sources, or require policy-aware merging that preserves certain subnet boundaries. In those cases, use netaddr programmatically for full control, or aggregate6 if you need both IP versions with similar CLI simplicity.