gnmapper: Converting Nmap's Greppable Output to CSV with 50 Lines of Bash
Hook
While thousands of developers build complex Nmap parsers in Python, Ruby, and Go, some penetration testers swear by a 50-line bash script that does exactly one thing: convert gnmap to CSV in milliseconds without installing a single dependency.
Context
Nmap, the network scanner that powers security assessments worldwide, has three main output formats: normal (human-readable, -oN), XML (machine-parseable, -oX), and grepable (-oG, conventionally saved with a .gnmap extension). The XML format gets most of the attention, and parsers for it exist in most popular languages. But during live penetration tests, especially in restricted environments, the gnmap format shines. It is a line-oriented format designed for Unix text-processing tools like grep, awk, and sed.
The problem? While gnmap is perfect for quick command-line searches during reconnaissance, it's terrible for reporting. Security professionals need to pivot their findings into spreadsheets, databases, and ticketing systems, all of which readily import CSV. You could write gnmap output to a file, manually grep for patterns, and copy-paste into Excel. Or you could install a full-featured Python parser with virtualenvs and pip dependencies on a minimal penetration testing distribution. gnmapper exists because sometimes you just need a shell script that runs anywhere bash and gawk exist, streams through large scan files without loading them into memory, and outputs clean CSV without ceremony.
Technical Insight
gnmapper's architecture is deceptively simple: it's a bash wrapper around a carefully crafted gawk script that understands gnmap's line structure. Apart from comment lines starting with #, every line in a gnmap file begins with a Host: field, followed by tab-separated labeled fields such as Status:, Ports:, and Ignored State:. The script exploits this predictability to extract values with awk's field splitting.
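Here's how gnmap represents a host with three open ports (the Host:, Ports:, and Ignored State: fields are separated by tab characters, which may display as spaces):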
Host: 192.168.1.100 () Ports: 22/open/tcp//ssh///, 80/open/tcp//http///, 443/open/tcp//https/// Ignored State: closed (997)
gnmapper transforms this into CSV rows where each port gets its own line:
IP,Port,Protocol,State,Service
192.168.1.100,22,tcp,open,ssh
192.168.1.100,80,tcp,open,http
192.168.1.100,443,tcp,open,https
The core parsing logic relies on awk's split() function to apply a different separator at each level of nesting. The script first splits each line on tabs to isolate the Ports field, then splits that field into port entries on comma-space, and finally splits each entry on forward slashes. All of this happens in a single pass over the input:
gawk 'BEGIN { FS="\t"; OFS="," }
/Ports:/ {
    # Extract IP from "Host: 192.168.1.100 (hostname)"
    split($1, host, " ")
    ip = host[2]
    # Extract ports section after "Ports: "
    sub(/.*Ports: /, "", $2)
    # Split multiple ports by comma-space; split() returns the entry count
    n = split($2, ports, ", ")
    # Process each port in input order: "22/open/tcp//ssh///"
    # (a for-in loop would visit the entries in an unspecified order)
    for (i = 1; i <= n; i++) {
        split(ports[i], details, "/")
        # details[1]=port, details[2]=state, details[3]=protocol, details[5]=service
        print ip, details[1], details[3], details[2], details[5]
    }
}' "$@"
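A quick way to sanity-check the parsing logic is to feed it one synthetic line. The sketch below is an illustration, not gnmapper's actual source: the function name gnmap_to_csv is hypothetical, it calls plain awk (only standard awk features are used here; substitute gawk to mirror the original), and it copies the Ports field into a variable rather than modifying $2.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the excerpt above; not gnmapper's real source.
gnmap_to_csv() {
  awk 'BEGIN { FS = "\t"; OFS = "," }
  /^Host:/ && /Ports:/ {
    split($1, host, " ")            # "Host: 192.168.1.100 ()" -> host[2] is the IP
    ports_field = $2
    sub(/^Ports: /, "", ports_field)
    n = split(ports_field, ports, ", ")
    for (i = 1; i <= n; i++) {      # counted loop keeps input order
      split(ports[i], d, "/")       # port/state/protocol/owner/service/...
      print host[2], d[1], d[3], d[2], d[5]
    }
  }' "$@"
}

# Synthetic gnmap line; printf expands \t into the real tab separators.
printf 'Host: 192.168.1.100 ()\tPorts: 22/open/tcp//ssh///, 80/open/tcp//http///\tIgnored State: closed (998)\n' |
  gnmap_to_csv
```

With no file arguments the function reads standard input, so the same helper works both in pipelines and on saved scan files.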
This approach demonstrates why gawk remains relevant for security tools: it processes text as a stream with minimal overhead, never loads an entire file into memory, and is installed or one package away on virtually every Linux system (Debian, for example, ships mawk as its default awk, so gawk may need installing there). The script handles multiple input files by passing all of its arguments directly to gawk, which processes them sequentially. That makes it trivial to process entire directories of scan results.
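The sequential multi-file behavior is easy to observe, because awk exposes the name of the file it is currently reading in the built-in FILENAME variable. As a small sketch (the function name count_port_lines is hypothetical and not part of gnmapper), this counts how many host lines carrying a Ports field each input file contributes:

```shell
#!/usr/bin/env bash
# Hypothetical helper: per-file count of host lines that have a Ports field.
count_port_lines() {
  awk '/^Host:/ && /Ports:/ { seen[FILENAME]++ }
       END { for (f in seen) print seen[f], f }' "$@" | sort -k2
}

# Usage (paths are placeholders):
#   count_port_lines scans/dmz.gnmap scans/internal.gnmap
```

Files that contain no matching lines do not appear in the output, which can itself be a useful hint that a scan found nothing or was cut short.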
The design choice to output to stdout rather than writing files directly is quintessentially Unix. It enables powerful pipelines during penetration tests:
# Process all scans from the last week (-exec ... {} + is safe for filenames with spaces)
find ./scans -name "*.gnmap" -mtime -7 -exec gnmapper {} + > recent_findings.csv
# Extract only web services
gnmapper scan.gnmap | grep -E ",(80|443|8080|8443)," > web_services.csv
# Count services across multiple scans
gnmapper *.gnmap | cut -d, -f5 | sort | uniq -c | sort -rn
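The grep pattern for web services works because the port number is bracketed by commas, but matching on the column itself is less fragile. A minimal sketch, assuming the five-column layout shown earlier with no quoted commas; the function name filter_ports is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep the header row plus rows whose Port column
# (field 2) is one of the ports given as arguments.
# Usage: gnmapper scan.gnmap | filter_ports 80 443 8080 8443
filter_ports() {
  awk -F, -v wanted="$*" '
    BEGIN { n = split(wanted, w, " "); for (i = 1; i <= n; i++) keep[w[i]] = 1 }
    $2 == "Port" || ($2 in keep)
  '
}
```

Because the comparison is against field 2 only, a value such as 80 appearing in some other column would not cause a false match.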
gnmapper calls gawk specifically rather than any POSIX awk. The excerpt above does not strictly need it: regex separators in split() and the array handling shown are standard awk, so the dependency presumably comes from gawk-only features elsewhere in the script or from the author's preference. Either way it is a pragmatic choice. gawk is packaged for every major Linux distribution where Nmap typically runs, and targeting a single implementation avoids subtle behavioral differences between mawk, BusyBox awk, and gawk.
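Since the script depends on gawk being present, a wrapper could check for it up front and fail with a readable message instead of a bare "command not found". This is a sketch of one way to do that, not something the original script is known to include; require_cmd is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical guard: fail early with a readable message if a tool is missing.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "error: required command '$1' not found in PATH" >&2
    return 1
  fi
}

# Usage at the top of a wrapper script:
#   require_cmd gawk || exit 1
```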
What gnmapper sacrifices in features, it gains in predictability. There's no configuration file to maintain, no version conflicts with system Python, no mystery about what it does. You can audit the entire codebase in five minutes, modify it in-place during an engagement, and trust that it will work the same way five years from now because bash and gawk are among the most stable tools in the Unix ecosystem.
Gotcha
The simplicity that makes gnmapper attractive is also its ceiling. The script provides no filtering options, no custom output formats, and no error handling beyond what bash provides by default. If you feed it malformed gnmap files from an interrupted Nmap scan, you'll get malformed CSV with no warning. There's no way to extract only certain fields or restructure the output without piping through additional tools or modifying the script directly.
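One cheap safeguard against interrupted scans can live outside the script. The sketch below relies on Nmap writing a closing "# Nmap done at ..." comment when a scan finishes normally; treat that footer text as an assumption to verify against your Nmap version. The function name gnmap_is_complete is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical pre-check: succeed only if the last non-empty line of the
# file is the "# Nmap done at ..." footer comment.
gnmap_is_complete() {
  awk 'NF { last = $0 } END { exit !(last ~ /^# Nmap done at /) }' "$1"
}

# Usage:
#   gnmap_is_complete scan.gnmap || echo "warning: scan.gnmap looks truncated" >&2
```

Running this before conversion turns a silent failure into a visible warning without adding any logic to the converter itself.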
More significantly, gnmapper only processes the Ports field of each gnmap line, and only part of it. It ignores the OS: field that OS detection adds and the timing summary in the trailing comment line, and it drops the version string that -sV places in the seventh slash-separated subfield of each port entry. Script output from -sC and traceroute data are not written to gnmap at all, so no gnmap converter can recover them. You get port number, state, protocol, and service name, nothing more. For comprehensive analysis, you're better off using XML output with a full-featured parser like libnmap or nmap-parse-output. The repository's four GitHub stars reflect reality: this is a personal utility script that solves a specific problem, not a community-maintained project with a roadmap and releases.
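If version strings matter, the parsing shown earlier can be extended by one column, since gnmap port entries carry version information in their seventh slash-separated subfield when a scan was run with -sV. This is a sketch of a possible modification, not a gnmapper feature: the function name is hypothetical, it uses plain awk, and it assumes version strings do not contain the comma-space sequence used to separate port entries.

```shell
#!/usr/bin/env bash
# Hypothetical variant: adds a quoted Version column taken from subfield 7.
gnmap_to_csv_with_version() {
  awk 'BEGIN { FS = "\t"; OFS = "," }
  /^Host:/ && /Ports:/ {
    split($1, host, " ")
    ports_field = $2
    sub(/^Ports: /, "", ports_field)
    n = split(ports_field, ports, ", ")
    for (i = 1; i <= n; i++) {
      split(ports[i], d, "/")
      version = d[7]                 # empty when no version was detected
      gsub(/"/, "\"\"", version)     # CSV-escape embedded double quotes
      print host[2], d[1], d[3], d[2], d[5], "\"" version "\""
    }
  }' "$@"
}
```

Quoting the version column keeps spreadsheets from splitting strings that contain commas or parentheses.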
Verdict
Use if: You're conducting penetration tests on minimal Linux environments where installing Python dependencies is impractical or prohibited, you primarily need port/service CSV exports for reporting, you value auditability and want a tool you can fully understand in minutes, or you're building automated scanning pipelines that need lightweight, fast gnmap processing without external dependencies.
Skip if: You need to parse Nmap XML for comprehensive data extraction including version detection and script output, you require cross-platform compatibility beyond Linux, you want built-in filtering or custom output formatting without writing additional shell pipelines, or you're building production security tools that need robust error handling and long-term community support. In those cases, invest in python-nmap or libnmap for a more maintainable solution.