WhatWeb: Ruby-Powered Web Fingerprinting with 1800+ Plugins for Security Reconnaissance
Hook
While most scanners check if a site runs WordPress, WhatWeb examines 15 different signals—from favicon hashes to relative link structures—to tell you which plugins, themes, and exact version are running.
Context
Web application security assessments historically relied on manual inspection and basic banner grabbing. A penetration tester would visit a site, view the source, check headers, and piece together the technology stack through educated guesses. This approach was time-consuming, inconsistent, and missed subtle indicators that could reveal critical vulnerabilities.
WhatWeb emerged as a solution to systematize and scale this reconnaissance process. Rather than memorizing the subtle differences between Drupal 7 and Drupal 8's HTML output, security professionals needed a tool that encoded this knowledge into repeatable tests. The challenge wasn't just identifying obvious technologies—meta tags announcing 'powered by WordPress'—but detecting systems actively hiding their fingerprints, recognizing outdated frameworks with known CVEs, and mapping the complete technology surface of a target. WhatWeb's plugin architecture turned tribal knowledge into executable code, transforming reconnaissance from an art into a systematic, reproducible process.
Technical Insight
WhatWeb's power comes from its multi-layered detection strategy implemented through a flexible plugin system. Each plugin represents a specific technology and can contain multiple detection methods ranging from passive observation to active probing. The aggression level system (1-4) controls how invasive these tests become, letting you balance thoroughness against visibility.
At aggression level 1 (stealthy), WhatWeb makes a single HTTP request per target, following redirects, and analyzes only the response headers, body, and cookies. That request still lands in the target's access logs, but it looks like an ordinary page visit, which suits OSINT work where you want to keep a low profile. Level 2 is unused in current releases. Level 3 (aggressive) issues additional requests only for plugins that already matched passively, testing specific paths that confirm the technology and its version. Level 4 (heavy) attempts the probe URLs from every plugin, including version-specific file checks.
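The gating between passive and aggressive checks can be modeled in a few lines of Ruby. This is a toy sketch, not WhatWeb's internals: `ToyPlugin`, `run_plugin`, and the rule that level 3 probes only after a passive hit while level 4 always probes are assumptions made for illustration, and levels below 3 are treated as passive-only.

```ruby
# Toy model of aggression gating; all names are hypothetical, not WhatWeb internals.
ToyPlugin = Struct.new(:name, :passive, :aggressive)

# Returns the list of match names produced by one plugin at a given level.
def run_plugin(plugin, body, level)
  matches = plugin.passive.call(body)
  # Assumed rule: level 3 probes only after a passive hit, level 4 always probes.
  probe = (level >= 4) || (level == 3 && !matches.empty?)
  matches += plugin.aggressive.call if probe
  matches
end

WORDPRESS = ToyPlugin.new(
  "WordPress",
  ->(body) { body.include?("/wp-content/") ? ["WP-Content"] : [] },
  -> { ["WP-Login"] } # stands in for an extra HTTP request
)

PAGE = '<link href="/wp-content/themes/x/style.css">'

[1, 3, 4].each do |level|
  puts "level #{level}: #{run_plugin(WORDPRESS, PAGE, level).inspect}"
end
```

In this model a page with no passive hit is left alone at level 3 but still probed at level 4, which is why the heavier level produces so much more traffic.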
The plugin architecture itself is elegant Ruby code. Here's a simplified example of how a detection plugin works:
Plugin.define do
  name "WordPress"
  author "Andrew Horton"
  version "0.5"

  # Passive checks run at every aggression level and use only the
  # response that was already fetched (@body, @headers).
  passive do
    m = []
    # Meta generator tag, capturing the version when present
    if @body =~ /<meta name=["']generator["'] content=["']WordPress ([\d.]+)["']/i
      m << { name: "Meta Generator", version: $1 }
    end
    # References to the wp-content directory (suggestive, not conclusive)
    m << { name: "WP-Content", certainty: 75 } if @body =~ /\/wp-content\//i
    # HTTP headers
    m << { name: "X-Powered-By" } if @headers["x-powered-by"] =~ /WordPress/i
    m # the block returns its list of matches
  end

  # Aggressive checks make additional requests.
  # `fetch` is a placeholder for an HTTP helper in this simplified example.
  aggressive do
    m = []
    # WordPress login page
    login = fetch(@base_uri.merge("/wp-login.php"))
    m << { name: "WP-Login", certainty: 100 } if login.body =~ /wordpress/i
    # readme.html often states the version
    readme = fetch(@base_uri.merge("/readme.html"))
    if readme.body =~ /WordPress ([\d.]+)/
      m << { name: "Readme", version: $1, certainty: 100 }
    end
    m
  end
end
This multi-test approach provides fuzzy matching with certainty scoring. Finding 'wp-content' in the HTML gives moderate confidence, but finding the login page at the expected path provides near-certainty. The version extraction prioritizes explicit declarations (meta tags, readme files) over inference.
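How several matches might collapse into one verdict can be sketched as follows. The `summarize` helper, the hash shape, the default certainty of 100 for matches that omit it, and the rule of taking the version from the most certain versioned match are all assumptions for illustration, not WhatWeb's actual reduction logic.

```ruby
# Hypothetical reducer for match hashes like those a plugin appends to `m`.
DEFAULT_CERTAINTY = 100 # assumption: a match with no :certainty counts as certain

def summarize(matches)
  return nil if matches.empty?

  scored = matches.map { |m| { certainty: DEFAULT_CERTAINTY }.merge(m) }
  best = scored.max_by { |m| m[:certainty] }
  # Prefer the version reported by the most certain match that has one.
  versioned = scored.select { |m| m[:version] }.max_by { |m| m[:certainty] }
  { certainty: best[:certainty], version: versioned && versioned[:version] }
end

MATCHES = [
  { name: "WP-Content", certainty: 75 },
  { name: "Meta Generator", version: "6.4.2", certainty: 100 },
].freeze

p summarize(MATCHES)
```

A weak hint on its own yields a low-certainty result with no version, while a single explicit declaration is enough to raise both.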
WhatWeb's output system demonstrates production-ready thinking. Rather than just printing results to stdout, it supports structured formats that integrate into security workflows:
# JSON output for parsing by other tools
whatweb --log-json=results.json target.com
# MongoDB output for aggregating scan data
whatweb --log-mongo-host=localhost --log-mongo-database=whatweb target.com
# SQL output for querying across campaigns
whatweb --log-sql=whatweb.sql target.com
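The JSON log is the easiest of these to post-process. The sketch below assumes each record carries a "target" string and a "plugins" object keyed by plugin name, whose values may hold a "version" array; that shape and the `versions_by_target` helper are assumptions, so check them against the output of your WhatWeb version. An embedded sample stands in for results.json.

```ruby
require "json"

# Assumed record shape; verify against real WhatWeb output before relying on it.
SAMPLE_LOG = <<~JSON
  [
    {"target": "http://example.com", "http_status": 200,
     "plugins": {"WordPress": {"version": ["6.4.2"]}, "Apache": {}}}
  ]
JSON

# Maps each target to { plugin name => list of detected versions }.
def versions_by_target(json_text)
  JSON.parse(json_text).each_with_object({}) do |record, out|
    plugins = record.fetch("plugins", {})
    out[record["target"]] = plugins.transform_values { |info| info.fetch("version", []) }
  end
end

versions_by_target(SAMPLE_LOG).each do |target, plugins|
  plugins.each { |name, versions| puts "#{target}\t#{name}\t#{versions.join(',')}" }
end
```

Reading a real log would replace `SAMPLE_LOG` with `File.read("results.json")`.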
The concurrency model uses Ruby threads with a configurable limit (--max-threads). One performance consideration is that cookie handling becomes a bottleneck at high thread counts. The --no-cookies flag disables automatic cookie handling, trading some detection accuracy for significant speed improvements on large-scale scans.
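Thread-limited scanning of this kind can be illustrated with a small bounded worker pool. This is a generic Ruby pattern rather than WhatWeb's own code; `scan_all`, `max_threads`, and the block standing in for a real scan are hypothetical names.

```ruby
# Bounded worker pool: at most `max_threads` targets are processed at once.
def scan_all(targets, max_threads:, &scanner)
  queue = Queue.new
  targets.each { |t| queue << t }
  queue.close # once drained, pop returns nil and the workers exit

  results = Queue.new
  workers = Array.new(max_threads) do
    Thread.new do
      while (target = queue.pop)
        results << [target, scanner.call(target)]
      end
    end
  end
  workers.each(&:join)

  # Collect [target, result] pairs into a hash.
  Array.new(results.size) { results.pop }.to_h
end

TARGETS = %w[a.example b.example c.example].freeze

# The block is a stand-in for a real scan so the sketch runs offline.
report = scan_all(TARGETS, max_threads: 2) { |host| host.length }
puts report.inspect
```

Any state shared between workers, such as a cookie jar, has to be synchronized, which is one way per-request bookkeeping can turn into a bottleneck as the thread count grows.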
One convenience is how WhatWeb handles bare hostnames. Given a target like 'example.com' with no scheme, it tests both HTTP and HTTPS, then continues with whichever protocol responds successfully. This removes a common point of friction in reconnaissance: working out by hand which protocol a host serves.
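The fallback logic can be sketched in Ruby as below. `resolve_target`, its `schemes` order, and the injectable `probe` callable are hypothetical; the probe replaces a real HTTP request so the example runs offline.

```ruby
require "uri"

# Hypothetical helper: returns the first candidate URL for which `probe` is truthy,
# or the input unchanged when it already carries a scheme.
def resolve_target(input, probe:, schemes: %w[http https])
  return input if input =~ %r{\A[a-z][a-z0-9+.-]*://}i

  schemes.map { |s| "#{s}://#{input}" }.find { |url| probe.call(URI(url)) }
end

# Stand-in for a real request: pretend the host answers only over HTTPS.
HTTPS_ONLY = ->(uri) { uri.scheme == "https" }

puts resolve_target("example.com", probe: HTTPS_ONLY)
puts resolve_target("http://example.com/login", probe: HTTPS_ONLY)
```

Passing the probe in as an argument keeps the scheme selection testable without touching the network.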
The plugin system's real power emerges in complex scenarios. For embedded devices, plugins check for default admin interfaces at specific paths, test for characteristic error messages, and even analyze favicon hashes—a surprisingly reliable fingerprint since manufacturers rarely customize these. For JavaScript frameworks, plugins parse framework-specific HTML comments, check for characteristic class naming patterns, and identify telltale asset paths.
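Favicon fingerprinting reduces to hashing the icon bytes and looking the digest up in a table of known values. The sketch below uses MD5 as the digest; the `identify_favicon` helper, the signature table, and the made-up icon bytes are hypothetical and do not correspond to any real device.

```ruby
require "digest/md5"

# Made-up bytes standing in for a downloaded /favicon.ico.
FAKE_ICON = "\x00\x00\x01\x00router-icon".b

# Hypothetical signature table: MD5 of favicon bytes => product name.
FAVICON_SIGNATURES = {
  Digest::MD5.hexdigest(FAKE_ICON) => "ExampleCorp Router (hypothetical)",
}.freeze

# Returns the product name for a known favicon, or nil when there is no match.
def identify_favicon(bytes, table = FAVICON_SIGNATURES)
  table[Digest::MD5.hexdigest(bytes)]
end

puts identify_favicon(FAKE_ICON).inspect
puts identify_favicon("something else").inspect
```

Because the match is on an exact digest, a single changed byte in the icon defeats it, so the technique works best on devices that ship an unmodified stock favicon.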
Gotcha
WhatWeb's Ruby foundation brings elegance but creates performance ceilings. When scanning thousands of URLs with high thread counts, you'll notice the interpreter overhead compared to compiled alternatives like httpx. The cookie handling implementation, in particular, can become a bottleneck. Disabling it with --no-cookies speeds things up but might miss detections that rely on session behavior.
The aggressive modes (3-4) are powerful but dangerous in the wrong context. Level 4 can generate dozens of requests per target, probing for version-specific files and testing various paths. This traffic volume will absolutely trigger Web Application Firewalls, rate limiting, and intrusion detection systems. Using these modes without authorization isn't just unethical; it's likely illegal. Even with authorization, aggressive scanning of production systems should be scheduled during maintenance windows. The tool's power demands responsibility, and the documentation doesn't sufficiently emphasize this.
Additionally, maintaining 1800+ plugins is a Sisyphean task. As web technologies evolve (frameworks update, CMSs change their fingerprints, services rebrand), some plugins inevitably lag behind. The WordPress plugin might be meticulously maintained, but that obscure industrial control system's web interface? The signatures could be years out of date.
Verdict
Use WhatWeb if you're conducting authorized penetration tests where comprehensive technology enumeration is critical, performing security assessments that require identifying vulnerable versions of web technologies, or doing OSINT reconnaissance where level 1 scanning keeps your footprint to a single, ordinary-looking request. Its multi-level aggression system and extensive plugin library make it well suited to understanding exactly what's running on a target, from obvious frameworks down to analytics pixels and edge caching layers. The structured output formats slot into security operations workflows.
Skip WhatWeb if you're scanning massive URL lists where speed trumps detection depth (compiled alternatives will be faster), you need real-time continuous monitoring rather than point-in-time reconnaissance (commercial services with APIs work better), you're doing any form of unauthorized scanning (the aggressive modes will get you caught and create legal liability), or you only need to answer simple binary questions like 'is this WordPress?' where simpler tools suffice. It's a surgical instrument for security professionals, not a casual exploration tool.