httpx: How ProjectDiscovery Built a Resilient HTTP Reconnaissance Pipeline

Hook

While most HTTP clients fail fast, httpx is designed to fail gracefully—automatically downgrading from HTTPS to HTTP, backing off when it detects WAFs, and retrying failed requests with exponential delays. It’s an HTTP toolkit built for hostile environments.

Context

Security reconnaissance traditionally required chaining multiple tools: curl for basic probing, openssl for certificate inspection, custom scripts for title extraction, and separate tools for technology fingerprinting. Bug bounty hunters and penetration testers would spend hours writing bash pipelines to probe thousands of hosts, only to have their scans blocked by WAFs or derailed by inconsistent SSL configurations.

ProjectDiscovery built httpx to collapse this fragmented workflow into a single, resilient tool. Unlike general-purpose HTTP clients designed for well-behaved APIs, httpx assumes the worst: unreliable networks, aggressive rate limiting, misconfigured TLS certificates, and infrastructure that actively resists reconnaissance. It’s part of ProjectDiscovery’s broader security toolkit ecosystem, designed to feed data into analysis pipelines alongside tools like subfinder and nuclei.

Technical Insight

[Figure: System architecture (auto-generated). URL/Host Input → Concurrent Pipeline → Retryable HTTP Client → HTTPS Failed? (yes: HTTP Fallback Probe retries over HTTP) → Modular Probe System (status, headers, body), with an optional Headless Browser for screenshots and JS execution → JSON/Text Output]

At its core, httpx wraps ProjectDiscovery’s retryablehttp library in a concurrent pipeline that prioritizes reliability over raw speed. The tool doesn’t just send HTTP requests; it orchestrates a sequence of fallback strategies that mirror how experienced pentesters manually probe infrastructure.

The auto-fallback mechanism is particularly elegant. When httpx encounters an HTTPS failure, it automatically attempts HTTP before marking a host as unreachable. This isn’t just error handling—it’s reconnaissance logic baked into the transport layer. Combined with configurable retry logic and backoff timers, httpx can work around WAFs that block aggressive scanning patterns. The tool essentially codifies the manual retry patterns security researchers use when probing defensive infrastructure.
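To make the fallback idea concrete, here is a minimal sketch of "HTTPS first, then HTTP, with retries and exponential backoff" written around curl. It is not httpx's implementation; the function names try_scheme and probe_with_fallback, the 5-second timeout, and the default of two retries are all assumptions chosen for illustration.

```shell
# Sketch only: mimics the "HTTPS first, HTTP fallback, retry with backoff"
# idea using curl. This is NOT how httpx is implemented internally.

# try_scheme SCHEME HOST [RETRIES]
# Prints the HTTP status code on success; returns non-zero once all attempts fail.
try_scheme() {
  local scheme="$1" host="$2" retries="${3:-2}"
  local delay=1 attempt=0 code
  while [ "$attempt" -le "$retries" ]; do
    # -k tolerates misconfigured certificates; -m caps each attempt at 5 seconds.
    if code=$(curl -sk -m 5 -o /dev/null -w '%{http_code}' "${scheme}://${host}"); then
      printf '%s\n' "$code"
      return 0
    fi
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$retries" ]; then
      sleep "$delay"
      delay=$((delay * 2))   # exponential backoff: 1s, 2s, 4s, ...
    fi
  done
  return 1
}

# probe_with_fallback HOST [RETRIES]
# Prints "<scheme>://<host> <status>" for the first scheme that answers.
probe_with_fallback() {
  local host="$1" retries="${2:-2}" scheme code
  for scheme in https http; do
    if code=$(try_scheme "$scheme" "$host" "$retries"); then
      printf '%s://%s %s\n' "$scheme" "$host" "$code"
      return 0
    fi
  done
  printf '%s unreachable\n' "$host" >&2
  return 1
}
```

Calling `probe_with_fallback example.com` only reports a host as unreachable after both schemes have exhausted their retries, which is the ordering the paragraph above describes.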

The probe system is where httpx’s modular design shines. Rather than hardcoding response parsing, httpx treats each data point (status code, title, certificate, favicon hash) as an independent probe that can be enabled or disabled. Here’s a practical example of extracting multiple data points from a list of hosts:

# Probe hosts with title, tech detection, and response times
cat hosts.txt | httpx -title -td -rt -json -o results.json

# Extract favicon hashes and JARM fingerprints
cat hosts.txt | httpx -favicon -jarm -silent

# Expand a CIDR range and probe every address in it
echo 192.168.1.0/24 | httpx -silent

The JSON output mode (enabled with -json) transforms httpx from an interactive tool into a pipeline component. Each probe result is structured data that can feed into downstream analysis tools. The -tech-detect flag (-td in the example above) runs Wappalyzer-based fingerprinting and outputs structured technology stacks, such as WordPress versions, JavaScript frameworks, and CDN providers, all discoverable through HTTP response patterns.
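As a sketch of what "pipeline component" means in practice, the filter below reads httpx JSON lines on stdin and keeps hosts that answered 200 and were fingerprinted as WordPress. It assumes jq is installed and that each record carries url, status_code, title, and tech fields; those field names are assumptions based on recent releases and should be checked against your own output. The function name wordpress_hosts is made up for illustration.

```shell
# wordpress_hosts: read httpx JSON lines on stdin, print "url<TAB>title"
# for hosts with status 200 whose detected tech mentions WordPress.
# Assumed field names (verify against your httpx version):
#   .url, .status_code, .title, .tech (array of strings)
wordpress_hosts() {
  jq -r '
    select(.status_code == 200)
    | select((.tech // []) | map(ascii_downcase) | any(contains("wordpress")))
    | [.url, (.title // "")]
    | @tsv
  '
}
```

A typical invocation would be `cat hosts.txt | httpx -silent -title -td -json | wordpress_hosts`. Records without a tech array are treated as having no detected technologies rather than causing an error.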

Input flexibility is another architectural strength. httpx accepts not just URLs but also lists of hosts, CIDR ranges (listed among the README’s features), and raw HTTP requests exported from tools like Burp Suite (via the -request flag). This means you can feed it the output of subdomain enumeration tools directly.
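The two most common input paths can be sketched as small wrappers: one pipes subfinder output straight into httpx, the other reads a host list from a file. The wrapper names probe_subdomains and probe_host_list are hypothetical, and the sketch assumes both tools are installed and on PATH.

```shell
# probe_subdomains DOMAIN
# Enumerate subdomains with subfinder and probe each one with httpx.
probe_subdomains() {
  subfinder -d "$1" -silent | httpx -silent -status-code -title
}

# probe_host_list FILE
# Probe every host listed (one per line) in FILE.
probe_host_list() {
  httpx -l "$1" -silent -status-code -title
}
```

For example, `probe_subdomains example.com` prints one line per responding subdomain with its status code and page title.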

The tool also supports advanced probes like JARM fingerprinting (via -jarm flag), favicon hashing using mmh3 (the same algorithm Shodan uses, accessible via -favicon), and ASN lookups for infrastructure mapping. The -hash flag can compute multiple body hashes simultaneously (md5, mmh3, simhash, sha1, sha256, sha512), useful for detecting duplicate content or tracking infrastructure changes over time.
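One way to use body hashes for duplicate detection is to group JSON results by hash and report any hash shared by more than one URL. The sketch below assumes jq is installed and that the SHA-256 body hash appears at .hash.body_sha256 in the JSON output; that field path is an assumption and may differ in your version. The function name duplicate_bodies is illustrative only.

```shell
# duplicate_bodies: read httpx JSON lines on stdin and print
# "hash<TAB>url1,url2,..." for every body hash seen on more than one URL.
# Assumed field path: .hash.body_sha256 (verify against your httpx output).
duplicate_bodies() {
  jq -rs '
    map(select(.hash.body_sha256 != null))
    | group_by(.hash.body_sha256)
    | map(select(length > 1))
    | .[]
    | [.[0].hash.body_sha256, (map(.url) | sort | join(","))]
    | @tsv
  '
}
```

It could be fed with something like `cat hosts.txt | httpx -silent -hash sha256 -json | duplicate_bodies`. Records with no hash are skipped rather than grouped together.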

Less obvious from the command line is how httpx handles edge cases. Its probe list includes CDN detection and a redirect chain probe that follows redirects while recording the full chain, and it appears to support HTTP pipelining detection as well. Various status and diagnostic flags help debug reconnaissance workflows at scale.

Gotcha

The README’s security warning deserves emphasis: httpx is explicitly not designed to run as a persistent service. It’s a CLI tool for controlled reconnaissance, not a web application component. The aggressive retry logic and automatic fallbacks that make it excellent for security testing would be vulnerabilities in a service context—imagine auto-downgrading from HTTPS to HTTP in a production API client.

Active development means breaking changes between releases. The disclaimer about reviewing changelogs isn’t boilerplate—flag names, output formats, and default behaviors shift as the tool evolves. If you’re embedding httpx in automation pipelines, version pinning is mandatory.
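A minimal sketch of version pinning for an automation script follows. The tag vX.Y.Z is a placeholder, not a real release; replace it with the version you have tested. The function names are hypothetical, and the check only assumes that `httpx -version` prints the version tag somewhere in its output, since the exact format is not something to rely on.

```shell
# Placeholder tag: replace vX.Y.Z with the release you have actually tested.
HTTPX_PINNED="${HTTPX_PINNED:-vX.Y.Z}"

# install_pinned_httpx: install exactly the pinned release with the Go toolchain.
install_pinned_httpx() {
  go install -v "github.com/projectdiscovery/httpx/cmd/httpx@${HTTPX_PINNED}"
}

# check_httpx_version: succeed only if the installed binary reports the pinned tag.
# -w avoids treating v1.2.3 as a match for v1.2.30; -F matches the dots literally.
check_httpx_version() {
  httpx -version 2>&1 | grep -qwF -- "$HTTPX_PINNED"
}
```

A pipeline can then start with `check_httpx_version || { echo "unexpected httpx version" >&2; exit 1; }` so that a silent upgrade fails loudly instead of changing output formats mid-run.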

The headless browser features (screenshots via -screenshot, JavaScript execution via -javascript-code) are powerful but resource-intensive. Running these probes against thousands of hosts will quickly exhaust memory and CPU. They’re designed for targeted use, not broad sweeps.
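One way to keep headless probes targeted is a two-stage run: a cheap pass that keeps only hosts answering 200, then screenshots for a capped subset. This is a sketch under assumptions; the function name screenshot_live_subset and the default cap of 50 are arbitrary choices, not recommendations from the httpx documentation.

```shell
# screenshot_live_subset HOSTS_FILE [LIMIT]
# Stage 1: cheap probe, keep only hosts that return 200.
# Stage 2: take screenshots of at most LIMIT of those hosts.
screenshot_live_subset() {
  local hosts_file="$1" limit="${2:-50}"
  httpx -l "$hosts_file" -silent -mc 200 \
    | head -n "$limit" \
    | httpx -silent -screenshot
}
```

Running `screenshot_live_subset hosts.txt 25` launches the browser for at most 25 hosts, regardless of how many entries the input file contains.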

Finally, while httpx handles many probe types, it’s not a replacement for specialized tools. For JavaScript-heavy applications requiring full browser interaction, you’ll still need Puppeteer or Playwright. For low-level network scanning, nmap remains superior. httpx occupies the HTTP reconnaissance niche specifically. Also note that the README lists capabilities such as the ‘Paths’ and ‘Ports’ probes without fully detailing their CLI syntax in the usage section, so custom port scanning or path fuzzing may require consulting the full documentation.

Verdict

Use httpx if you’re doing security reconnaissance, bug bounty hunting, or infrastructure discovery at scale. It excels when you need to probe thousands of hosts with configurable data extraction, automatic fallback handling, and pipeline integration with other security tools. The structured JSON output and extensive probe collection make it ideal for automated workflows that feed into analysis platforms.

Skip it if you need API stability guarantees for long-term automation, want to run HTTP probing as a persistent service, or need a general-purpose HTTP debugging tool for application development. Also skip it if you’re just testing a handful of endpoints interactively; curl or httpie will be faster and simpler.

httpx’s value proposition is resilient, large-scale HTTP reconnaissance with security-focused probes, not everyday HTTP client work. For maximum effectiveness, consult the full documentation at docs.projectdiscovery.io/tools/httpx/ to understand all available flags and their exact syntax, as the README provides an overview but not exhaustive flag details.
