ksubdomain: How Stateless DNS Brute-Forcing Achieves 1.6 Million Packets Per Second

Hook

Socket-based DNS brute-forcers hit a wall around 10,000 requests per second due to file descriptor limits. ksubdomain doesn't use the OS socket API at all: it writes raw packets straight to the wire at 160x that speed.

Context

Subdomain enumeration is a critical first step in security reconnaissance, bug bounty hunting, and attack surface mapping. Traditional approaches fall into two camps: passive enumeration via certificate transparency logs and search engine APIs (tools like subfinder and Amass), or active brute-forcing by iterating through wordlists and querying DNS servers. The passive approach is stealthy but incomplete—you only find what's already been indexed or logged. Active brute-forcing gives comprehensive coverage but faces a brutal performance bottleneck.

The problem is architectural. Standard DNS libraries in Python, Ruby, or even Go's net package open a socket for each in-flight query and rely on the operating system's network stack to handle the UDP/IP details. This convenience comes at a cost: each socket consumes a file descriptor, and the OS caps how many a process may hold (typically somewhere between 1,024 and 65,536). Even with clever connection pooling and async I/O, you hit a ceiling around 10,000-50,000 queries per second. For large-scale reconnaissance across thousands of domains or deep multi-level subdomain enumeration (think admin.staging.internal.company.com), this speed is painfully inadequate. Tools like massdns improved throughput by being smarter about UDP and DNS protocol handling, but still played by the OS network stack's rules. ksubdomain took a different approach: ignore the rules entirely.

Technical Insight

ksubdomain's architecture borrows from port scanners like masscan: it completely bypasses the operating system's network stack by crafting and injecting raw packets directly onto the network interface via libpcap. When you launch ksubdomain, it doesn't open DNS sockets—it opens a single pcap handle to your network card and starts spraying DNS query packets at line rate, limited only by your bandwidth and NIC capabilities.

The core innovation is statelessness. Traditional DNS clients send a query, wait for a response, match it via transaction ID, then send the next query. ksubdomain decouples sending and receiving into separate goroutines. The sender goroutine generates DNS query packets with unique transaction IDs, serializes them to raw bytes (Ethernet frame + IP header + UDP header + DNS payload), and pushes them through pcap as fast as possible. Meanwhile, an independent receiver goroutine captures incoming packets whose source port is 53, parses the DNS responses, and logs successful subdomain resolutions. The sender never waits on the receiver during the flood, hence "stateless."

Here's a simplified example of how the packet construction works:

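import (
    "github.com/google/gopacket"
    "github.com/google/gopacket/layers"
)

// buildDNSQuery returns the raw bytes of one DNS A query, or nil if
// serialization fails.
func buildDNSQuery(domain string, txID uint16) []byte {
    // DNS layer: a single A-record question
    dnsQuery := &layers.DNS{
        ID:      txID,
        QR:      false, // Query
        OpCode:  layers.DNSOpCodeQuery,
        RD:      true,  // Recursion desired
        QDCount: 1,
        Questions: []layers.DNSQuestion{{
            Name:  []byte(domain),
            Type:  layers.DNSTypeA,
            Class: layers.DNSClassIN,
        }},
    }

    // The UDP checksum covers an IP pseudo-header, so gopacket must be told
    // which network layer to use before ComputeChecksums can succeed.
    ip := &layers.IPv4{...}
    udp := &layers.UDP{SrcPort: 53, DstPort: 53}
    if err := udp.SetNetworkLayerForChecksum(ip); err != nil {
        return nil
    }

    // Layer stack: Ethernet -> IP -> UDP -> DNS
    buffer := gopacket.NewSerializeBuffer()
    opts := gopacket.SerializeOptions{FixLengths: true, ComputeChecksums: true}

    if err := gopacket.SerializeLayers(buffer, opts,
        &layers.Ethernet{...},
        ip,
        udp,
        dnsQuery,
    ); err != nil {
        return nil
    }

    return buffer.Bytes()
}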

The tool keeps a mapping between domains and transaction IDs in memory, but it is consulted only when a response arrives, never inside the send loop. That is what lets the sender reach the quoted 1.6 million packets per second on Linux without blocking: it is essentially a packet generator that happens to speak the DNS protocol.
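The decoupling described above can be sketched without any packet library. This is a minimal illustration, not ksubdomain's code: the `query` type, `pendingTable`, and `Enumerate` are hypothetical names, and buffered channels stand in for the network. It also assumes fewer than 65,536 queries in flight, since a DNS transaction ID is only 16 bits wide.

```go
package main

import (
	"fmt"
	"sync"
)

// query is a stand-in for a raw DNS packet on the wire (hypothetical type).
type query struct {
	txID   uint16
	domain string
}

// pendingTable records what was sent so that responses can be matched later.
type pendingTable struct {
	mu      sync.Mutex
	entries map[uint16]string
}

func (p *pendingTable) add(id uint16, domain string) {
	p.mu.Lock()
	p.entries[id] = domain
	p.mu.Unlock()
}

// resolve removes and returns the domain pending under id, if any.
func (p *pendingTable) resolve(id uint16) (string, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	d, ok := p.entries[id]
	if ok {
		delete(p.entries, id)
	}
	return d, ok
}

// Enumerate sends every domain without waiting for replies while a separate
// goroutine matches whatever comes back. exists decides which names "resolve".
func Enumerate(domains []string, exists func(string) bool) (found []string, unanswered int) {
	table := &pendingTable{entries: make(map[uint16]string)}
	wire := make(chan query, len(domains))     // outbound packets
	replies := make(chan uint16, len(domains)) // transaction IDs of responses

	// Fake network: only existing names produce a response.
	go func() {
		for q := range wire {
			if exists(q.domain) {
				replies <- q.txID
			}
		}
		close(replies)
	}()

	// Receiver: the only place the table is looked up.
	done := make(chan struct{})
	go func() {
		defer close(done)
		for id := range replies {
			if d, ok := table.resolve(id); ok {
				found = append(found, d)
			}
		}
	}()

	// Sender: records the query, fires it, and never reads a reply.
	for i, d := range domains {
		id := uint16(i) // assumption: fewer than 65,536 domains per batch
		table.add(id, d)
		wire <- query{txID: id, domain: d}
	}
	close(wire)

	<-done
	return found, len(table.entries)
}

func main() {
	domains := []string{"www.example.com", "nope.example.com", "api.example.com"}
	exists := func(d string) bool { return d != "nope.example.com" }
	found, unanswered := Enumerate(domains, exists)
	fmt.Println("resolved:", found, "unanswered:", unanswered)
}
```

Whatever is still in the table once the receiver drains is exactly the set of queries that got no answer, which is the input a later retransmission pass needs.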

The stateless approach introduces a reliability problem: packets will be lost at that rate. ksubdomain solves this with a clever retransmission mechanism. After the initial flood completes, it checks which domains didn't receive responses (by comparing sent txIDs against received ones) and retransmits those subsets at progressively slower rates. The default is three rounds: first at full speed, then at 50%, then at 10%. This multi-pass strategy balances speed with coverage—most domains resolve in the first pass, stragglers catch up in subsequent rounds.
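A rough sketch of that multi-pass loop follows. The `sendFunc` callback, `RunRounds`, and `DefaultRates` are hypothetical names introduced for illustration; the 1.0 / 0.5 / 0.1 schedule simply mirrors the rates quoted above, and the fake sender in `main` stands in for real packet loss.

```go
package main

import "fmt"

// DefaultRates mirrors the schedule described in the text: full speed,
// then 50%, then 10%. The name is an assumption, not a ksubdomain identifier.
var DefaultRates = []float64{1.0, 0.5, 0.1}

// sendFunc sends the given domains at a fraction of full rate and reports
// which of them received a response (hypothetical callback).
type sendFunc func(domains []string, rate float64) map[string]bool

// RunRounds resends only the still-unanswered domains at each successive rate.
func RunRounds(domains []string, rates []float64, send sendFunc) (answered, missing []string) {
	pending := domains
	for _, rate := range rates {
		if len(pending) == 0 {
			break
		}
		got := send(pending, rate)
		var next []string
		for _, d := range pending {
			if got[d] {
				answered = append(answered, d)
			} else {
				next = append(next, d)
			}
		}
		pending = next
	}
	return answered, pending
}

func main() {
	// Fake sender: "b" is only answered once the rate drops, "c" never is.
	send := func(pending []string, rate float64) map[string]bool {
		got := map[string]bool{}
		for _, d := range pending {
			if d == "a" || (d == "b" && rate <= 0.5) {
				got[d] = true
			}
		}
		return got
	}
	answered, missing := RunRounds([]string{"a", "b", "c"}, DefaultRates, send)
	fmt.Println("answered:", answered, "missing:", missing)
}
```

Names still missing after the last round are indistinguishable from names that do not exist, which is why the slower final passes matter for coverage.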

Wildcard DNS detection is another technical highlight. Many organizations configure wildcard records (*.example.com -> 1.2.3.4) that would otherwise flood results with false positives. ksubdomain tests random subdomains before brute-forcing, builds a fingerprint of wildcard response patterns (IP addresses, CNAME chains), and filters matching results during enumeration. For more sophisticated wildcards that rotate IPs, it uses statistical analysis—if more than 90% of tested subdomains resolve to a small set of IPs, it flags those as wildcards.
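The exact-match part of that idea can be sketched as follows. This is an assumption-laden illustration rather than ksubdomain's implementation: `fingerprint`, `BuildWildcardSet`, and `FilterWildcards` are hypothetical names, answers are reduced to lists of IP strings, and the statistical handling of rotating wildcards is not covered.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// fingerprint canonicalizes a set of answer IPs so that order does not matter.
func fingerprint(ips []string) string {
	s := append([]string(nil), ips...)
	sort.Strings(s)
	return strings.Join(s, ",")
}

// BuildWildcardSet takes the answers returned for random, almost certainly
// nonexistent probe names and records each distinct answer pattern.
func BuildWildcardSet(probeAnswers [][]string) map[string]bool {
	set := make(map[string]bool)
	for _, ips := range probeAnswers {
		if len(ips) > 0 { // no answer means no wildcard for that probe
			set[fingerprint(ips)] = true
		}
	}
	return set
}

// FilterWildcards keeps only the names whose answers do not match any
// recorded wildcard pattern. The result is sorted for stable output.
func FilterWildcards(results map[string][]string, wild map[string]bool) []string {
	var kept []string
	for name, ips := range results {
		if !wild[fingerprint(ips)] {
			kept = append(kept, name)
		}
	}
	sort.Strings(kept)
	return kept
}

func main() {
	// Two random probes both came back with the wildcard address.
	wild := BuildWildcardSet([][]string{{"1.2.3.4"}, {"1.2.3.4"}})

	results := map[string][]string{
		"www.example.com":  {"1.2.3.4"},                // matches the wildcard
		"api.example.com":  {"192.0.2.10"},             // distinct answer
		"mail.example.com": {"192.0.2.20", "1.2.3.4"},  // different pattern
	}
	fmt.Println(FilterWildcards(results, wild))
}
```

A real host that happens to share the wildcard's address would be dropped by this filter, which is the usual trade-off of fingerprint-based filtering.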

The bandwidth auto-throttle feature (-b flag) is worth highlighting for production use. You specify a target bandwidth like "-b 100M" and ksubdomain dynamically adjusts packet rate to stay under that limit. This is crucial for avoiding network saturation or triggering DDoS protection systems. The implementation estimates packet sizes (DNS queries are typically 60-100 bytes depending on domain length), calculates packets-per-second needed to hit the bandwidth target, and inserts microsecond sleeps in the send loop:

// Simplified throttle logic (bandwidthMbps is an int; wordlist is a []string)
const packetSize = 80 // average bytes per query packet
targetBytesPerSec := bandwidthMbps * 125000 // 1 Mbit/s = 125,000 bytes/s
packetsPerSec := targetBytesPerSec / packetSize
delay := time.Second / time.Duration(packetsPerSec) // gap between packets

for _, domain := range wordlist {
    packet := buildDNSQuery(domain, nextTxID())
    if err := pcapHandle.WritePacketData(packet); err != nil {
        log.Printf("send %s: %v", domain, err)
    }
    time.Sleep(delay)
}
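To make the arithmetic concrete, here is a small self-contained version of the same calculation. The function name `PacketInterval` and the fixed per-packet size are assumptions for illustration, not ksubdomain's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// PacketInterval returns the packet rate and the gap between packets needed
// to stay at or under bandwidthMbps, assuming every packet is packetSize
// bytes on the wire. It returns zeros for non-positive inputs.
func PacketInterval(bandwidthMbps, packetSize int) (pps int, gap time.Duration) {
	if bandwidthMbps <= 0 || packetSize <= 0 {
		return 0, 0
	}
	bytesPerSec := bandwidthMbps * 125000 // 1 Mbit/s = 125,000 bytes/s
	pps = bytesPerSec / packetSize
	if pps == 0 {
		return 0, 0
	}
	return pps, time.Second / time.Duration(pps)
}

func main() {
	pps, gap := PacketInterval(100, 80)
	fmt.Printf("%d packets/s, one packet every %v\n", pps, gap)
}
```

With a 100 Mbit/s budget and 80-byte packets this works out to 156,250 packets per second, or one packet every 6.4 microseconds. Sleeps that short tend to overshoot under most schedulers, so a sender built this way would likely send small bursts and sleep between them rather than sleep after every packet.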

The Lua plugin system extends the tool beyond brute-forcing. Plugins can call external APIs (Censys, VirusTotal, SecurityTrails), parse results, and feed discovered subdomains back into the enumeration pipeline. This enables hybrid workflows: start with passive sources via Lua scripts, then aggressively brute-force gaps in coverage. The plugin interface is simple—Lua functions receive domain strings and return subdomain arrays, which ksubdomain merges with wordlist results.

Gotcha

The pcap dependency is a double-edged sword. You need libpcap-dev (Linux), Npcap or the older WinPcap (Windows), or the macOS equivalent installed, plus appropriate permissions (usually root/admin) to open a capture handle on a network interface. This makes deployment more complex than with statically compiled Go binaries: you can't just scp a single executable to a server and run it. Cross-compilation is also painful because the pcap bindings link against a platform-specific native library; in practice you build on the target OS and architecture.

Performance claims need context. The 1.6M packets/second figure is achievable on bare metal Linux with good NICs and direct internet connections, but many real-world scenarios kill that speed. Residential ISPs often rate-limit UDP or flag aggressive DNS traffic as malicious. Corporate networks route through proxy infrastructure that can't handle the packet rate. Even VPS providers sometimes throttle outbound DNS queries above certain thresholds. If you're behind NAT or using VPN tunnels, packet loss skyrockets and the retransmission rounds become less effective. In practice, you often need to dial down to 50K-100K packets/second with bandwidth limiting enabled, which still beats traditional tools but isn't the headline number.

The Chinese-language documentation and error messages are barriers for international users. While the README has English sections covering basic usage, troubleshooting requires translating output or navigating Chinese-language issues on GitHub. The tool's popularity in the Chinese security community means most discussion, tutorials, and advanced usage examples are in Chinese forums. This isn't insurmountable with machine translation, but it adds friction compared to tools with native English documentation ecosystems.

Verdict

Use if: You're conducting large-scale reconnaissance across hundreds or thousands of domains where speed directly impacts project timelines (penetration tests with tight deadlines, bug bounty automation, continuous asset discovery for large organizations). You have suitable infrastructure, ideally a dedicated server or VPS with high bandwidth allowances and lenient outbound UDP policies. You're comfortable with command-line complexity and troubleshooting pcap-related issues. You need multi-level subdomain enumeration (levels parameter set to 2-3) where the exponential subdomain space makes slower tools impractical.

Skip if: You're working from restrictive network environments like corporate LANs, residential ISPs with strict rate limiting, or regions where aggressive DNS queries attract unwanted attention. Your reconnaissance needs are modest: single domains or small scopes where subfinder's passive API aggregation gives 80% coverage in 30 seconds. You prefer tools with simpler deployment models (no root required, no external dependencies) and comprehensive English documentation. You lack the bandwidth or infrastructure to actually benefit from extreme speeds. In those cases, stick with Amass for depth, subfinder for ease, or puredns as a middle ground with better wildcard handling and simpler setup.
