
ZDNS: Building a DNS Resolver That Scans a Million Domains Per Hour



Hook

Most DNS resolvers optimize for the same 100 domains queried repeatedly. ZDNS does the opposite: it’s built to query a million different domains once, and it does so faster than tools designed for the common case.

Context

When security researchers or academics need to scan millions of domains—mapping the internet’s infrastructure, studying DNS hijacking, or analyzing certificate deployment—standard DNS tools fall apart. Tools like dig are interactive query tools, not batch processors. General-purpose DNS libraries optimize their caches for hot queries (think: resolving google.com thousands of times), not cold queries across millions of unique domains. Even purpose-built scanners like massdns prioritize raw speed over flexibility in query workflows.

ZDNS emerged from the ZMap Project at the University of Michigan, the same team behind the ZMap network scanner that can probe the entire IPv4 address space in under an hour. Their DNS research revealed that internet-scale measurements have unique requirements: you need structured JSON output for analysis pipelines, support for reading ICANN zone files directly, the ability to specify different nameservers per domain, and modules that automatically chase CNAME chains or resolve MX records to their final IP addresses. ZDNS packages these capabilities into both a Go library and CLI tool, validated by a peer-reviewed paper at ACM IMC ‘22 that demonstrates its performance characteristics for research workloads.

Technical Insight

Concurrency Model

[System architecture diagram (auto-generated): input sources (stdin, files, ICANN zone files) feed the zdns CLI wrapper, which creates a ResolverConfig holding global settings and a pool of parallel Resolver instances that share a cache of DNS records. Lookup modules (A/MX/AAAA/etc.) use the zmap/dns library for raw packet parsing, query upstream and root nameservers on cache misses, and emit results as structured JSON.]

ZDNS’s architecture centers on separating configuration from execution. The ResolverConfig struct holds global settings—timeout values, retry strategies, root hints, cache parameters—while individual Resolver instances perform the actual lookups. This design choice has a crucial implication: parallelism happens by creating multiple Resolver instances, not through internal goroutines. You control concurrency explicitly:

```go
import (
    "fmt"
    "time"

    "github.com/zmap/zdns"
)

config := zdns.ResolverConfig{
    Timeout:          10 * time.Second,
    IterativeTimeout: 4 * time.Second,
    CacheSize:        10000,
    RootServers:      zdns.DefaultRootServers,
}

// Create a pool of resolvers for parallel lookups
resolvers := make([]*zdns.Resolver, 10)
for i := range resolvers {
    resolvers[i] = zdns.NewResolver(&config)
}

// Distribute work across the pool, round-robin
domains := []string{"example.com", "github.com", "starlog.dev"}
for i, domain := range domains {
    resolver := resolvers[i%len(resolvers)]
    result, _, err := resolver.PerformLookup("A", domain)
    if err != nil {
        continue
    }
    fmt.Println(result) // result contains structured, JSON-serializable data
}
```

This explicit concurrency model might seem verbose compared to libraries that hide parallelism behind thread pools, but it gives you precise control over resource allocation—critical when you’re managing thousands of simultaneous lookups and need to avoid overwhelming upstream nameservers or your own network stack.

The caching strategy reveals ZDNS’s research focus. Instead of optimizing for cache hit rates on popular domains, it sizes the cache to hold intermediate results during recursive resolution of many different domains. When resolving mail.engineering.example.com, you might query the root servers for .com, the .com nameservers for example.com, the example.com nameservers for engineering.example.com, and finally get to mail.engineering.example.com. Those intermediate NS records get cached, but ZDNS doesn’t expect to see mail.engineering.example.com again—it expects the next query to be blog.marketing.different-company.org. The cache optimizes for breadth, not depth.

ZDNS’s module system distinguishes raw DNS operations from higher-level lookup workflows. Raw modules like A, AAAA, MX, or TXT return exactly what the nameserver sends, similar to dig. Lookup modules like ALOOKUP or MXLOOKUP implement complete resolution logic:

```shell
# Raw module: just get MX records
echo "gmail.com" | zdns MX
# Returns: MX records pointing to Google's mail servers

# Lookup module: get MX records AND resolve them to IPs
echo "gmail.com" | zdns MXLOOKUP
# Returns: MX records plus A/AAAA records for each mail server
```

This separation lets you choose between minimal data (faster, less bandwidth) and complete pictures (more context, fewer subsequent queries). For research measuring mail server deployment, MXLOOKUP gives you the full infrastructure in one pass.

Input flexibility sets ZDNS apart from simpler scanners. Beyond reading domains from stdin, it parses ICANN CZDS zone files directly, supports per-domain nameserver specification (formatted as domain,nameserver), and handles mixed input types. This matters when you’re analyzing DNS delegation: you can feed ZDNS a zone file containing all .com domains and their authoritative nameservers, then query each domain using its designated nameserver rather than starting from root hints. This tests whether authoritative servers are responding correctly, not just whether recursive resolution works.

Gotcha

The Resolver thread-safety limitation catches developers by surprise. Each Resolver instance maintains internal state during recursive lookups, making concurrent calls to the same instance unsafe. The documentation states this clearly, but the natural instinct with any library is to create one resolver and share it across goroutines. You must either create resolver pools (as shown above) or implement your own locking, which negates ZDNS’s performance benefits. This design makes sense for the CLI use case, where each worker goroutine gets its own resolver, but library users need to architect around it.

The cache isn’t tunable for different workload profiles. The CacheSize parameter sets maximum entries, but you can’t configure eviction policies or separate caching strategies for different record types. If you’re doing a mixed workload—say, resolving A records for millions of domains but also repeatedly checking NS records for a smaller set of infrastructure domains—you can’t tell ZDNS to prioritize caching NS records. The one-size-fits-all cache works for the intended research use case but limits adaptability.

Documentation incompleteness shows through in places. The README mentions per-module triggers for advanced input filtering but cuts off mid-explanation. Some module behaviors are documented only in code comments or the academic paper rather than user-facing docs. For a tool positioning itself as both library and research instrument, this creates friction: practitioners want comprehensive API docs, while researchers expect citation-quality specification. ZDNS falls between, requiring source code diving to understand edge cases.

Verdict

Use ZDNS if you’re conducting internet measurement research, performing security assessments across thousands of domains, analyzing DNS infrastructure deployment, or building data pipelines that ingest zone files and need structured JSON output. It’s the right tool when you’re scanning at scale—100,000+ domains—and need both performance and specialized features like per-domain nameserver specification or automatic CNAME chasing. The explicit concurrency model and module flexibility justify the complexity when your workload matches the design assumptions.

Skip ZDNS for interactive queries, small-scale lookups (<1,000 domains), real-time applications, or general-purpose DNS needs in application code. The learning curve and manual concurrency management aren’t worth it when standard libraries like miekg/dns for Go or dnspython provide simpler interfaces, or when command-line tools like dig handle your interactive workflows. If you’re not analyzing internet-scale data or building research infrastructure, ZDNS is over-engineered for your needs.
