
ZDNS: Building a DNS Resolver That Scans a Million Domains Per Hour



Hook

Most DNS resolvers optimize for the same 100 domains queried repeatedly. ZDNS does the opposite: it’s built to query a million different domains once, and it does so faster than tools designed for the common case.

Context

When security researchers or academics need to scan millions of domains—mapping the internet’s infrastructure, studying DNS hijacking, or analyzing certificate deployment—standard DNS tools fall apart. Tools like dig are interactive query tools, not batch processors. General-purpose DNS libraries optimize their caches for hot queries (think: resolving google.com thousands of times), not cold queries across millions of unique domains. Even purpose-built scanners like massdns prioritize raw speed over flexibility in query workflows.

ZDNS emerged from the ZMap Project at the University of Michigan, the same team behind the ZMap network scanner that can probe the entire IPv4 address space in under an hour. Their DNS research revealed that internet-scale measurements have unique requirements: you need structured JSON output for analysis pipelines, support for reading ICANN zone files directly, the ability to specify different nameservers per domain, and modules that automatically chase CNAME chains or resolve MX records to their final IP addresses. ZDNS packages these capabilities into both a Go library and CLI tool, validated by a peer-reviewed paper at ACM IMC ‘22 that demonstrates its performance characteristics for research workloads.

Technical Insight

Concurrency Model

[System architecture diagram (auto-generated): input sources (stdin, files, ICANN zone files) feed the zdns CLI wrapper, which creates a ResolverConfig holding global settings and a pool of parallel Resolver instances that share a cache of DNS records. Lookup modules (A/MX/AAAA/etc.) use the zmap/dns library for raw packet parsing, query upstream and root nameservers on cache misses, and emit results as structured JSON.]

ZDNS’s architecture centers on separating configuration from execution. The ResolverConfig struct holds global settings—timeout values, retry strategies, root hints, cache parameters—while individual Resolver instances perform the actual lookups. This design choice has a crucial implication: parallelism happens by creating multiple Resolver instances, not through internal goroutines. You control concurrency explicitly:

```go
import (
    "fmt"
    "time"

    "github.com/zmap/zdns"
)

config := zdns.ResolverConfig{
    Timeout:          10 * time.Second,
    IterativeTimeout: 4 * time.Second,
    CacheSize:        10000,
    RootServers:      zdns.DefaultRootServers,
}

// Create a pool of resolvers for parallel lookups
resolvers := make([]*zdns.Resolver, 10)
for i := range resolvers {
    resolvers[i] = zdns.NewResolver(&config)
}

// Distribute work across the pool, round-robin
domains := []string{"example.com", "github.com", "starlog.dev"}
for i, domain := range domains {
    resolver := resolvers[i%len(resolvers)]
    result, _, err := resolver.PerformLookup("A", domain)
    if err != nil {
        continue
    }
    fmt.Println(result) // result contains structured, JSON-serializable data
}
```

This explicit concurrency model might seem verbose compared to libraries that hide parallelism behind thread pools, but it gives you precise control over resource allocation—critical when you’re managing thousands of simultaneous lookups and need to avoid overwhelming upstream nameservers or your own network stack.

The caching strategy reveals ZDNS’s research focus. Instead of optimizing for cache hit rates on popular domains, it sizes the cache to hold intermediate results during recursive resolution of many different domains. When resolving mail.engineering.example.com, you might query the root servers for .com, the .com nameservers for example.com, the example.com nameservers for engineering.example.com, and finally get to mail.engineering.example.com. Those intermediate NS records get cached, but ZDNS doesn’t expect to see mail.engineering.example.com again—it expects the next query to be blog.marketing.different-company.org. The cache optimizes for breadth, not depth.

ZDNS’s module system distinguishes raw DNS operations from higher-level lookup workflows. Raw modules like A, AAAA, MX, or TXT return exactly what the nameserver sends, similar to dig. Lookup modules like ALOOKUP or MXLOOKUP implement complete resolution logic:

```shell
# Raw module: just get MX records
echo "gmail.com" | zdns MX
# Returns: MX records pointing to Google's mail servers

# Lookup module: get MX records AND resolve them to IPs
echo "gmail.com" | zdns MXLOOKUP
# Returns: MX records plus A/AAAA records for each mail server
```

This separation lets you choose between minimal data (faster, less bandwidth) and complete pictures (more context, fewer subsequent queries). For research measuring mail server deployment, MXLOOKUP gives you the full infrastructure in one pass.

Input flexibility sets ZDNS apart from simpler scanners. Beyond reading domains from stdin, it parses ICANN CZDS zone files directly, supports per-domain nameserver specification (formatted as domain,nameserver), and handles mixed input types. This matters when you’re analyzing DNS delegation: you can feed ZDNS a zone file containing all .com domains and their authoritative nameservers, then query each domain using its designated nameserver rather than starting from root hints. This tests whether authoritative servers are responding correctly, not just whether recursive resolution works.

Gotcha

The Resolver thread-safety limitation catches developers by surprise. Each Resolver instance maintains internal state during recursive lookups, making concurrent calls to the same instance unsafe. The documentation states this clearly, but the natural instinct with any library is to create one resolver and share it across goroutines. You must either create resolver pools (as shown above) or implement your own locking, which negates ZDNS’s performance benefits. This design makes sense for the CLI use case, where each worker goroutine gets its own resolver, but library users need to architect around it.

The cache isn’t tunable for different workload profiles. The CacheSize parameter sets maximum entries, but you can’t configure eviction policies or separate caching strategies for different record types. If you’re doing a mixed workload—say, resolving A records for millions of domains but also repeatedly checking NS records for a smaller set of infrastructure domains—you can’t tell ZDNS to prioritize caching NS records. The one-size-fits-all cache works for the intended research use case but limits adaptability.

Documentation incompleteness shows through in places. The README mentions per-module triggers for advanced input filtering but cuts off mid-explanation. Some module behaviors are documented only in code comments or the academic paper rather than user-facing docs. For a tool positioning itself as both library and research instrument, this creates friction: practitioners want comprehensive API docs, while researchers expect citation-quality specification. ZDNS falls between, requiring source code diving to understand edge cases.

Verdict

Use ZDNS if you’re conducting internet measurement research, performing security assessments across thousands of domains, analyzing DNS infrastructure deployment, or building data pipelines that ingest zone files and need structured JSON output. It’s the right tool when you’re scanning at scale—100,000+ domains—and need both performance and specialized features like per-domain nameserver specification or automatic CNAME chasing. The explicit concurrency model and module flexibility justify the complexity when your workload matches the design assumptions.

Skip ZDNS for interactive queries, small-scale lookups (<1,000 domains), real-time applications, or general-purpose DNS needs in application code. The learning curve and manual concurrency management aren’t worth it when standard libraries like miekg/dns for Go or dnspython provide simpler interfaces, or when command-line tools like dig handle your interactive workflows. If you’re not analyzing internet-scale data or building research infrastructure, ZDNS is over-engineered for your needs.
