RWhoisd: The Hierarchical Whois Server That Powers Internet Registry Infrastructure

Hook

When you query WHOIS for IP allocation data, there's a decent chance your request is being routed through a protocol most developers have never heard of: RWhois, a hierarchical referral system that dates from the mid-1990s, before RESTful APIs existed.

Context

The standard WHOIS protocol, dating back to the early 1980s, was designed as a simple directory service for network resource information. But as the internet scaled and organizations needed to manage their own IP allocations, a fundamental problem emerged: centralized WHOIS servers became bottlenecks, and organizations had no way to maintain authoritative control over their own delegated address space.

Referral WHOIS (RWhois), specified in RFC 2167 in 1997, solved this through hierarchical delegation. Instead of forcing all queries through a single authority, RWhois allows organizations to run their own servers that publish data about the resources they control: IP blocks, ASNs, domain contacts. Parent servers can refer queries to child authorities automatically, creating a distributed tree of network resource information. The rwhoisd project, developed by Network Solutions and later maintained by ARIN (American Registry for Internet Numbers), became the reference implementation. While the protocol never achieved widespread adoption outside of regional internet registries and ISPs, it remains critical infrastructure for organizations that need to publish authoritative IP allocation data within the internet registry system.

Technical Insight

[Figure: System architecture (auto-generated). Components: Client Query, Query Parser, Query Engine, Flat File Database, Authority Areas, Index Builder, In-Memory Index, Access Control, Result Builder, Referral Handler. Edge labels: RWhois Protocol, Tokenized Query, Startup, Hash Tables, IP/CIDR/Names, Lookup Key, Match Found, No Match/Delegated, Authorized, Denied, Network/Contact and Domain/ASN Records, Redirect to Child Server.]

At its core, rwhoisd operates as a lightweight daemon that serves structured data from flat text files organized in a directory hierarchy. Unlike modern database-backed systems, it uses a schema-based file format where each record type (network, contact, domain, ASN) follows a specific structure with attribute-value pairs. Data is organized into "authority areas"—essentially namespaces that define what portion of the resource tree a particular server is authoritative for.

The database structure looks something like this for a network allocation record:

network:ID:NET-192-0-2-0
network:Network-Name:Example-Network
network:IP-Network:192.0.2.0/24
network:IP-Network-Block:192.0.2.0 - 192.0.2.255
network:Org-Name:Example Corporation
network:Street-Address:123 Network Lane
network:City:San Francisco
network:State:CA
network:Postal-Code:94105
network:Country-Code:US
network:Tech-Contact:NOC@example.com
network:Updated:20240115
network:Class-Name:network
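
To make the record shape concrete, here is a minimal Python sketch that reads lines in the `class:Attribute:value` form shown above into a dictionary. It is an illustration only, not rwhoisd's own parser (which is written in C); the names `parse_record` and `SAMPLE` are made up for this example, and the handling of repeated attributes is an assumption.

```python
from collections import defaultdict

# A shortened copy of the record shown above (illustrative data).
SAMPLE = """\
network:ID:NET-192-0-2-0
network:Network-Name:Example-Network
network:IP-Network:192.0.2.0/24
network:Tech-Contact:NOC@example.com
"""

def parse_record(text):
    """Parse 'class:Attribute:value' lines into (class_name, {attribute: [values]})."""
    class_name = None
    attrs = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only, so values may themselves
        # contain ':' (an IPv6 prefix, for example).
        cls, attr, value = line.split(":", 2)
        if class_name is None:
            class_name = cls
        elif cls != class_name:
            raise ValueError(f"mixed classes in one record: {class_name!r}, {cls!r}")
        # Assumption: an attribute may repeat, so values are kept in a list.
        attrs[attr].append(value.strip())
    return class_name, dict(attrs)
```

Calling `parse_record(SAMPLE)` yields the class name `network` and a mapping from each attribute to its list of values, which is the kind of structure an index builder could then key on.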

The server builds indexes from these files at startup, creating hash tables for efficient lookups by various keys—IP addresses, CIDRs, network names, or contact handles. When a query arrives, the parser tokenizes it and determines the query type. For IP lookups, it performs longest-prefix matching to find the most specific allocation. If the query falls within an authority area that's been delegated to another server, rwhoisd returns a referral response directing the client to query the child server.
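
The lookup-then-refer flow described above can be sketched with Python's standard `ipaddress` module. This is a simplified illustration under stated assumptions, not the daemon's actual algorithm: it scans a small list linearly instead of using hash tables, and the tables `ALLOCATIONS` and `DELEGATIONS`, the record IDs, and the child server hostname are all hypothetical.

```python
import ipaddress

# Hypothetical local allocations: prefix -> record ID.
ALLOCATIONS = {
    ipaddress.ip_network("192.0.2.0/24"): "NET-192-0-2-0",
    ipaddress.ip_network("192.0.2.128/25"): "NET-192-0-2-128",
}
# Hypothetical delegations: prefix -> child server that is authoritative for it.
DELEGATIONS = {
    ipaddress.ip_network("192.0.2.192/26"): "rwhois.child.example.net:4321",
}

def longest_match(addr, table):
    """Return the most specific network in `table` containing `addr`, or None."""
    ip = ipaddress.ip_address(addr)
    candidates = [net for net in table if net.version == ip.version and ip in net]
    if not candidates:
        return None
    return max(candidates, key=lambda net: net.prefixlen)

def lookup(addr):
    """Answer locally, or refer the client when a more specific delegation exists."""
    delegated = longest_match(addr, DELEGATIONS)
    local = longest_match(addr, ALLOCATIONS)
    if delegated and (local is None or delegated.prefixlen > local.prefixlen):
        return ("referral", DELEGATIONS[delegated])
    if local:
        return ("match", ALLOCATIONS[local])
    return ("no-match", None)
```

With these tables, an address in 192.0.2.0/25 resolves to the /24 record, an address in 192.0.2.128/26 resolves to the more specific /25 record, and an address in 192.0.2.192/26 produces a referral to the child server.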

Configuration is handled through an rwhoisd.conf file that defines authority areas, access control, and server behavior. A minimal configuration establishes the server's authority scope:

root-password: secretpassword
server-contact: hostmaster@example.com
default-ttl: 86400

# Define an authority area
authority-area: 192.0.2.0/24
attribute: network
attribute: contact
attribute: domain
data-dir: /var/rwhois/data/192.0.2

The access control system allows fine-grained restrictions based on IP addresses, query types, and even specific attributes. This was crucial for registry operators who needed to publish some data publicly while restricting sensitive information like detailed contact data. The ACL syntax permits both allow and deny rules with CIDR matching:

access: 203.0.113.0/24 deny all
access: 198.51.100.0/24 allow all
access: 0.0.0.0/0 allow network,domain deny contact
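
One way to reason about rules like these is to evaluate them in code. The sketch below takes the rule syntax exactly as written in this article and adds two assumptions the article does not state: rules are checked top to bottom with the first matching network deciding, and anything not explicitly allowed is denied. The function names are hypothetical and this is not rwhoisd's implementation.

```python
import ipaddress

RULES = """\
access: 203.0.113.0/24 deny all
access: 198.51.100.0/24 allow all
access: 0.0.0.0/0 allow network,domain deny contact
"""

def parse_rules(text):
    """Turn 'access:' lines into a list of (network, {class: allowed}) pairs."""
    rules = []
    for line in text.splitlines():
        if not line.startswith("access:"):
            continue
        tokens = line.split()[1:]  # drop the "access:" keyword
        net = ipaddress.ip_network(tokens[0])
        verdicts = {}
        # Remaining tokens come in (action, comma-separated classes) pairs.
        for action, classes in zip(tokens[1::2], tokens[2::2]):
            for cls in classes.split(","):
                verdicts[cls] = (action == "allow")
        rules.append((net, verdicts))
    return rules

def is_allowed(rules, client_ip, record_class):
    """Assumed semantics: first matching network wins, default is deny."""
    ip = ipaddress.ip_address(client_ip)
    for net, verdicts in rules:
        if ip in net:
            if record_class in verdicts:
                return verdicts[record_class]
            return verdicts.get("all", False)
    return False
```

Under those assumptions, a client in 203.0.113.0/24 is refused everything, a client in 198.51.100.0/24 can read every class, and everyone else can read network and domain records but not contacts.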

Under the hood, rwhoisd implements the RWhois protocol's extended query syntax, which goes beyond simple string matching. Queries can include wildcards, attribute-specific searches, and boolean operators. A query like network;IP-Network=192.0.2.* searches specifically within the network attribute for matching IP blocks, while contact;name=*Smith finds contact records. The server parses these queries using a hand-written lexer that tokenizes the query string, then routes to appropriate index lookups.
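
As a rough illustration of the tokenize-then-match idea, the following Python sketch splits a query written in the form used in this article (`class;Attribute=pattern`) and tests it against one record. It is not the daemon's lexer: the function names are hypothetical, case-insensitive comparison is an assumption, and wildcard handling is delegated to the standard `fnmatch` module, which also treats `?` and `[...]` as special characters.

```python
from fnmatch import fnmatchcase

def parse_query(query):
    """Split 'class;Attribute=pattern' into (class_name, attribute, pattern)."""
    class_name, sep, term = query.partition(";")
    if not sep:
        # No class restriction: the whole string is the search term.
        class_name, term = None, query
    attribute, sep, pattern = term.partition("=")
    if not sep:
        # No attribute restriction: match the pattern against any attribute.
        attribute, pattern = None, term
    return class_name, attribute, pattern

def matches(record_class, attrs, query):
    """Return True if a record (class plus {attribute: value}) satisfies the query."""
    class_name, attribute, pattern = parse_query(query)
    if class_name and class_name.lower() != record_class.lower():
        return False
    wanted = pattern.lower()
    for name, value in attrs.items():
        if attribute and name.lower() != attribute.lower():
            continue
        if fnmatchcase(value.lower(), wanted):
            return True
    return False
```

For example, `matches("network", {"IP-Network": "192.0.2.0/24"}, "network;IP-Network=192.0.2.*")` succeeds, while the same record fails a `contact;name=*Smith` query because the class does not match.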

The IPv6 support, added in later versions, required significant changes to address parsing and storage. IPv6 addresses use 128-bit representation, and the longest-prefix matching algorithm needed modifications to handle the expanded address space efficiently. The code uses bitwise operations on address structures to perform rapid subnet containment checks, critical for performance when handling large routing tables.
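
The bitwise containment check is easy to show in miniature. The sketch below treats addresses as plain integers and compares their masked prefixes, which works the same way for 32-bit and 128-bit addresses. It illustrates the general technique only; rwhoisd's C code operates on its own address structures, and the function names here are hypothetical.

```python
import ipaddress

def in_subnet(addr, network, prefix_len, bits):
    """True if integer `addr` shares its top `prefix_len` bits with `network`.

    `bits` is the address width: 32 for IPv4, 128 for IPv6.
    """
    # Build a mask with `prefix_len` leading ones; a /0 yields an all-zero mask.
    mask = ((1 << prefix_len) - 1) << (bits - prefix_len)
    return (addr & mask) == (network & mask)

def contains(cidr, address):
    """Convenience wrapper that parses text forms and rejects mixed families."""
    net = ipaddress.ip_network(cidr)
    ip = ipaddress.ip_address(address)
    if net.version != ip.version:
        return False
    return in_subnet(int(ip), int(net.network_address), net.prefixlen, net.max_prefixlen)
```

Because the comparison is a single mask-and-compare on integers, the cost per check does not depend on the prefix length, which is part of why this style of test is attractive for large tables.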

One architectural decision worth noting: rwhoisd uses a pre-fork model in which a master process spawns worker children that each handle connections. This was common in the late 1990s (Apache 1.3 used a similar design) but predates modern event-driven architectures. Each worker maintains its own copy of the index in memory, so updates require restarting the server or sending a signal to trigger an index rebuild; there is no dynamic reloading of data.

Gotcha

The file-based storage model hits hard performance limits with large datasets. Each authority area's data is read and indexed at startup, and for organizations managing millions of IP allocations, this means multi-gigabyte memory footprints and startup times measured in minutes. There's no query caching, no incremental index updates, and no replication built into the protocol—high availability requires external load balancing and homegrown synchronization scripts.

More critically, the C codebase carries significant technical debt. The code predates modern security practices—string handling uses older unsafe functions in places, and while ARIN has patched known vulnerabilities, the attack surface of a C daemon parsing untrusted network input is inherently concerning. Buffer overflow protections rely on compiler features rather than memory-safe language guarantees. The RWhois protocol itself also lacks encryption or authentication at the protocol level; any security must be layered on through firewalls or tunnels. For production deployments, expect to invest heavily in security hardening, monitoring, and likely running it behind a reverse proxy that handles TLS termination and rate limiting. The project's low GitHub activity—31 stars and minimal recent commits—signals that active development has essentially ceased, making it more of a maintenance-mode infrastructure component than an evolving tool.

Verdict

Use if: You're operating as an ISP, hosting provider, or LIR (Local Internet Registry) that needs to publish IP allocation data as part of your responsibilities to a regional internet registry (RIR), or you're maintaining existing infrastructure that already uses RWhois for hierarchical resource delegation and need a stable, proven implementation. It's also appropriate if you need lightweight, file-based resource publishing where data updates are infrequent and query volume is moderate.

Skip if: You're building new infrastructure for IP resource management: modern RDAP (Registration Data Access Protocol) provides better standardization, JSON responses, and wider tooling support. Also skip if you need high-performance query handling, dynamic data updates, or you're uncomfortable maintaining C codebases with security implications. For general directory services or simple WHOIS functionality without hierarchical referral, use simpler WHOIS servers or just implement HTTP APIs directly. If you're managing IP resources internally without registry publication requirements, full IPAM solutions like NetBox provide far better user experience and integration capabilities.
