SubdomainDB: The Self-Hosted API Bug Bounty Hunters Use to Track Reconnaissance Data

Hook

Many bug bounty hunters hand their target intelligence to third-party services without realizing it. Every subdomain you discover and store externally is a potential OPSEC failure that could tip off competitors or even the target itself.

Context

Security reconnaissance generates massive amounts of subdomain data. When you're enumerating attack surfaces for bug bounties or penetration tests, you might discover thousands of subdomains across dozens of targets over weeks or months. Tools like subfinder, amass, and assetfinder excel at discovery, but they don't solve the persistence problem.

Historically, researchers have used flat files, spreadsheets, or cloud note-taking apps to track findings. Flat files become unwieldy at scale and lack query capabilities. Cloud services introduce OPSEC risks—your reconnaissance data sits on someone else's infrastructure, potentially visible to the service provider or vulnerable to breaches. SubdomainDB emerged from this gap: a need for queryable, persistent storage that remains entirely under your control. It's not trying to be a full-featured asset management platform; it's deliberately minimal, giving you just enough structure to organize findings without the overhead of enterprise tools.

Technical Insight

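[Figure: System architecture (auto-generated). Security tools and shell scripts send HTTP requests with JSON payloads to the Sinatra REST API on port 4567. Route handlers (GET/POST/DELETE) perform CRUD operations through the ActiveRecord ORM, which runs SQL queries against the domains table (domain records with timestamps) in the SQLite database. Model objects come back to the handlers, which return JSON responses and HTTP status codes.]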

SubdomainDB is built on Sinatra, the minimalist Ruby web framework that sits atop Rack. The entire application is likely under 200 lines, making it auditable in a single sitting. It uses ActiveRecord as an ORM layer over SQLite, giving you relational database capabilities without the operational overhead of PostgreSQL or MySQL.

The architecture is straightforward: a single domains table stores subdomain strings with timestamps, and RESTful endpoints expose CRUD operations. Here's what a typical interaction looks like:

# Add a discovered subdomain
curl -X POST http://localhost:4567/domains \
  -H "Content-Type: application/json" \
  -d '{"domain": "api.example.com"}'

# Query all subdomains for a root domain
curl "http://localhost:4567/domains/search?query=example.com"

# List everything
curl http://localhost:4567/domains

# Delete a record
curl -X DELETE http://localhost:4567/domains/123

The Sinatra routing is dead simple. The application defines routes like post '/domains' that accept JSON payloads, instantiate ActiveRecord models, and return JSON responses. Error handling is minimal—you get HTTP status codes and basic messages, not verbose error objects. This simplicity is intentional; it's meant to be integrated into shell scripts and automation workflows where you want predictable behavior without parsing complex responses.
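Because the API signals outcomes through HTTP status codes rather than structured error bodies, a script can branch on the code alone. The sketch below assumes the POST /domains endpoint and JSON shape shown above. The `SDB_URL` variable and the `is_success_status` and `store_domain` helpers are hypothetical names for illustration, not part of SubdomainDB.

```shell
#!/usr/bin/env bash
# Base URL of the SubdomainDB instance (assumption: Sinatra's default port 4567).
SDB_URL="${SDB_URL:-http://localhost:4567}"

# Return 0 when the given HTTP status code is in the 2xx range.
is_success_status() {
    case "$1" in
        2[0-9][0-9]) return 0 ;;
        *)           return 1 ;;
    esac
}

# POST one subdomain and decide success from the status code only.
store_domain() {
    local subdomain="$1" status
    # -o /dev/null discards the body; -w prints just the status code.
    # curl reports 000 when it cannot connect, which counts as a failure here.
    status=$(curl -s -o /dev/null -w '%{http_code}' -X POST "$SDB_URL/domains" \
        -H "Content-Type: application/json" \
        -d "{\"domain\": \"$subdomain\"}")
    if is_success_status "$status"; then
        return 0
    fi
    echo "store_domain: $subdomain failed with HTTP $status" >&2
    return 1
}
```

A pipeline can then call `store_domain api.example.com || failed=1` and keep going, logging failures without having to parse a response body.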

Deployment options reflect this philosophy. You can run it directly with ruby app.rb for local use, or containerize it with Docker for deployment on a VPS behind a VPN. The SQLite database is just a file on disk, which keeps backups simple: copy the .db file while nothing is writing to it (or use SQLite's online backup command) and you've got your entire reconnaissance history. For integration into reconnaissance pipelines, you might wrap API calls in a shell function:

# Wrapper function for a subdomain enumeration pipeline
enum_and_store() {
    local domain="$1"
    # IFS= and -r keep each line exactly as subfinder printed it
    subfinder -d "$domain" -silent | while IFS= read -r subdomain; do
        curl -s -X POST http://localhost:4567/domains \
          -H "Content-Type: application/json" \
          -d "{\"domain\": \"$subdomain\"}" > /dev/null
    done
    echo "Stored subdomains for $domain"
}

The ActiveRecord models are presumably bare-bones: a Domain class with validations for presence and possibly format. The lack of built-in deduplication means you'll want to add a unique constraint at the database level or handle duplicates in your client code. Because the data lives in an ordinary SQLite file, you can also extend the schema with additional columns for metadata such as discovery_date, tool_used, or status flags for further analysis.
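Both deduplication routes can be sketched in a few lines of shell. The names here are assumptions: `dedupe_hosts` and `add_unique_index` are hypothetical helpers, the table name `domains` comes from the architecture described above, and the column name `domain` is guessed from the JSON payload key, so check the real schema before running the second function.

```shell
#!/usr/bin/env bash
# Client-side route: normalize hostnames from stdin (lowercase, no trailing
# whitespace or trailing dot, no blank lines) and emit each one once.
dedupe_hosts() {
    tr '[:upper:]' '[:lower:]' \
        | sed -e 's/[[:space:]]*$//' -e 's/\.$//' -e '/^$/d' \
        | sort -u
}

# Database-level route. Assumptions: the sqlite3 CLI is installed, the table
# is named "domains" and the hostname column is named "domain".
# The statement fails if duplicate rows already exist, so remove those first.
add_unique_index() {
    local db_file="$1"
    sqlite3 "$db_file" \
        'CREATE UNIQUE INDEX IF NOT EXISTS idx_domains_domain ON domains(domain);'
}
```

For example, `subfinder -d example.com -silent | dedupe_hosts` collapses case and trailing-dot variants before anything reaches the API. With the unique index in place, a repeated POST would presumably come back as an error status instead of a new row, so client scripts should tolerate that response.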

What makes this architecture appropriate for reconnaissance work is the stateless API design. You can run enumeration tools on multiple machines, all reporting back to a central SubdomainDB instance. The SQLite backend is sufficient for typical bug bounty workloads—unless you're doing nation-state-level enumeration with hundreds of concurrent scanners, you won't hit performance walls. For most researchers tracking dozens of programs with thousands of subdomains each, it's plenty fast.

Gotcha

The repository README includes a prominent warning: SubdomainDB has no authentication or authorization built in. This is not a minor limitation—it's a critical security gap. If you expose this API to the internet without adding your own auth layer, anyone who discovers the endpoint can read your entire reconnaissance database, add garbage data, or delete everything. Even on a local network, you're vulnerable to lateral movement attacks if an attacker compromises another system.

Adding authentication isn't trivial if you're not familiar with Rack middleware. You'd need to implement something like HTTP Basic Auth (Rack ships Rack::Auth::Basic), API key validation in a Sinatra before filter, or a Rack-based gem like warden. Devise, which is built on warden, targets Rails and is a poor fit for a bare Sinatra app. For quick-and-dirty protection, you might run it behind nginx with basic auth, but that's another moving part to configure. The project assumes you'll handle this yourself, which is fine for experienced developers but risky for those who might deploy it naively.
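If you take the reverse-proxy route, every client script has to send credentials with each request. The sketch below covers only that client side and assumes a proxy (for example nginx with basic auth) already sits in front of SubdomainDB. The `SDB_URL`, `SDB_USER`, and `SDB_PASS` variables, the example hostname, and the `sdb_request` helper are all hypothetical.

```shell
#!/usr/bin/env bash
# Assumption: SubdomainDB is reachable only through an authenticating proxy.
SDB_URL="${SDB_URL:-https://recon.example.internal}"

# Usage: sdb_request /domains
#        sdb_request /domains -X POST -H "Content-Type: application/json" -d '{...}'
sdb_request() {
    local path="$1"; shift
    # --fail makes curl exit non-zero on HTTP errors such as 401.
    local -a opts=(--silent --fail)
    # Only send credentials when both variables are set.
    if [ -n "${SDB_USER:-}" ] && [ -n "${SDB_PASS:-}" ]; then
        opts+=(--user "${SDB_USER}:${SDB_PASS}")
    fi
    curl "${opts[@]}" "$@" "${SDB_URL}${path}"
}
```

Credentials passed with `--user` are visible in the process list on shared machines, so a netrc file or similar may be preferable there. None of this replaces server-side enforcement; it only keeps the scripts working once the proxy starts rejecting anonymous requests.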

The SQLite backend also has scaling limitations. SQLite allows only one writer at a time, so concurrent writes can cause locking errors, though for typical reconnaissance workflows where you're writing discovered subdomains sequentially, this rarely matters. The real limitation is query performance at extreme scale: if you're storing millions of subdomains with complex search patterns, the full-text search capabilities of PostgreSQL or Elasticsearch would serve you better. There's also no built-in data validation beyond basic ActiveRecord checks, so malformed input or injection attempts could cause issues if you're not sanitizing data before submission.
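One way to sanitize input before submission is to drop anything that doesn't look like a hostname. This is a minimal sketch with a deliberately strict pattern, and both function names are hypothetical. It rejects wildcard entries and underscore labels such as `_dmarc.example.com`, so loosen the pattern if your tools emit those.

```shell
#!/usr/bin/env bash
# Accept only syntactically plausible hostnames: at least two dot-separated
# labels of letters, digits and hyphens, 1-63 characters each, not starting
# or ending with a hyphen, and at most 253 characters in total.
is_valid_hostname() {
    local host="$1"
    local label='[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?'
    local re="^${label}(\\.${label})+\$"
    [ "${#host}" -le 253 ] || return 1
    [[ "$host" =~ $re ]]
}

# Pass valid hostnames from stdin through; report the rest on stderr.
filter_valid_hosts() {
    local line
    while IFS= read -r line; do
        if is_valid_hostname "$line"; then
            printf '%s\n' "$line"
        else
            printf 'skipping invalid hostname: %q\n' "$line" >&2
        fi
    done
}
```

Placed between the enumeration tool and the POST loop (`subfinder -d example.com -silent | filter_valid_hosts | ...`), this also makes naive JSON string interpolation safe, since a value that passes the check contains only letters, digits, dots, and hyphens.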

Verdict

Use SubdomainDB if you're a security researcher, penetration tester, or bug bounty hunter who needs simple, self-hosted subdomain tracking for personal or small-team use, and you're comfortable adding your own authentication layer or running it in an already-secure environment like a VPN or localhost-only setup. It's perfect for integrating into automation scripts where you want a queryable data store without the overhead of setting up PostgreSQL or learning Elasticsearch.

Skip it if you need production-ready security out of the box, multi-user access controls, or you're working at scale with millions of records and concurrent users. Also skip it if you're not technically proficient enough to secure it yourself; exposing this to the internet without modifications is asking for trouble. For those cases, invest time in ProjectDiscovery's Chaos API or build a proper asset management system with authentication baked in from the start.
