SubGPT: Using Bing's AI to Discover Subdomains That Pattern-Based Tools Miss

Hook

What if the future of subdomain discovery isn't faster brute-forcing, but asking an AI to think like the engineers who named those subdomains?

Context

Subdomain enumeration has long been a critical first step in reconnaissance for security researchers, penetration testers, and bug bounty hunters. For years, the methodology has remained relatively static: you gather seeds from certificate transparency logs, passive DNS databases, and search engines, then feed them into permutation engines that apply predictable patterns like adding 'dev-', 'staging-', or 'api-' prefixes. Tools like dnsgen, gotator, and alterx excel at this mechanical approach, generating thousands of candidates per second based on wordlists and regex patterns.
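To make the mechanical approach concrete, here is a minimal sketch of prefix and suffix permutation. It is not how dnsgen, gotator, or alterx are implemented; the `permute` function and its three-word list are illustrative assumptions.

```python
from itertools import product

def permute(seeds, domain, words=("dev", "staging", "api")):
    """Combine each seed's leftmost labels with every word in a fixed list."""
    suffix = "." + domain
    candidates = set()
    for seed, word in product(seeds, words):
        if not seed.endswith(suffix):
            continue  # ignore seeds that are outside the target domain
        label = seed[: -len(suffix)]
        candidates.add(f"{word}-{label}{suffix}")   # dev-api.example.com
        candidates.add(f"{label}-{word}{suffix}")   # api-dev.example.com
        candidates.add(f"{word}.{label}{suffix}")   # dev.api.example.com
    # Drop anything that was already a seed
    return sorted(candidates - set(seeds))

print(permute(["api.example.com"], "example.com", words=("dev",)))
```

Every candidate is a fixed recombination of a seed and a wordlist entry, so the output can never contain a token that appears in neither.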

But this deterministic approach has a blind spot. Modern organizations don't always follow the naming conventions that wordlists anticipate. A company might use 'canary-prod-analytics' instead of 'analytics-staging', or 'customer-facing-api-v2' instead of 'api2'. Pattern-based tools struggle with these variations because they can only emit what their rules and wordlists describe. SubGPT takes a different approach: instead of applying rigid patterns, it uses a large language model to infer the meaning behind existing subdomains and generate contextually relevant variations that a human engineer might actually create.

Technical Insight

SubGPT's architecture is deceptively simple but clever in its execution. At its core, it's a three-stage pipeline: seed collection, LLM-powered generation, and DNS validation. What makes it interesting is how it bridges the gap between conversational AI interfaces and traditional reconnaissance workflows.

The tool begins by accepting a list of discovered subdomains from your existing enumeration process. These seeds are crucial—SubGPT isn't a standalone discovery tool but an enrichment layer that requires context to be effective. The more diverse and comprehensive your input seeds, the better the AI can infer organizational patterns. You might feed it results from subfinder, amass, or certificate transparency scraping:

# Example seed file (seeds.txt)
api.example.com
dev-api.example.com
staging-auth.example.com
prod-analytics.example.com
customer-dashboard.example.com

The interesting part happens next. SubGPT uses the EdgeGPT library to interact with Bing Chat, constructing a prompt that asks the model to analyze patterns and generate new subdomain candidates. This isn't a traditional API call: it is literally a conversation with Bing's chat interface, which is why authentication cookies are required. The tool reads your Bing cookies (stored in cookies.json) to authenticate as if you were using the web interface yourself:

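# Simplified conceptual flow of SubGPT's approach, not its actual source
import json

from EdgeGPT import Chatbot  # some EdgeGPT releases use: from EdgeGPT.EdgeGPT import Chatbot

def load_cookies(path="cookies.json"):
    # Cookies exported from a logged-in Bing browser session
    with open(path) as f:
        return json.load(f)

async def generate_subdomains(seeds, domain):
    bot = await Chatbot.create(cookies=load_cookies())

    prompt = (
        f"Given these existing subdomains for {domain}:\n"
        f"{', '.join(seeds)}\n\n"
        "Generate 20 additional subdomain variations that follow similar "
        "patterns and naming conventions. Focus on realistic variations "
        "that developers might actually use."
    )

    try:
        response = await bot.ask(prompt=prompt)
        # parse_subdomains_from_response is a placeholder: the reply layout
        # depends on the EdgeGPT version, so hostname extraction is left abstract
        return parse_subdomains_from_response(response)
    finally:
        await bot.close()  # release the chat session even if the request fails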
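The model answers in free-form chat text, so the candidates have to be pulled out of prose, numbered lists, and bullets. Below is a minimal sketch of that step, assuming the reply has already been reduced to a plain string. The `extract_candidates` name and its behavior are illustrative and not taken from SubGPT's source.

```python
import re

def extract_candidates(reply_text, domain, seeds=()):
    """Pull unique hostnames under `domain` out of free-form chat text."""
    # One DNS label: letters, digits, inner hyphens
    label = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
    pattern = re.compile(
        rf"(?<![a-z0-9.-])(?:{label}\.)+{re.escape(domain.lower())}"
        r"(?![a-z0-9-]|\.[a-z0-9])"  # do not stop in the middle of a longer name
    )
    known = {s.lower() for s in seeds}
    seen, result = set(), []
    for host in pattern.findall(reply_text.lower()):
        # Skip seeds the model echoed back and anything already collected
        if host not in known and host not in seen:
            seen.add(host)
            result.append(host)
    return result

reply = (
    "Sure! Here are some ideas:\n"
    "1. qa-api.example.com\n"
    "2. Staging-Billing.example.com.\n"
    "- api.example.com (already known)\n"
    "* qa-api.example.com"
)
print(extract_candidates(reply, "example.com", seeds=["api.example.com"]))
```

Lowercasing and de-duplicating here keeps the later DNS stage from resolving the same name twice.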

What's fascinating is that the LLM can pick up on organizational conventions that aren't easily codified into regex patterns. If your seeds include 'customer-facing-api-v2' and 'internal-facing-dashboard-v1', a pattern-based tool would struggle to generate 'partner-facing-portal-v3' or 'vendor-facing-integration-v2'. But an LLM trained on millions of code repositories and documentation can infer that 'facing' indicates audience, 'customer/internal/partner/vendor' are audience types, and 'v1/v2/v3' are version indicators—then recombine them creatively.

The final stage is validation with built-in intelligence. SubGPT doesn't blindly accept the LLM's output. It performs DNS resolution on each candidate, checking for A and CNAME records to confirm the subdomain actually exists. More importantly, it includes wildcard detection specifically for first-level subdomains. Many organizations configure wildcard DNS (*.example.com) that resolves everything to a default page, creating massive false positive noise. SubGPT tests for this by resolving a random subdomain and comparing results:

# Wildcard detection logic concept
import random
import string

import dns.exception
import dns.resolver

def resolve_a(name):
    # Return the set of A-record IPs for a name, or an empty set on any DNS failure
    try:
        return {rr.to_text() for rr in dns.resolver.resolve(name, 'A')}
    except dns.exception.DNSException:
        return set()

def get_wildcard_ips(domain):
    # A random 20-character label should not exist; if it resolves, a wildcard is likely
    random_label = ''.join(random.choices(string.ascii_lowercase, k=20))
    return resolve_a(f"{random_label}.{domain}")

def validate_subdomain(subdomain, wildcard_ips):
    resolved_ips = resolve_a(subdomain)
    if not resolved_ips:
        return False  # Does not resolve at all
    # Skip candidates whose answers all fall inside the wildcard's answer set
    return not resolved_ips <= wildcard_ips

This validation layer is critical because LLMs don't understand DNS—they might suggest subdomains that sound plausible but don't exist, or worse, match wildcard patterns that would pollute your results with false positives.
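The filtering step can be exercised without touching the network by passing the resolver in as a function. The sketch below assumes a `resolve` callable that maps a hostname to a set of IP strings (empty when the name does not resolve). `filter_candidates`, `fake_resolve`, and the sample addresses are hypothetical.

```python
def filter_candidates(candidates, resolve, wildcard_ips=frozenset()):
    """Keep candidates that resolve to something other than the wildcard answer."""
    wildcard = set(wildcard_ips)
    live = []
    for host in candidates:
        ips = set(resolve(host))
        # Keep the host only if it resolves and has at least one non-wildcard IP
        if ips and not ips <= wildcard:
            live.append(host)
    return live

# Stub resolver standing in for real DNS lookups
fake_dns = {
    "api.example.com": {"203.0.113.10"},
    "made-up.example.com": {"203.0.113.99"},  # same answer as the wildcard
}

def fake_resolve(host):
    return fake_dns.get(host, set())

suggested = ["api.example.com", "made-up.example.com", "ghost.example.com"]
print(filter_candidates(suggested, fake_resolve, wildcard_ips={"203.0.113.99"}))
```

Swapping `fake_resolve` for a real DNS lookup leaves the filtering logic unchanged.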

The tradeoff for this intelligence is speed. SubGPT processes roughly 80 subdomains every 45 seconds—glacial compared to traditional tools that generate thousands per second. This isn't a technical limitation of the code itself, but an inherent constraint of using a conversational interface designed for humans. Each batch requires a full request-response cycle with Bing's servers, including natural language parsing and response generation. You're essentially asking Bing to think about your problem rather than executing a deterministic algorithm.
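A quick back-of-the-envelope calculation shows what that rate means for run time. This sketch simply scales the 80-per-45-seconds figure quoted above and assumes batches run strictly one after another. The function name is illustrative.

```python
import math

BATCH_SIZE = 80      # subdomains per round trip (figure quoted in the text)
BATCH_SECONDS = 45   # seconds per round trip (figure quoted in the text)

def estimated_minutes(total_candidates):
    """Estimate wall-clock minutes, assuming sequential batches."""
    batches = math.ceil(total_candidates / BATCH_SIZE)
    return batches * BATCH_SECONDS / 60

print(estimated_minutes(80))      # 0.75
print(estimated_minutes(10_000))  # 93.75
```

At this rate, 10,000 candidates take 125 batches, or a little over an hour and a half.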

Gotcha

The elephant in the room is performance. At 45 seconds per 80 subdomains, SubGPT isn't viable for large-scale enumeration campaigns. If you're scanning hundreds of domains or need results quickly, the tool becomes a bottleneck. A traditional permutation generator can produce 10,000+ candidates in the time SubGPT generates 80. This makes it impractical for automated pipelines or time-sensitive engagements where you need comprehensive results within hours, not days.

The authentication model is another friction point that can break your workflow. SubGPT requires valid Bing cookies extracted from an authenticated browser session, stored in a cookies.json file. These cookies expire, and Bing enforces daily usage limits on Bing Chat. Hit the limit, and you're dead in the water until the reset. There's no programmatic way around this: you're subject to the same constraints as a human user clicking through the web interface. For security teams running continuous reconnaissance or researchers working on multiple projects simultaneously, this manual cookie management becomes tedious. You can't easily parallelize across multiple domains or integrate it into fully automated recon frameworks without building cookie rotation infrastructure.
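A small pre-flight check can at least catch expired cookies before a run starts. The sketch below assumes cookies.json holds a list of objects with `name` and `expirationDate` (Unix seconds) fields, which is the layout some browser cookie-export extensions produce. That layout and the `expired_cookie_names` helper are assumptions, not part of SubGPT.

```python
import time

def expired_cookie_names(cookies, now=None):
    """Return the names of cookies whose expirationDate has already passed."""
    now = time.time() if now is None else now
    return [
        c.get("name", "?")
        for c in cookies
        # Session cookies carry no expirationDate, so they are not flagged
        if "expirationDate" in c and c["expirationDate"] <= now
    ]

# Sample data in the assumed export layout; real values come from cookies.json
sample = [
    {"name": "session", "value": "abc"},
    {"name": "auth", "value": "def", "expirationDate": 1_000.0},
    {"name": "prefs", "value": "ghi", "expirationDate": 9_999_999_999.0},
]
print(expired_cookie_names(sample, now=2_000.0))
```

This only detects expiry recorded in the file. It cannot tell whether the daily usage limit has been reached.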

There's also a cold-start problem: SubGPT requires quality seed data to be effective. If you feed it three generic subdomains like 'www', 'mail', and 'ftp', the LLM has little context to infer organizational patterns. It might generate plausible-sounding variations, but they'll lack the domain-specific intelligence that makes the tool valuable. You need at least 10-20 diverse seeds that demonstrate actual naming conventions before the AI can extrapolate meaningfully. This makes SubGPT unsuitable as a first-stage reconnaissance tool—it's strictly for enrichment after you've already done substantial enumeration with traditional methods.
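One rough way to judge whether a seed list is worth sending is to count distinct names and the distinct tokens inside them. This is a heuristic sketch only: the `seed_report` function is hypothetical, and its default threshold of 10 is taken from the lower bound mentioned above.

```python
import re

def seed_report(seeds, domain, min_seeds=10):
    """Summarize how much naming context a seed list carries."""
    suffix = "." + domain.lower()
    # Keep only the part of each seed to the left of the target domain
    labels = {
        s.lower()[: -len(suffix)] for s in seeds if s.lower().endswith(suffix)
    }
    # Split labels on hyphens and dots to count distinct naming tokens
    tokens = {t for label in labels for t in re.split(r"[-.]", label) if t}
    return {
        "seeds": len(labels),
        "tokens": len(tokens),
        "enough": len(labels) >= min_seeds,
    }

print(seed_report(
    ["www.example.com", "mail.example.com", "ftp.example.com"], "example.com"
))
```

A low token count relative to the number of seeds suggests the list shows little of the organization's naming style.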

Verdict

Use SubGPT if: you're performing targeted reconnaissance on high-value domains where thoroughness matters more than speed, you've already exhausted traditional permutation tools and want to squeeze out edge-case discoveries, and you're comfortable with manual setup and cookie management for the potential payoff of finding non-obvious subdomains that could reveal interesting attack surface. It shines in bug bounty scenarios where discovering one unique subdomain can mean the difference between a duplicate report and a critical finding.

Skip if: you need fast, automated, or large-scale enumeration across many domains, you're working with minimal seed data or just starting reconnaissance, or you require fully unattended workflows without manual intervention.

For most use cases, run SubGPT as a final enrichment pass after traditional tools have done the heavy lifting, treating it as a creative supplement rather than a primary discovery method.
