CewlAI: When Security Reconnaissance Meets Large Language Models

Hook

What if you could teach an AI to think like a DevOps engineer naming subdomains? CewlAI does exactly that, turning pattern recognition into a reconnaissance weapon.

Context

Security researchers and bug bounty hunters face a fundamental problem: discovering all the domains and subdomains associated with a target organization. Traditional tools like subfinder and amass excel at finding publicly visible domains through certificate transparency logs, DNS records, and search engines. But they can't predict domains that haven't been indexed yet or subdomains following internal naming conventions.

Rule-based tools like altdns and dnsgen attempt to bridge this gap by applying mutation rules—adding prefixes, suffixes, and common patterns to known domains. Feed them 'api.example.com' and they'll generate 'api-dev.example.com', 'api-staging.example.com', etc. But these tools are only as creative as their wordlists and hardcoded rules. They can't recognize that a company naming their services 'artemis-api.example.com' and 'apollo-gateway.example.com' might follow a Greek mythology pattern. That's where CewlAI enters: it uses large language models to identify naming patterns humans might miss and generate variations that rule-based tools would never consider.

Technical Insight

CewlAI's architecture is deliberately simple—a Python CLI tool that acts as an intelligent bridge between your domain lists and various LLM APIs. The core workflow accepts domains through multiple input methods (command-line arguments, file paths, or stdin), sends them to your chosen LLM with pattern-recognition prompts, and outputs deduplicated results.
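
The input side of that workflow can be sketched in a few lines. This is a minimal illustration, not CewlAI's actual code: the function name `collect_seeds` and its parameters are hypothetical, and the normalization rules (lowercasing, stripping a trailing dot) are assumptions about what a sensible tool would do.

```python
from typing import Iterable, Optional, TextIO


def collect_seeds(
    args: Iterable[str] = (),
    file_path: Optional[str] = None,
    stdin: Optional[TextIO] = None,
) -> list[str]:
    """Merge seed domains from all three input channels, preserving order."""
    raw: list[str] = list(args)
    if file_path:
        with open(file_path, encoding="utf-8") as handle:
            raw.extend(handle.read().splitlines())
    # Only read stdin when something is piped in; an interactive TTY would block.
    if stdin is not None and not stdin.isatty():
        raw.extend(stdin.read().splitlines())

    seen: set[str] = set()
    seeds: list[str] = []
    for line in raw:
        domain = line.strip().lower().rstrip(".")  # assumed normalization
        if domain and domain not in seen:
            seen.add(domain)
            seeds.append(domain)
    return seeds
```

A CLI entry point would call something like `collect_seeds(sys.argv[1:], stdin=sys.stdin)` and hand the result to the model.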

The tool supports four LLM backends through a unified interface: Google's Gemini, OpenAI's GPT models, WhiteRabbitNeo via Kindo, and local models through Ollama. Here's how you'd use it in a typical reconnaissance workflow:

# Basic usage with Gemini (default)
printf 'api.example.com\ndev.example.com\nstaging.example.com\n' | cewlai

# Use local Ollama model for sensitive targets
cewlai --ollama --model llama2 -f known_domains.txt -o generated.txt

# Iterative generation with 3 passes
cewlai --openai --loop 3 api.target.com internal.target.com

# Chain with other tools
subfinder -d example.com | cewlai --gemini | httpx -silent
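
One way to picture a unified interface over several LLM backends is a small registry that maps a backend name to a function taking a prompt and returning text. The sketch below is an assumption about the general shape, not CewlAI's implementation: `register`, `generate`, and the `echo` stand-in backend are all hypothetical names, and no real API is called.

```python
from typing import Callable, Dict, List

# A backend takes a prompt string and returns the model's raw text reply.
Backend = Callable[[str], str]

_BACKENDS: Dict[str, Backend] = {}


def register(name: str) -> Callable[[Backend], Backend]:
    """Decorator that files a backend function under a short name."""
    def wrap(fn: Backend) -> Backend:
        _BACKENDS[name] = fn
        return fn
    return wrap


@register("echo")
def echo_backend(prompt: str) -> str:
    # Hypothetical stand-in for a real API call: returns the prompt unchanged.
    return prompt


def generate(backend: str, prompt: str) -> List[str]:
    """Dispatch to the chosen backend and split its reply into clean lines."""
    if backend not in _BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    reply = _BACKENDS[backend](prompt)
    return [line.strip() for line in reply.splitlines() if line.strip()]
```

With this shape, adding a provider means registering one more function; the rest of the pipeline only ever calls `generate`.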

The token management implementation deserves attention. Before sending requests, CewlAI estimates token count using a simple character-based heuristic (roughly 4 characters per token). When your input exceeds 100,000 tokens, it automatically truncates the list and warns you. This prevents runaway API costs but also means you lose data with massive inputs:

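# Simplified token estimation logic (illustrative sketch)
def estimate_tokens(domains):
    total_chars = sum(len(d) for d in domains)
    return total_chars // 4  # heuristic: roughly 4 characters per token

def truncate_to_limit(domains, limit):
    # Hypothetical helper: keep leading domains while the estimate stays within the limit
    kept, chars = [], 0
    for d in domains:
        chars += len(d)
        if chars // 4 > limit:
            break
        kept.append(d)
    return kept

domains = ["api.example.com", "dev.example.com"]  # sample input
if estimate_tokens(domains) > 100000:
    domains = truncate_to_limit(domains, 100000)
    print("⚠️  Input truncated to 100k tokens")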

The prompting strategy is where CewlAI's intelligence lives. Rather than asking the LLM to simply generate random domains, it instructs the model to analyze patterns in the provided seeds. The prompt essentially says: 'Here are real domains from a target—identify naming conventions, patterns, abbreviation styles, and generate plausible variations.' This prompt engineering turns a general-purpose LLM into a specialized reconnaissance tool.
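
A prompt along those lines could be assembled as follows. The wording here is a paraphrase for illustration, not the prompt CewlAI actually ships, and `build_prompt` and `max_new` are hypothetical names.

```python
def build_prompt(seeds: list[str], max_new: int = 50) -> str:
    """Assemble a pattern-analysis prompt from seed domains."""
    seed_block = "\n".join(sorted(set(seeds)))
    return (
        "Here are real domains from a target:\n"
        f"{seed_block}\n\n"
        "Identify naming conventions, patterns, and abbreviation styles, "
        f"then generate up to {max_new} plausible variations.\n"
        "Return one domain per line with no commentary."
    )
```

Asking for one domain per line with no commentary keeps the reply easy to parse, which matters once the output is piped into other tools.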

The iterative loop feature (--loop N) implements a feedback mechanism where each generation's output becomes the next generation's input. This can surface deeper patterns:

# First pass might recognize "dev" and "staging" patterns
# Second pass might combine those with discovered service names
# Third pass might apply the patterns to new contexts
cewlai --loop 3 -f seeds.txt
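
The feedback loop itself is simple to sketch. The version below assumes that each pass sees the original seeds plus everything generated so far; whether CewlAI accumulates context this way or passes only the latest output is not stated here. `expand` and the `generate` callback are hypothetical names, and the callback stands in for the LLM call.

```python
from typing import Callable, Iterable


def expand(
    seeds: Iterable[str],
    generate: Callable[[list[str]], Iterable[str]],
    loops: int = 3,
) -> list[str]:
    """Feed each pass's new domains back in as context for the next pass."""
    known = list(dict.fromkeys(seeds))  # ordered, deduplicated seeds
    seen = set(known)
    discovered: list[str] = []
    for _ in range(loops):
        fresh = [d for d in dict.fromkeys(generate(known)) if d not in seen]
        if not fresh:  # the model produced nothing new, so stop early
            break
        seen.update(fresh)
        discovered.extend(fresh)
        known.extend(fresh)
    return discovered
```

Stopping early when a pass yields nothing new avoids paying for iterations that cannot add results.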

Deduplication happens at multiple levels. Within each generation, CewlAI filters the LLM's output to remove duplicates. Across loop iterations, it maintains a set of all previously generated domains to prevent the AI from suggesting the same variations repeatedly. This is critical because LLMs often generate similar outputs when given similar prompts.
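
Both levels of deduplication can share a single set that outlives each generation. The sketch below also drops lines that do not look like hostnames, on the assumption that model replies sometimes include stray commentary; the `Deduper` class and the loose hostname pattern are illustrative, not taken from CewlAI.

```python
import re

# Loose hostname check: dot-separated labels of letters, digits, and hyphens,
# with no label starting or ending in a hyphen.
_HOSTNAME = re.compile(
    r"^(?!-)[a-z0-9-]{1,63}(?<!-)(\.(?!-)[a-z0-9-]{1,63}(?<!-))+$"
)


class Deduper:
    """Tracks every domain seen so far, within and across generations."""

    def __init__(self, seeds=()):
        self.seen = {s.strip().lower() for s in seeds}

    def filter(self, lines):
        """Return only valid, never-before-seen domains, in order."""
        fresh = []
        for line in lines:
            candidate = line.strip().lower().rstrip(".")
            if _HOSTNAME.match(candidate) and candidate not in self.seen:
                self.seen.add(candidate)
                fresh.append(candidate)
        return fresh
```

Calling `filter` once per loop iteration on the same `Deduper` instance gives the cross-iteration behavior; seeding it with the input domains also keeps the model from echoing them back as "new" results.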

The Unix philosophy integration is intentional. By supporting stdin and writing to stdout by default, CewlAI slots naturally into security tool chains. You can pipe results from subdomain enumeration tools into CewlAI for expansion, then pipe CewlAI's output into HTTP probing tools or DNS resolution checkers. This composability makes it far more useful than a standalone GUI application would be.

Gotcha

CewlAI's biggest limitation is that it generates hypothetical domains without any validation. The LLM might produce 'quantum-api.example.com' because it noticed other science-themed names, but that domain might not exist. You're trading precision for recall—casting a wider net with the expectation that most catches will be empty. You must pipe outputs through DNS resolution tools or HTTP probers to filter viable targets, adding complexity to your workflow.
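
If dnsx or httpx is not at hand, a rough resolution filter can be written with the Python standard library. This is a sketch of the validation step, not part of CewlAI: `resolves` and `filter_live` are hypothetical names, and the check is injectable so the filtering logic can be exercised without network access.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def resolves(host: str) -> bool:
    """True if the system resolver returns at least one address for host."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def filter_live(
    candidates: Iterable[str],
    check: Callable[[str], bool] = resolves,
    workers: int = 20,
) -> list[str]:
    """Keep only candidates for which check() succeeds, preserving order."""
    hosts = list(candidates)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(check, hosts))
    return [h for h, ok in zip(hosts, results) if ok]
```

Note that a target with wildcard DNS will make every candidate resolve, so a resolution check alone does not prove that a generated name is a distinct host.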

API costs are a genuine concern. A single iteration with 100k tokens can cost $2-10 depending on your model choice (GPT-4 vs Gemini Flash vs others). Run three loops on five different target domains per day and you're looking at hundreds of dollars monthly, and more if every pass approaches the token cap. The tool provides no cost estimation, rate limiting, or budget controls. The Ollama backend mitigates this by running models locally, but at the cost of reduced quality: smaller open-source models don't match GPT-4's pattern recognition capabilities.

There's also zero configurability for LLM parameters. You can't adjust temperature to make outputs more creative or deterministic, can't customize the system prompt for specific reconnaissance scenarios, and can't control token limits per request. The tool makes opinionated choices that work for general cases but might frustrate users with specific needs.
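
Because the tool offers no cost estimation, it is worth running the numbers yourself before scheduling it. The sketch below takes the API cost scenario from this section (three loops, five targets per day, $2-10 per iteration) and treats every loop pass as a full 100k-token iteration, which is an upper-bound assumption; smaller inputs cost proportionally less. `monthly_cost_range` is a hypothetical helper.

```python
def monthly_cost_range(
    loops_per_target: int,
    targets_per_day: int,
    cost_per_iteration: tuple[float, float],
    days: int = 30,
) -> tuple[int, float, float]:
    """Return (iterations, low_cost, high_cost) for one month of usage."""
    iterations = loops_per_target * targets_per_day * days
    low, high = cost_per_iteration
    return iterations, iterations * low, iterations * high


iterations, low, high = monthly_cost_range(3, 5, (2.0, 10.0))
print(f"{iterations} iterations/month: ${low:,.0f} to ${high:,.0f}")
# prints "450 iterations/month: $900 to $4,500"
```

Plugging in your own per-iteration cost and run frequency gives a quick ceiling to compare against your budget.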

Verdict

Use if: You're conducting security reconnaissance on targets with substantial existing domain data (50+ known subdomains), have API budget to spare or can run quality local models, and need creative pattern discovery beyond what wordlist-based tools provide. CewlAI shines when targets use non-obvious naming conventions such as mythology references, internal project codenames, or multilingual patterns that traditional tools miss. It's particularly valuable in bug bounty hunting, where discovering one unmonitored subdomain can mean the difference between a bounty and nothing.

Skip if: You're working with limited seed data (fewer than 10-20 domains, since the AI needs patterns to learn from), require validated results without additional tooling, have strict budget constraints without local LLM infrastructure, or need guaranteed reproducibility (LLM outputs vary between runs).

For most reconnaissance workflows, use CewlAI as a supplementary creative tool alongside traditional enumeration, not as a replacement. Run subfinder and amass first to build a seed list, expand it with CewlAI, then validate results with dnsx or httpx.
