Kiterunner: The API Discovery Tool That Speaks Swagger

Hook

Traditional directory brute-forcers are asking "Does /api/users exist?" while modern APIs are answering "Only if you POST with a valid JSON body and Accept: application/json header." That's the reconnaissance gap Kiterunner was built to close.

Context

For years, security researchers and penetration testers relied on tools like DirBuster, Gobuster, and ffuf to discover hidden web content. The playbook was simple: throw thousands of path names at a target and see what returns a 200 status code. This worked beautifully for the PHP and static HTML era, where /admin.php either existed or it didn't.

But modern web applications don't work that way. REST APIs built with Express, Flask, Django REST Framework, or Spring Boot have explicit route definitions that require specific HTTP methods, headers, content types, and parameter structures. A request to GET /api/v2/users might return 404, while GET /api/v2/users?id=1 returns data. Traditional tools would miss this entirely because they're path-aware but not route-aware.

Kiterunner emerged from Assetnote's need to discover API endpoints during attack surface assessments, recognizing that the wealth of publicly available Swagger/OpenAPI specifications on GitHub and across the internet could be weaponized for reconnaissance. Instead of guessing paths, why not learn from thousands of real API implementations?

Technical Insight

Kiterunner's architecture separates data compilation from scanning execution, a design choice that trades upfront processing for runtime performance. The tool introduces its own .kite format: essentially a compiled, indexed representation of API routes extracted from Swagger specifications. When you download the routes-large.kite wordlist, you're getting the distilled patterns from internet-wide scans, GitHub's BigQuery dataset of public repositories, and curated sources like APIs.guru.

The compilation process transforms verbose JSON/YAML Swagger specs into a compact binary format that includes HTTP methods, route templates with parameter placeholders, common headers, and example values. During scanning, Kiterunner doesn't just request /api/users—it requests GET /api/users, POST /api/users, GET /api/users/{id}, and DELETE /api/users/{id} with appropriate headers and body content.
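To make the expansion step concrete, here is a minimal Python sketch of how route templates from a Swagger-style document could be turned into concrete requests. It is not Kiterunner's code (the tool is written in Go); the SPEC fragment, the EXAMPLE_VALUES table, and the expand_routes helper are all hypothetical names invented for illustration.

```python
import re

# Hypothetical, trimmed-down OpenAPI fragment; real specs carry far more detail.
SPEC = {
    "paths": {
        "/api/users": {"get": {}, "post": {}},
        "/api/users/{id}": {"get": {}, "delete": {}},
    }
}

# Assumed placeholder values; a real wordlist would carry typed examples.
EXAMPLE_VALUES = {"id": "1"}

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def expand_routes(spec, examples):
    """Yield (METHOD, concrete_path) pairs for every operation in the spec."""
    for template, operations in spec["paths"].items():
        # Replace each {name} placeholder with an example value, or "1" as a fallback.
        path = re.sub(r"\{(\w+)\}", lambda m: examples.get(m.group(1), "1"), template)
        for method in operations:
            if method in HTTP_METHODS:
                yield method.upper(), path


requests_to_send = list(expand_routes(SPEC, EXAMPLE_VALUES))
for method, path in requests_to_send:
    print(method, path)
```

Each path template fans out into one request per declared method, which is the difference between asking about a path and asking about a route.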

Here's what a basic scan looks like:

# Basic API-aware scan
kr scan https://example.com -w routes-large.kite

# Scan with specific wordlist and output
kr scan https://api.target.com -w routes-small.kite -o json \
  --fail-status-codes 404,403 \
  --max-parallel-hosts 5

# Brute-force mode with a plain-text wordlist and custom headers
kr brute https://example.com/api -w wordlist.txt \
  -H "Authorization: Bearer token123" \
  -x 20 --delay 100ms

The tool implements sophisticated wildcard detection to avoid false positives. Many modern web frameworks return 200 status codes with a generic "not found" page or redirect everything to a SPA (Single Page Application) entry point. Kiterunner sends probe requests at the beginning of a scan to establish a baseline response profile. It measures response lengths, status codes, and content patterns, then automatically quarantines hosts that appear to accept everything. This prevents your scan results from being polluted with thousands of meaningless hits.
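The baseline idea can be sketched in a few lines of Python. This is an illustration of the general technique under stated assumptions, not Kiterunner's actual heuristics: the Response type, the (status, length) signature, and the tolerance parameter are all hypothetical, and no network traffic is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    status: int
    length: int  # body length in bytes


def build_baseline(probe_responses):
    """Collect the (status, length) signatures seen for paths that should not exist."""
    return {(r.status, r.length) for r in probe_responses}


def is_wildcard_host(probe_responses):
    """Treat a host as accept-everything if every random probe came back 2xx."""
    return all(200 <= r.status < 300 for r in probe_responses)


def is_interesting(response, baseline, tolerance=0):
    """A hit is interesting only if it differs from every baseline signature."""
    return not any(
        response.status == status and abs(response.length - length) <= tolerance
        for status, length in baseline
    )


# Simulated responses to three random, nonexistent paths on a SPA-style host.
probes = [Response(200, 1532), Response(200, 1532), Response(200, 1532)]
baseline = build_baseline(probes)

wildcard = is_wildcard_host(probes)
same_as_baseline = is_interesting(Response(200, 1532), baseline)
distinct_hit = is_interesting(Response(401, 87), baseline)
print(wildcard, same_as_baseline, distinct_hit)
```

A response that matches the baseline signature is discarded, while a 401 with a different body length stands out as a route that the application actually handles.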

One of Kiterunner's most useful features is its replay capability. When a scan surfaces an interesting endpoint, the kr kb replay command takes the result line from the scan output, looks the route up in the wordlist that produced it, and resends the request with the context that triggered the discovery:

# Replay a discovered route by pasting its full result line from the scan output
kr kb replay -w routes.kite "<result line copied from kr scan output>"

The scanning engine parallelizes at both the host and route level. You can scan multiple targets simultaneously while each target is being probed with multiple routes concurrently. This is controlled through the --max-parallel-hosts and -x (parallelism per host) flags. The Go concurrency model makes this efficient—goroutines handle each request independently while channel-based coordination prevents race conditions.
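The two-level limit can be illustrated with a short Python asyncio sketch. Kiterunner itself uses goroutines and channels; this analogue uses semaphores instead, and the constants, the fake_request stand-in, and the host names are assumptions made for the example. No requests are actually sent.

```python
import asyncio

MAX_PARALLEL_HOSTS = 2    # plays the role of --max-parallel-hosts
PER_HOST_PARALLELISM = 3  # plays the role of -x


async def fake_request(host, route):
    """Stand-in for an HTTP request; no network traffic is sent."""
    await asyncio.sleep(0)
    return host, route


async def scan_host(host, routes, host_slots):
    async with host_slots:  # limits how many hosts are scanned at once
        route_slots = asyncio.Semaphore(PER_HOST_PARALLELISM)

        async def probe(route):
            async with route_slots:  # limits in-flight requests for this host
                return await fake_request(host, route)

        return await asyncio.gather(*(probe(r) for r in routes))


async def scan_all(hosts, routes):
    host_slots = asyncio.Semaphore(MAX_PARALLEL_HOSTS)
    per_host = await asyncio.gather(*(scan_host(h, routes, host_slots) for h in hosts))
    return [result for host_results in per_host for result in host_results]


results = asyncio.run(
    scan_all(["a.example", "b.example", "c.example"], ["/users", "/orders"])
)
print(len(results))
```

The outer semaphore caps concurrent hosts and the inner one caps concurrent routes per host, so total in-flight requests never exceed the product of the two settings.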

Under the hood, Kiterunner uses a custom HTTP client with connection pooling and keepalive optimization. Unlike tools that create a new TCP connection for every request, Kiterunner reuses connections when scanning the same host, significantly reducing overhead. The codebase also implements smart retry logic with exponential backoff for transient failures, and respects rate limiting through configurable delay and throttle parameters.
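As a rough picture of what retry with exponential backoff means, here is a small Python sketch. The base delay, growth factor, cap, and retry count are illustrative assumptions rather than Kiterunner's defaults, and with_retries and backoff_delays are hypothetical helpers.

```python
import time


def backoff_delays(base=0.1, factor=2.0, retries=4, cap=5.0):
    """Return the wait (in seconds) before each retry, growing up to a cap."""
    return [min(cap, base * factor ** attempt) for attempt in range(retries)]


def with_retries(send, retries=4, base=0.1, sleep=time.sleep):
    """Call send() until it succeeds or the retry budget is spent."""
    delays = backoff_delays(base=base, retries=retries)
    for attempt in range(retries + 1):
        try:
            return send()
        except ConnectionError:
            if attempt == retries:
                raise  # budget exhausted: surface the last failure
            sleep(delays[attempt])


# Simulated transient failure: the first two calls fail, the third succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return 200


waits = []  # record the waits instead of actually sleeping
status = with_retries(flaky, sleep=waits.append)
print(status, waits)
```

Passing the sleep function as a parameter keeps the sketch testable without real delays; a scanner would combine this with a reused connection so that retries do not pay for a new TCP handshake each time.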

The .kite format itself is worth understanding. The specification isn't publicly documented, but reverse engineering suggests it behaves like a serialized trie (prefix tree) structure optimized for route lookups. Routes with common prefixes share storage, and parameter placeholders are tokenized for efficient substitution. This would explain why routes-large.kite, at 2.6GB decompressed, can contain hundreds of thousands of route variations: it stores patterns and templates, not just flat strings.
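Assuming the prefix-tree description above is accurate, the storage saving is easy to see in a toy Python version. This nested-dict trie is only an illustration of the data structure; it says nothing about the real on-disk layout, and the "_methods" key and "{}" token are conventions invented for the sketch.

```python
def insert(trie, path, method):
    """Insert a route template into a nested-dict trie keyed by path segment."""
    node = trie
    for segment in path.strip("/").split("/"):
        # Collapse any {placeholder} into one token so templates share a node.
        key = "{}" if segment.startswith("{") and segment.endswith("}") else segment
        node = node.setdefault(key, {})
    node.setdefault("_methods", set()).add(method)


def count_nodes(trie):
    """Count segment nodes, ignoring the method sets stored alongside them."""
    return sum(1 + count_nodes(child) for key, child in trie.items() if key != "_methods")


routes = [
    ("GET", "/api/v2/users"),
    ("POST", "/api/v2/users"),
    ("GET", "/api/v2/users/{id}"),
    ("DELETE", "/api/v2/users/{userId}"),
    ("GET", "/api/v2/orders"),
]

trie = {}
for method, path in routes:
    insert(trie, path, method)

# Segments stored if every route were kept as a flat string vs. shared in the trie.
flat_segments = sum(len(p.strip("/").split("/")) for _, p in routes)
print(flat_segments, count_nodes(trie))
```

Five routes that would need 17 path segments as flat strings collapse into 5 shared nodes, and the two differently named ID placeholders end up on the same node.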

Gotcha

The biggest friction point with Kiterunner is wordlist management. Route-aware scanning depends on the custom .kite format, so a text file downloaded from SecLists only gets you path-only brute forcing, not method- and header-aware requests. If you want to compile custom routes (say, from your organization's internal Swagger specs or from a specific framework you're researching), you need to use the kr kb compile command to convert JSON/YAML into the binary format. The compilation process isn't well-documented, and the schema for input files is learned primarily through example. This creates a barrier for teams that want to maintain living, frequently updated wordlists as part of their CI/CD pipeline.

The tool's resource requirements can also surprise you. The routes-large.kite file is 2.6GB decompressed, and loading it into memory for scanning can consume significant RAM, especially when running parallel scans against multiple targets. On memory-constrained VPS instances or when running Kiterunner inside containers with resource limits, you might need to use smaller wordlists or adjust parallelism settings to avoid OOM errors.

Additionally, while Kiterunner excels at discovering endpoints that match known patterns, it's fundamentally limited by its source data. If a target uses completely novel API conventions or custom routing patterns not represented in public Swagger specs, the tool will miss them. It's pattern-matching, not truly intelligent discovery: there's no machine learning or adaptive probing that learns from the target's responses to predict new routes.

Verdict

Use if: You're conducting reconnaissance against modern web applications with REST APIs, especially when targeting known frameworks like Express, Flask, Rails, Django REST, or Spring Boot. Kiterunner shines when you need to discover endpoints on attack surfaces where traditional directory brute-forcing produces nothing useful, and when you're working with targets that require proper HTTP method and header combinations to respond meaningfully. It's particularly valuable for bug bounty hunters and penetration testers who scan diverse targets where API route conventions aren't immediately obvious.

Skip if: You're working with simple static sites or traditional LAMP stack applications where path-based tools like ffuf or feroxbuster are sufficient and faster. Also skip if you need real-time wordlist customization without pre-compilation overhead, have severe storage or bandwidth constraints that make multi-gigabyte wordlists impractical, or require features beyond discovery (like automated exploit chaining or vulnerability detection). If you're scanning targets with truly custom or proprietary API patterns that don't match public Swagger conventions, consider complementing Kiterunner with manual reconnaissance or parameter discovery tools like Arjun.
