Teaching Claude to Hack: A Knowledge Base of 88,636 Real-World Security Breaches
Hook
What if you could give an AI assistant the collective memory of nearly 90,000 real security breaches from China’s most influential vulnerability platform? That’s exactly what WooYun Legacy does—and it raises fascinating questions about how we augment LLMs with domain expertise.
Context
Between 2010 and 2016, WooYun operated as China’s premier security vulnerability disclosure platform, amassing detailed case studies of real-world breaches across telecommunications, banking, e-commerce, and critical infrastructure. When it shut down in 2016, the security community lost access to an irreplaceable corpus of exploitation techniques, vulnerability patterns, and security testing methodologies derived from actual penetration tests. This wasn’t theoretical OWASP guidance—these were detailed postmortems of successful attacks against production systems.
WooYun Legacy preserves this knowledge as a Claude Code Skill, packaging the entire archive into an 86MB structured knowledge base. Rather than building a traditional search interface or database, the project takes a novel approach: it formats the data as context material that Claude can reference when analyzing security questions. This transforms an LLM into a security consultant with instant access to historical vulnerability patterns, complete with exploitation chains, bypass techniques, and industry-specific attack methodologies. It’s prompt engineering meets security archaeology.
Technical Insight
The architecture is deceptively simple but cleverly designed for LLM consumption. The repository organizes 88,636 vulnerability cases into three hierarchical layers. The knowledge/ directory contains methodology documents that define security testing frameworks—these are meta-level guides that teach Claude how to approach security analysis. The categories/ directory breaks down actual vulnerability cases by type: SQL injection (27% of cases), command execution (19%), XSS (11%), and twelve other categories. Finally, examples/ provides industry-specific penetration testing playbooks for sectors like telecommunications and banking.
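To make the proportions concrete, here is a minimal sketch that converts the quoted percentages into approximate per-category case counts. Only the three shares above appear in the article; the per-category counts and the directory names are derived for illustration, not read from the actual repository.

```python
# Approximate per-category case counts derived from the shares quoted above.
# The remaining twelve categories split whatever is left over.
TOTAL_CASES = 88_636
shares = {
    "sql-injection": 0.27,
    "command-execution": 0.19,
    "xss": 0.11,
}

counts = {name: round(TOTAL_CASES * share) for name, share in shares.items()}
for name, count in counts.items():
    print(f"categories/{name}/: ~{count:,} cases")

# Everything else falls into the twelve remaining categories.
print(f"other categories: ~{TOTAL_CASES - sum(counts.values()):,} cases")
```

Rounding aside, 27% of 88,636 works out to roughly 24,000 SQL injection cases, which matches the figure cited later for that category.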
What makes this interesting from an engineering perspective is how it structures knowledge for AI retrieval. Instead of relying on semantic search or vector databases, it uses Claude’s native context window. Here’s how a typical security analysis might reference the knowledge base:
# Conceptual example of how Claude processes a query with WooYun context
query = "How would you test a Chinese banking app for SQL injection?"
# Claude internally accesses:
# 1. knowledge/sql-injection-methodology.md (testing framework)
# 2. categories/sql-injection/*.md (~24,000 real cases)
# 3. examples/banking-pentest.md (sector-specific attack surfaces)
response = """
Based on 3,847 banking SQL injection cases in the knowledge base:
1. Focus on these high-value endpoints (from real breaches):
- Account balance queries: /api/balance?userid=
- Transaction history: /txn/list?account=&startdate=
- Fund transfer verification: /transfer/verify?from=&to=
2. Test these Chinese banking-specific bypass patterns:
- GBK encoding exploitation: ?id=1%df' union select...
- State-owned bank WAF bypass: Use /*! MySQL comment syntax */
- Mobile banking API: Check for unvalidated JSON parameters
3. Historical success patterns show:
- 67% of cases exploited transaction logging queries
- 23% used second-order injection in fund transfer flows
- WAF bypass rate increased 3x using chunked encoding
"""
The genius is in the preprocessing. Each vulnerability case follows a standardized structure: affected system details, vulnerability type classification, exploitation steps with actual payloads, and remediation recommendations. This consistency lets Claude pattern-match across thousands of cases to identify commonalities. When you ask about SQL injection in e-commerce systems, it can synthesize insights from hundreds of similar historical breaches.
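A case file following that standardized structure might look like the sketch below. The identifier, field names, and payloads are hypothetical, composed to illustrate the four-part layout rather than copied from an actual archive entry.

```markdown
# WOOYUN-2013-XXXXX: SQL injection in <affected system>

## Affected System
Vendor, product, version, and the exposed endpoint.

## Vulnerability Type
SQL injection (error-based)

## Exploitation Steps
1. Probe the parameter: `?id=1'`
2. Confirm injection: `?id=1' AND 1=1-- -` vs. `?id=1' AND 1=2-- -`
3. Extract data: `?id=1' UNION SELECT username, password FROM users-- -`

## Remediation
Parameterized queries, input validation, least-privilege database accounts.
```

Because every case shares these headings, a query about one vulnerability class effectively pattern-matches against thousands of structurally identical documents.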
The industry-specific playbooks are particularly sophisticated. The telecommunications penetration testing guide, for example, doesn’t just list vulnerabilities—it maps complete attack chains observed in real breaches. It might describe how attackers progressed from an exposed SMS gateway API to internal billing systems, then to subscriber databases, with actual parameter names and exploitation sequences documented from genuine incidents. This is operational security intelligence, not generic testing checklists.
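One way a playbook might encode such a chain is as an ordered sequence of pivot steps. The stages below paraphrase the SMS-gateway example from the text; the technique descriptions are assumptions for illustration, not drawn from a real case file.

```python
# Hypothetical encoding of a telecom attack chain as ordered pivot steps,
# paraphrasing the SMS-gateway example described above.
attack_chain = [
    {"stage": 1, "target": "exposed SMS gateway API",
     "technique": "unauthenticated send endpoint"},
    {"stage": 2, "target": "internal billing system",
     "technique": "reused gateway service credentials"},
    {"stage": 3, "target": "subscriber database",
     "technique": "SQL injection in billing lookup"},
]

for step in attack_chain:
    print(f"[{step['stage']}] {step['target']} via {step['technique']}")
```

Representing chains as data rather than prose is what lets the model compare pivot patterns across incidents instead of treating each breach as an isolated story.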
One interesting technical decision is the use of markdown for all knowledge files. This makes the content easily parsable by Claude while remaining human-readable for auditing. Each category directory contains hundreds of individual markdown files rather than one massive database dump, which improves Claude’s ability to retrieve relevant context without overwhelming the token budget.
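The payoff of many small files can be sketched as a naive keyword-overlap file selector: pull in the handful of most relevant case files instead of the whole 86MB corpus. A real Claude Code Skill leaves retrieval to the model itself, so the scoring function, file names, and contents here are purely illustrative.

```python
# Naive sketch: rank knowledge files by keyword overlap with the query
# and keep only the top few, preserving token budget.
def select_context_files(query: str, files: dict[str, str], top_k: int = 3) -> list[str]:
    query_terms = set(query.lower().split())

    def score(item: tuple[str, str]) -> int:
        _, body = item
        return sum(term in body.lower() for term in query_terms)

    ranked = sorted(files.items(), key=score, reverse=True)
    return [name for name, _ in ranked[:top_k]]

# Hypothetical mini-corpus standing in for thousands of markdown files.
corpus = {
    "categories/sql-injection/case-0001.md": "error-based sql injection in banking portal",
    "categories/xss/case-0042.md": "stored xss in forum signature field",
    "examples/banking-pentest.md": "banking sector sql injection methodology",
}
print(select_context_files("sql injection banking", corpus, top_k=2))
```

Small files make this kind of selection cheap; a single monolithic dump would force an all-or-nothing context decision.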
The skill also includes vulnerability-specific parameter dictionaries—massive lists of common parameter names observed in real attacks. For SQL injection, you’ll find collections like id, userid, page, cat, itemid extracted from actual successful exploits. For XSS, there are exhaustive lists of input fields that historically lacked sanitization. These aren’t theoretical attack surfaces; they’re empirical data about where real systems actually failed.
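Such a dictionary lends itself to a simple triage pass: flag any URL query parameter whose name appears in the historically-injectable list. The five parameter names below are the ones quoted above; treating them as a complete dictionary is an assumption for the sake of a runnable sketch.

```python
from urllib.parse import parse_qs, urlparse

# Tiny excerpt of a SQL-injection parameter dictionary; the real skill
# ships far larger lists mined from actual breaches.
SQLI_PARAMS = {"id", "userid", "page", "cat", "itemid"}

def flag_risky_params(url: str) -> list[str]:
    """Return query parameter names that match the historical dictionary."""
    params = parse_qs(urlparse(url).query)
    return sorted(name for name in params if name.lower() in SQLI_PARAMS)

print(flag_risky_params("https://shop.example/item?itemid=7&ref=home&userid=42"))
```

The value of the empirical list is prioritization: testers probe the parameters that actually failed in past breaches first, rather than fuzzing every input equally.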
Gotcha
The elephant in the room is data freshness. This is a 2010-2016 snapshot of security vulnerabilities, frozen in time like a cybersecurity fossil record. Modern security controls have evolved significantly—what worked against a 2013 PHP application may fail spectacularly against a 2024 Next.js app behind Cloudflare WAF. The knowledge base has zero coverage of cloud-native architectures, containerization security, serverless vulnerabilities, or any CVEs from the past eight years. If you’re testing a Kubernetes cluster or hunting for SSRF in AWS metadata endpoints, this historical context won’t help much.
There’s also a significant cultural and linguistic context barrier. The vulnerability cases reference specific Chinese companies, government systems, and infrastructure that Western developers may never encounter. Banking examples assume knowledge of Chinese financial systems, telecom cases reference China Mobile/China Unicom architectures, and exploitation techniques sometimes rely on GBK encoding quirks irrelevant outside Chinese-language applications. The use case is narrow: authorized security testing of Chinese technology stacks or historical security research.
Finally, there’s the ethical dimension that can’t be ignored. This repository contains detailed, step-by-step exploitation guides for critical infrastructure sectors. While the original WooYun platform operated under a responsible-disclosure model, packaging this knowledge for an AI assistant creates new risks. There’s no authentication layer, no verification that users are authorized to perform security testing, and no logging of how the knowledge is used. It’s a powerful tool that assumes responsible use, which is a significant limitation in itself.
Verdict
Use if: You’re conducting authorized penetration testing of Chinese infrastructure, researching historical vulnerability patterns in specific sectors, or training security teams on real-world case studies. The knowledge base excels at teaching security thinking through empirical breach data and provides unparalleled insight into the mid-2010s Chinese cybersecurity landscape.

Skip if: You need current vulnerability intelligence for modern cloud-native applications, work primarily outside Chinese technology ecosystems, or lack clear legal authorization for security testing. The dated timeframe and geographic specificity make this a specialized research tool rather than a general-purpose security assistant. For contemporary security work, pair it with current CVE databases, the OWASP Testing Guide, and the MITRE ATT&CK knowledge base.