WhiteChapel: Building a Centralized Password Cracking Pipeline with ElasticSearch and Redis

Hook

Password cracking at scale isn't about CPU power—it's about data management. When your red team cracks the same hash three times across different engagements because nobody searched the database first, you need WhiteChapel.

Context

Password cracking has always been a resource-intensive problem, but the real bottleneck for professional penetration testing teams isn't computational; it's organizational. A single red team engagement can yield anywhere from thousands to millions of password hashes from Active Directory dumps, web application breaches, or wireless captures. Tools like hashcat and John the Ripper excel at the actual cracking, but they're single-purpose utilities that output flat files. When you're running multiple engagements simultaneously, or when different team members are working with different cracking tools, you end up with password databases scattered across laptops, shared drives, and Slack channels.

WhiteChapel emerged from this operational pain point: the need for a centralized repository where cracked passwords and their corresponding hashes could be stored, searched, and reused across engagements. Instead of re-cracking the hash of a ubiquitous password like "Password1!" for the hundredth time, teams needed a system where querying existing results was instant, where importing massive wordlists didn't block the interface, and where horizontal scaling was built in from day one. Rob Fuller (mubix) built WhiteChapel as a web-based framework that treats password auditing as a data pipeline problem, leveraging ElasticSearch for search performance and Redis for job orchestration.

Technical Insight

[Figure: System architecture, auto-generated]

WhiteChapel's architecture reveals deliberate choices for handling the unique performance characteristics of password auditing workflows. At its core, the application uses Sinatra—Ruby's lightweight web framework—to provide both a web UI and RESTful API endpoints. The storage layer relies on ElasticSearch rather than a traditional relational database, a critical decision that addresses the primary use case: fast hash lookups against billions of entries.

ElasticSearch excels here because password hash queries are exact-match searches, not complex joins or transactions. When you dump 50,000 NTLM hashes from a domain controller, you want to immediately know which ones are already in your database. ElasticSearch's inverted index structure makes these lookups nearly instantaneous, even across datasets containing billions of password-hash pairs accumulated over years of engagements. The framework also gains ElasticSearch's horizontal scaling capabilities for free—as your password database grows, you can add cluster nodes without application changes.
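To make the lookup concrete, here is a minimal sketch of how a batch of dumped hashes could be turned into an exact-match query body. The helper name `hash_lookup_body` is hypothetical, and the field names `hash` and `hash_type` are assumed to be mapped as exact-match (keyword) fields; this is an illustration of the technique, not WhiteChapel's actual query code.

```ruby
# Hypothetical helper: build a search body that matches any of the given
# hashes exactly. Assumes `hash` and `hash_type` are exact-match fields.
def hash_lookup_body(hashes, hash_type:, size: 1000)
  # Normalize so " ABC " and "abc" count as the same hash.
  normalized = hashes.map { |h| h.strip.downcase }.reject(&:empty?).uniq

  {
    size: [normalized.length, size].min,
    query: {
      bool: {
        # Filter context: no relevance scoring, just yes/no matching.
        filter: [
          { term: { hash_type: hash_type } },
          { terms: { hash: normalized } }
        ]
      }
    }
  }
end

body = hash_lookup_body(['8846F7EAEE8FB117AD06BDD830B7586C'], hash_type: 'ntlm')
```

With the official `elasticsearch` gem, a body like this would be passed as `client.search(index: 'passwords', body: body)`. For a dump of 50,000 hashes it is probably safer to send batches (for example with `each_slice(1000)`), since recent Elasticsearch versions cap the number of values in a single terms query at 65,536 by default.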

The asynchronous processing architecture is where WhiteChapel's design shines. Importing a 10 GB wordlist through a web interface would normally time out or block the application. Instead, WhiteChapel uses Resque (a Redis-backed job queue system) to handle imports as background workers:

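require 'digest'
require 'openssl'
require 'time'
require 'elasticsearch'
require 'resque'

class DictionaryImporter
  @queue = :dictionary_import

  # Resque passes arguments through JSON, so metadata arrives with string keys.
  def self.perform(file_path, metadata)
    client = Elasticsearch::Client.new # defaults to localhost:9200

    File.foreach(file_path) do |line|
      password = line.chomp
      # Generate multiple hash types for each password
      hashes = {
        # NTLM is MD4 over the UTF-16LE password. Ruby's Digest has no MD4,
        # and OpenSSL 3.x may need its legacy provider enabled for it.
        ntlm: OpenSSL::Digest.new('MD4').hexdigest(password.encode('UTF-16LE')),
        md5: Digest::MD5.hexdigest(password),
        sha1: Digest::SHA1.hexdigest(password)
      }

      # Store one document per hash so a lookup returns the plaintext directly
      hashes.each do |type, hash|
        client.index(
          index: 'passwords',
          body: {
            hash: hash,
            hash_type: type,
            plaintext: password,
            source: metadata['source'],
            cracked_at: Time.now.utc.iso8601
          }
        )
      end
    end
  end
end

# Enqueue the job from the web controller
Resque.enqueue(DictionaryImporter, '/tmp/rockyou.txt', { 'source' => 'rockyou' })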

This queue-based approach provides several operational advantages. First, multiple workers can process different sections of a wordlist concurrently, either by splitting files or by running multiple import jobs. Second, the web interface remains responsive: users can continue searching and uploading while background jobs run. Third, a job that raises an exception is recorded in Resque's failed queue and can be requeued without re-uploading the wordlist (fully automatic retries require a plugin such as resque-retry). Workers can also be distributed across multiple machines by pointing them at the same Redis instance.
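One detail worth keeping in mind when passing metadata to a Resque job: arguments are stored in Redis as JSON, so a hash enqueued with symbol keys reaches the worker with string keys. The round trip below imitates that serialization with the standard `json` library; it is a small demonstration of the effect, not Resque's internal code.

```ruby
require 'json'

# What the web controller would enqueue.
args = ['/tmp/rockyou.txt', { source: 'rockyou' }]

# What the worker receives after the JSON round trip through Redis.
file_path, metadata = JSON.parse(JSON.generate(args))

symbol_lookup = metadata[:source]   # symbol key no longer exists
string_lookup = metadata['source']  # string key does
```

A job's `perform` method should therefore read `metadata['source']`, or the `source` field will silently end up empty in every indexed document.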

The CLI import tool bypasses HTTP overhead entirely for operational efficiency. Rather than chunking a multi-gigabyte file and POST-ing it through the web server, the CLI directly enqueues jobs in Redis:

# CLI bypasses web server completely
./whitechapel-import --file /data/breach_compilation.txt --workers 8

# Internally, this splits the file and creates worker jobs
# Each worker processes its chunk independently
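The splitting step mentioned in the comments above could be sketched as follows. This is one possible approach under stated assumptions, not WhiteChapel's actual implementation: the helper names `line_aligned_ranges` and `each_line_in_range` are hypothetical. The idea is to cut the file into byte ranges that always end on a line boundary, so that each worker can read its own range without ever seeing half a password.

```ruby
# Hypothetical helper: split a wordlist into at most `parts` contiguous
# [offset, length] byte ranges, each ending on a line boundary.
def line_aligned_ranges(path, parts)
  size = File.size(path)
  return [] if size.zero?

  target = (size.to_f / parts).ceil
  ranges = []
  File.open(path, 'rb') do |f|
    start = 0
    while start < size
      f.seek([start + target, size].min)
      f.gets # advance to the end of the current line (no-op at EOF)
      finish = f.pos
      ranges << [start, finish - start]
      start = finish
    end
  end
  ranges
end

# Hypothetical helper: yield each line (without its newline) in one range.
def each_line_in_range(path, offset, length)
  File.open(path, 'rb') do |f|
    f.seek(offset)
    limit = offset + length
    while f.pos < limit
      line = f.gets
      break if line.nil?
      yield line.chomp
    end
  end
end
```

Each `[offset, length]` pair is a plain array of integers, so it survives JSON serialization and could be passed as job arguments alongside the file path. Note that this only works when every worker can reach the same file, for example on shared storage.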

The data model is intentionally denormalized for query performance. Each password-hash pair is stored as a separate ElasticSearch document rather than normalizing passwords into a separate table. This trades storage efficiency for query speed—when searching for a hash, you get the plaintext password in a single index lookup without joins. For a pentesting tool where disk is cheap but time is expensive, this is the right tradeoff.
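As an illustration of the denormalized shape, the sketch below builds one document per (hash type, hash) pair for a single password, in the action format accepted by the bulk API of the official `elasticsearch` Ruby gem. The helper name `bulk_actions_for` is hypothetical, and the deterministic `_id` is an assumption added here so that re-importing the same wordlist overwrites documents instead of duplicating them. NTLM is left out to keep the example free of MD4 dependencies.

```ruby
require 'digest'

# Hypothetical helper: one bulk "index" action per (hash_type, hash) pair.
def bulk_actions_for(password, source:, index: 'passwords')
  hashes = {
    md5: Digest::MD5.hexdigest(password),
    sha1: Digest::SHA1.hexdigest(password)
  }

  hashes.map do |type, hash|
    {
      index: {
        _index: index,
        # Assumption: a deterministic ID makes repeated imports idempotent.
        _id: "#{type}:#{hash}",
        # The plaintext is duplicated into every document on purpose.
        data: { hash: hash, hash_type: type.to_s, plaintext: password, source: source }
      }
    }
  end
end

actions = bulk_actions_for('password', source: 'rockyou')
```

With a client from the `elasticsearch` gem, a batch of such actions could be sent as `client.bulk(body: actions)`. Batching a few thousand actions per request is generally much faster than issuing one index request per document.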

WhiteChapel also exposes both web upload and API endpoints, enabling integration with cracking tools. The workflow becomes: run hashcat against your hash list, POST successful cracks to WhiteChapel via the API, and have those results immediately available to your entire team. The framework doesn't replace hashcat or John—it orchestrates the results from those tools into a searchable, persistent knowledge base.
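The first half of that glue code, turning hashcat output into records, can be sketched briefly. hashcat's potfile stores unsalted results as `hash:plaintext` lines and wraps plaintexts containing awkward bytes in `$HEX[...]`. The helper name `parse_potfile_line` and the record layout are assumptions for illustration; salted formats with extra colon-separated fields would need different handling.

```ruby
# Hypothetical helper: parse one hashcat potfile line ("hash:plaintext").
# Returns nil for lines that do not have both parts.
def parse_potfile_line(line, hash_type:)
  # Split on the first colon only; the plaintext may contain colons.
  hash, plain = line.chomp.split(':', 2)
  return nil if hash.nil? || hash.empty? || plain.nil?

  # hashcat hex-encodes some plaintexts as $HEX[...]; decode them.
  if (m = plain.match(/\A\$HEX\[(\h*)\]\z/))
    plain = [m[1]].pack('H*').force_encoding('UTF-8')
  end

  { hash: hash.downcase, hash_type: hash_type, plaintext: plain }
end

record = parse_potfile_line('8846f7eaee8fb117ad06bdd830b7586c:password', hash_type: 'ntlm')
```

The resulting records could then be serialized with `JSON.generate` and POSTed to whatever import endpoint your deployment exposes. The endpoint path is not specified in this article, so none is assumed here.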

Gotcha

The elephant in the room is that WhiteChapel appears abandoned, with dependencies dating back nearly a decade. The codebase references Twitter Bootstrap v2, Sinatra 1.x, and ElasticSearch client libraries that predate the major API changes in ElasticSearch 5.x+. Getting this running today would require significant dependency updates, and the ElasticSearch integration in particular would need a complete rewrite given how much that API has evolved. This isn't a "clone and deploy" situation—it's a reference architecture that needs modernization work.

The security model is essentially nonexistent. There's no mention of authentication, authorization, or multi-tenancy. WhiteChapel was explicitly designed for internal use by trusted teams, which means anyone with network access can view your entire password database and see which engagements cracked which passwords. This is acceptable for a segmented internal tool on a red team's private network, but the lack of even basic auth makes accidental exposure dangerous.

The tool also lacks integration code for actually running hashcat or John: it's a result storage system, not an end-to-end cracking platform. You still need to build the glue code that takes cracking output and feeds it into WhiteChapel's API, which is non-trivial for teams expecting turnkey functionality.

Verdict

Use WhiteChapel if you're a penetration testing team or red team that needs to centralize password cracking results across multiple tools and engagements, you have Ruby development resources to modernize dependencies, and you operate in a trusted internal environment where the lack of access controls is acceptable. The architectural patterns around ElasticSearch for hash lookups and Redis queues for async processing remain sound and worth implementing even in a greenfield rebuild.

Skip it if you need a production-ready solution you can deploy immediately, require authentication or multi-tenant isolation (as consulting firms serving multiple clients do), or want integrated cracking tool orchestration rather than just result storage. In those cases, look at CrackQ or Hashtopolis, which provide active maintenance and broader feature sets. The real value of WhiteChapel today is as a reference architecture demonstrating how to solve the password auditing data pipeline problem, not as deployable software.
