Exploiting Exposed Version Control Systems with dvcs-ripper
Hook
Thousands of production web servers accidentally expose their entire Git history—including deleted secrets, API keys, and authentication tokens—simply by failing to block access to a single directory.
Context
Web servers routinely serve application files from directories that also contain version control metadata. A developer might deploy code by cloning a Git repository directly on the server, resulting in a .git directory sitting alongside index.html and other public files. Even when directory listing is disabled, the predictable structure of VCS metadata means attackers can reconstruct entire repositories by requesting specific files.
The .git directory contains objects, refs, pack files, and an index, all stored under deterministic names and paths. If a web server returns these files instead of blocking them, anyone can download the complete source history. dvcs-ripper automates this extraction across multiple version control systems, turning a common misconfiguration into full source code disclosure. The tool is written in Perl by kost, and its README states that it can rip repositories even when directory browsing is turned off.
Technical Insight
dvcs-ripper exploits the deterministic nature of VCS metadata structures. Git repositories create predictable directory hierarchies, and the tool requests files it knows should exist without requiring directory listings.
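To make the idea of predictable paths concrete, here is a minimal sketch of how request URLs can be derived without any directory listing. It is an illustration and not code from dvcs-ripper: the function names are hypothetical, the list of well-known files is a small sample rather than the tool's actual list, and the base URL is a placeholder. The object ID used is Git's well-known ID for an empty blob.

```shell
#!/bin/sh
# Illustrative sample of fixed-name files commonly found in a .git directory
# (not the list dvcs-ripper itself uses).
git_known_paths() {
    printf '%s\n' HEAD config index description packed-refs \
        refs/heads/master objects/info/packs
}

# Map a 40-character object ID to its loose-object path: the first two
# hex digits name the directory, the remaining 38 name the file.
git_object_path() {
    printf 'objects/%.2s/%s\n' "$1" "${1#??}"
}

# Placeholder base URL (assumption, not a real target).
base="http://www.example.com/.git"
for p in $(git_known_paths); do
    echo "$base/$p"
done
echo "$base/$(git_object_path e69de29bb2d1d6434b8b29ae775ad8c2e48c5391)"
```

Each object fetched this way can name further object IDs (commits point to trees, trees to blobs), which is why a handful of fixed entry points is enough to walk a whole repository.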
For Git repositories, rip-git.pl retrieves metadata and objects, then reconstructs the working directory. Here’s basic usage:
# Basic Git repository extraction
rip-git.pl -v -u http://www.example.com/.git/
# Skip SSL certificate verification
rip-git.pl -s -v -u https://www.example.com/.git/
# Download to specific directory with auto-created subdirectory
rip-git.pl -m -o /output/dir -v -u http://www.example.com/.git/
After downloading metadata, the tool automatically executes git checkout -f to reconstruct the working directory from downloaded objects. The -v flag provides verbose output showing retrieval progress.
The SVN implementation detects the working copy format automatically. Subversion changed that format in version 1.7: older clients stored a .svn directory in every subdirectory, while 1.7 and later keep a single .svn directory at the root with a SQLite database (wc.db). The tool detects which format is in use and adjusts extraction accordingly:
# SVN extraction (auto-detects format)
rip-svn.pl -v -u http://www.example.com/.svn/
Once extraction completes, the tool runs svn revert -R . to restore the working copy. Extracting the newer format requires Perl’s DBD::SQLite module, which is used to read the wc.db file.
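The format check can be pictured with a short sketch. It is an assumption about how such detection could work, not a transcription of rip-svn.pl: both function names are hypothetical, and the pristine-file layout shown is the one Subversion 1.7+ working copies use, where wc.db records each file's checksum as $sha1$ followed by the hex digest.

```shell
#!/bin/sh
# Classify a downloaded .svn directory by the metadata it holds.
# Prints "sqlite" when wc.db is present (single-directory format),
# "legacy" when only an entries file exists, "unknown" otherwise.
svn_wc_format() {
    if [ -f "$1/wc.db" ]; then
        echo sqlite
    elif [ -f "$1/entries" ]; then
        echo legacy
    else
        echo unknown
    fi
}

# Turn a wc.db checksum such as '$sha1$<hex>' into the relative path
# of the matching pristine copy inside .svn.
svn_pristine_path() {
    hex=${1##*\$}   # strip everything up to and including the last '$'
    printf 'pristine/%.2s/%s.svn-base\n' "$hex" "$hex"
}
```

With the newer format, one database download yields the checksum of every tracked file, so the remaining requests are a matter of mapping checksums to paths.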
For large repositories, the tool supports Redis-backed job queuing for distributed parallel downloads. Multiple clients can work simultaneously, pulling tasks from a shared Redis instance:
# Create Redis container for job coordination
docker run --name redisdvcs -it -v /data:/data:rw k0st/alpine-redis
# Client 1: -n prevents checkout, -e specifies Redis endpoint, -t sets thread count
docker run -it -v /work:/work:rw k0st/alpine-dvcs-ripper \
rip-git.pl -n -e global.docker.ip -v \
-u http://www.example.org/.git -t 10 -c -m -o /work
# Client 2: Same command on different machine sharing NFS mount
docker run -it -v /work:/work:rw k0st/alpine-dvcs-ripper \
rip-git.pl -n -e global.docker.ip -v \
-u http://www.example.org/.git -t 10 -c -m -o /work
Because every client pulls jobs from the same Redis instance and writes to the same shared directory, the download can be spread across several machines. The documented examples also pass the -c flag alongside the Redis options.
The tool delegates final reconstruction to native VCS clients rather than implementing full object parsing in Perl. This means you must have git, svn, hg, or bzr installed depending on what you’re extracting.
Mercurial and Bazaar support follows similar patterns—fetch metadata files, download objects, then invoke the native client:
# Mercurial extraction
rip-hg.pl -v -u http://www.example.com/.hg/
# Bazaar extraction
rip-bzr.pl -v -u http://www.example.com/.bzr/
The Docker integration provides a reproducible environment. The k0st/alpine-dvcs-ripper image includes necessary dependencies, eliminating manual installation on various platforms.
Gotcha
The README lists a TODO item: ‘Recognize 404 pages which return 200 in SVN/CVS’. Some servers answer requests for non-existent files with an HTTP 200 status and an error page instead of a proper 404, and the tool cannot yet tell the difference for SVN and CVS. According to the README, the issue is recognized for Git but remains unresolved for the other two. Against such a server you may get incomplete downloads or false positives, because error pages are treated as valid metadata files.
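One way to catch fake 200 responses yourself is to check what was downloaded instead of trusting the status code. The sketch below is a manual post-download check, not a feature of dvcs-ripper, and its function names are hypothetical. It relies on two facts: every SQLite database, wc.db included, begins with the bytes "SQLite format 3", and error pages are usually HTML.

```shell
#!/bin/sh
# Succeeds when the file starts with the SQLite header magic.
looks_like_sqlite() {
    [ "$(head -c 15 "$1" 2>/dev/null)" = "SQLite format 3" ]
}

# Succeeds when the first 512 bytes contain typical HTML markup,
# which suggests an error page served with status 200.
looks_like_html() {
    head -c 512 "$1" 2>/dev/null | grep -Eiq '<(!doctype|html|head|body)'
}

# Print a one-word verdict for a file downloaded as wc.db.
check_wc_db() {
    if looks_like_sqlite "$1"; then
        echo valid
    elif looks_like_html "$1"; then
        echo soft-404
    else
        echo unknown
    fi
}
```

Running check_wc_db on the downloaded wc.db before invoking svn gives an early signal that the server is answering every request with the same error page.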
The tool also lacks progress indicators, explicitly listed as a TODO item. For large repositories with thousands of objects, you watch verbose output without knowing completion percentage, making it difficult to estimate time remaining or decide whether to abort.
Because the tool depends on native VCS clients, it is not standalone: the matching git, svn, hg, or bzr binary must be installed, which adds setup work in minimal Docker containers or restricted environments. The tool also won’t automatically check out the working directory in Redis-based distributed mode (the -n flag prevents it), so you must run git checkout -f manually after distributed extraction completes. This is documented in the README but easy to forget, leaving you with just the .git directory and no working files.
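The manual step is small enough to wrap in a helper. This is a minimal sketch assuming git is installed and that the argument is the directory holding the downloaded .git folder; the function name is hypothetical.

```shell
#!/bin/sh
# Rebuild the working tree of a directory that contains only .git metadata.
# $1 is the directory that holds the downloaded .git folder.
restore_worktree() {
    git -C "$1" checkout -f
}
```

Call it once, after every distributed client has finished, with the output directory as the argument. Running it while downloads are still in progress may fail on objects that have not arrived yet.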
Verdict
Use dvcs-ripper if you’re conducting penetration tests or security audits and need to demonstrate the risk of exposed VCS directories. The tool supports Git, SVN, Mercurial, and Bazaar extraction, with advanced features like parallel processing and Redis-based distributed downloads for Git repositories. The Docker integration makes it straightforward to use for assessments without installing dependencies system-wide. Skip it if you need fully automated exploitation that handles edge cases like misleading HTTP responses, or if progress tracking is essential for your workflow. Skip it entirely if you’re on the defensive side—focus instead on proper web server configuration to block VCS directory access and use ignore rules to prevent committing secrets. This is a specialized offensive security tool, not for repository management or general source control workflows.
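For the defensive case, a quick probe of your own site shows whether VCS metadata is reachable. The sketch below is an assumption-laden starting point, not a complete audit: the function names are hypothetical, the base URL in the usage comment is a placeholder, and a 200 status only means "possibly exposed", since some servers return 200 for error pages.

```shell
#!/bin/sh
# Interpret the HTTP status returned for a VCS metadata probe.
classify_vcs_probe() {
    case "$1" in
        200)         echo "possibly exposed" ;;
        401|403|404) echo "blocked" ;;
        *)           echo "inconclusive" ;;
    esac
}

# Request .git/HEAD under the given base URL and report the verdict.
probe_vcs_dir() {
    status=$(curl -s -o /dev/null -w '%{http_code}' "$1/.git/HEAD")
    echo "$1/.git/HEAD: $(classify_vcs_probe "$status")"
}

# Usage (placeholder URL): probe_vcs_dir http://www.example.com
```

The same probe can be repeated for .svn/entries, .hg, and .bzr paths after each deployment.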