GitTools: Reconstructing Source Code from Misconfigured Web Servers
Hook
In 2015, security researchers scanned the Alexa top 1 million websites and found that thousands had their entire source code—including credentials and API keys—accessible to anyone who knew where to look. The smoking gun? Misconfigured .git directories.
Context
When developers deploy web applications, they often clone their Git repository directly onto the production server and build from there. It's convenient, but it introduces a critical vulnerability: if the .git directory sits inside the web root and the server is not configured to block requests to it, attackers can reconstruct the entire codebase without ever compromising the server itself.
GitTools emerged from penetration testing research to exploit exactly this misconfiguration. Unlike traditional attacks that require finding vulnerabilities in application logic, this technique leverages Git's own internal structure against itself. The repository contains three complementary tools: Finder scans domains for exposed .git directories, Dumper downloads repository contents even without directory listings enabled, and Extractor rebuilds commit history from incomplete or corrupted dumps. Together, they form a pipeline for recovering source code that was never meant to be public—and potentially discovering hardcoded secrets, proprietary algorithms, and security vulnerabilities within.
Technical Insight
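The elegance of GitTools lies in how it exploits Git's predictable internal structure. Git stores objects in .git/objects/ using content-addressable storage: each object is identified by the SHA-1 hash of its contents (prefixed with a short type-and-size header), and a loose object is stored zlib-compressed at a path made of the first two hex characters of that hash as a directory and the remaining 38 as the filename. This deterministic naming scheme becomes a liability when exposed to the web.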
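To make the naming scheme concrete, here is a minimal local sketch (no network involved). The function names object_path and blob_id are illustrative and not part of GitTools, and the sketch assumes a SHA-1 repository and a sha1sum binary on the PATH.

```shell
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names, not taken from GitTools).

# Map an object ID to the relative path Git uses for a loose object:
# first two hex characters become the directory, the other 38 the filename.
object_path() {
    local oid="$1"
    printf '.git/objects/%s/%s\n' "${oid:0:2}" "${oid:2}"
}

# Recompute a blob's ID by hand: SHA-1 over the header "blob <size>\0"
# followed by the raw file content.
blob_id() {
    local file="$1" size
    size=$(wc -c < "$file" | tr -d ' ')
    { printf 'blob %s\0' "$size"; cat "$file"; } | sha1sum | cut -d' ' -f1
}
```

For an ordinary file, blob_id should print the same value as git hash-object --no-filters on that file, which is a quick way to convince yourself that object paths are fully determined by content.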
The Dumper tool demonstrates this exploitation perfectly. It starts by downloading .git/HEAD to determine the current branch, then recursively traverses references and objects:
# Simplified example of Dumper's core logic (run inside an empty `git init` directory)
curl -s http://target.com/.git/HEAD
# Returns: ref: refs/heads/master
curl -s http://target.com/.git/refs/heads/master
# Returns: 3c6e0b8a9c15224a8228b9a98ca1531d8b48e136
# Download the commit object into the matching local path
mkdir -p .git/objects/3c
curl -s -f http://target.com/.git/objects/3c/6e0b8a9c15224a8228b9a98ca1531d8b48e136 \
    --output .git/objects/3c/6e0b8a9c15224a8228b9a98ca1531d8b48e136
# Parse the commit object to find tree and parent commits
git cat-file -p 3c6e0b8a9c15224a8228b9a98ca1531d8b48e136
# Returns tree hash, parent hash, author, message...
This recursive process continues, downloading each referenced object until the entire repository structure is reconstructed. Crucially, Dumper doesn't rely on directory listings (which are often disabled on production servers). It only needs to request specific files by their SHA-1 hash, and Git's layout makes those paths predictable for every object that is still stored loose rather than inside a pack file.
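The "follow the references" step can be sketched as two small text filters. This is an illustration of the idea rather than Dumper's actual code; commit_refs and tree_refs are hypothetical names, and both expect the output of git cat-file -p on standard input.

```shell
#!/usr/bin/env bash
# Illustrative filters (hypothetical names, not taken from GitTools).

# Print the hashes a commit points at: its tree and any parents.
# Header lines end at the first blank line; the commit message that
# follows is ignored so text in the message is never mistaken for a header.
commit_refs() {
    awk '/^$/ { exit } $1 == "tree" || $1 == "parent" { print $2 }'
}

# Print the hashes a tree points at. Each line of `git cat-file -p <tree>`
# has the form "<mode> <type> <hash><TAB><name>", so the hash is field 3.
tree_refs() {
    awk '{ print $3 }'
}
```

A typical local use would be git cat-file -p HEAD | commit_refs. Repeating this on every hash it returns (and tree_refs on every tree) walks the whole object graph reachable from a branch tip.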
The Extractor tool handles cases where Dumper's recovery is incomplete. It operates at an even lower level, directly parsing Git's object database:
# Extractor iterates through all objects in .git/objects/
for dir in .git/objects/??/; do
    for obj in "$dir"*; do
        # Object ID = two-character directory name + 38-character filename
        hash="${dir: -3:2}${obj##*/}"
        type=$(git cat-file -t "$hash" 2>/dev/null)
        if [ "$type" = "commit" ]; then
            # Extract this commit's file tree to a separate directory
            mkdir -p "output/$hash"
            git archive "$hash" | tar -xf - -C "output/$hash"
        fi
    done
done
This approach creates a directory for each commit hash, containing the complete file tree at that point in history. While you lose chronological ordering (commits aren't sorted by date), you gain access to every recoverable version of the codebase—perfect for finding secrets that were committed and later removed.
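If you do need a timeline, the commit objects themselves carry the dates. The following is a minimal sketch, assuming a reasonably recent Git (for --batch-all-objects) and a dump where the commit objects are readable; commits_by_date is a hypothetical helper, not something Extractor provides.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name, not taken from GitTools).

# List every commit object in a repository, oldest first, as
# "<committer timestamp> <hash> <subject>". Works on unreachable
# commits too, because it enumerates the object store, not the refs.
commits_by_date() {
    local repo="${1:-.}"
    git -C "$repo" cat-file --batch-all-objects \
            --batch-check='%(objecttype) %(objectname)' |
        awk '$1 == "commit" { print $2 }' |
        while read -r oid; do
            # -s suppresses the diff, so missing trees or blobs don't matter here.
            git -C "$repo" show -s --format='%ct %H %s' "$oid"
        done | sort -n
}
```

The numeric sort on the committer timestamp gives an approximate order only: timestamps are supplied by whoever made the commit, so parent links remain the authoritative record of how commits relate.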
The Finder tool completes the trinity by automating discovery. It performs mass scanning by checking for .git/HEAD across target domains, optionally reading from lists or IP ranges. Its simplicity is deceptive—it's essentially a specialized web crawler that looks for one specific fingerprint:
# Test the pipeline directly; anchor the match so "ref:" must start the line
if curl -s -f http://target.com/.git/HEAD | grep -q '^ref:'; then
    echo "[+] Found: http://target.com/.git/"
fi
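A loose match on "ref:" can be fooled by servers that answer every URL with a 200 and an HTML page. A stricter check on the response body is sketched below; looks_like_git_head is a hypothetical name, and the two accepted shapes (a symbolic ref or a 40-character detached hash) reflect what a SHA-1 repository's HEAD normally contains.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name, not taken from GitTools).

# Succeed only if the given text is a plausible .git/HEAD body:
# either "ref: refs/<something>" or a bare 40-hex object ID.
looks_like_git_head() {
    local body="$1"
    local ref_re='^ref: refs/[^[:space:]]+$'
    local sha_re='^[0-9a-f]{40}$'
    [[ "$body" =~ $ref_re || "$body" =~ $sha_re ]]
}
```

Because shell command substitution strips the trailing newline, the output of a fetch can be passed straight in as the argument.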
The architectural decision to keep these as separate scripts rather than a unified tool is intentional. Penetration testers often need just one piece: Finder for reconnaissance, Dumper when you've already identified a target, or Extractor when working with a partially corrupted dump from another source. The Unix philosophy of composable, single-purpose tools shines here.
One subtle detail in Dumper's implementation is how it handles Git's loose vs. packed objects. Git periodically optimizes storage by compressing multiple objects into pack files. Dumper attempts to download both loose objects and pack files (.git/objects/pack/pack-*.pack), but pack names cannot be derived from the object hashes it already knows; without directory listings they typically have to be learned from an index file such as .git/objects/info/packs, which is not always present. This is a point where many dumps become incomplete, and it is why Extractor exists: it salvages whatever objects were successfully recovered, even if the repository as a whole is corrupted.
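As a rough illustration of the pack side, the sketch below turns the contents of an objects/info/packs file into the relative paths of each pack and its index. It assumes the usual "P pack-<40 hex>.pack" line format of that file, which Git writes when git update-server-info runs; pack_paths is a hypothetical name and this is not Dumper's own parsing code.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name, not taken from GitTools).

# Read objects/info/packs content on stdin and print, for every listed
# pack, the paths of the .pack file and its companion .idx file
# relative to the .git directory. Lines in any other format are skipped.
pack_paths() {
    sed -n 's/^P \(pack-[0-9a-f]\{40\}\)\.pack$/\1/p' |
        while read -r name; do
            printf 'objects/pack/%s.pack\nobjects/pack/%s.idx\n' "$name" "$name"
        done
}
```

On a local repository you can see how much lives in packs versus loose files with git count-objects -v, which helps explain why a dump that only fetched loose objects is missing most of its history.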
Gotcha
GitTools has several practical limitations that aren't immediately obvious. The most significant is incomplete repository recovery. When Git runs garbage collection, it consolidates loose objects into pack files—compressed, delta-encoded archives that require downloading the entire pack file and parsing its binary format. Dumper has basic pack file support, but if the pack file is large or the connection is unstable, you'll end up with an incomplete dump. Worse, there's no resumption capability—interrupt the download and you start from scratch.
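One way to measure how incomplete a dump is, sketched here under the assumption that the dump is at least recognizable to Git as a repository, is to let git fsck report the objects it cannot find. missing_objects is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name, not taken from GitTools).

# Print the IDs of objects that are referenced somewhere in the
# repository but absent from its object store, one per line.
missing_objects() {
    local repo="${1:-.}"
    # fsck exits non-zero on a damaged repository; only its report matters here.
    # LC_ALL=C keeps the "missing <type> <hash>" lines untranslated.
    LC_ALL=C git -C "$repo" fsck --full 2>/dev/null |
        awk '$1 == "missing" { print $3 }' | sort -u
}
```

An empty result does not prove the dump is complete (fsck can only flag objects that something still points at), but a long list is a clear sign that packs or loose objects were lost along the way.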
The Extractor's lack of chronological sorting creates real usability problems. When you extract commits, you get directories named by SHA-1 hash with no indication of when each commit occurred or how they relate to each other. If you're hunting for when a particular vulnerability was introduced, you'll need to manually inspect commit objects and rebuild the timeline yourself. Alternatives like git-dumper sidestep the problem by aiming to recover a working repository, where ordinary git log shows the history in order.
Additionally, GitTools has no stealth features whatsoever. Rapid-fire requests for Git objects are trivially detectable by any decent intrusion detection system, making it unsuitable for red team operations that require operational security.
Verdict
Use if: You're conducting authorized penetration testing or bug bounty hunting and need a lightweight, dependency-free tool to exploit exposed .git directories. GitTools excels at quick assessments during security audits where you've identified a misconfigured server and need to determine the scope of exposure. It's also valuable for educational purposes: reading the shell scripts teaches you exactly how Git's internal object model works.
Skip if: You need guaranteed complete repository reconstruction (git-dumper handles pack files better), you're conducting large-scale reconnaissance that requires stealth (GitTools is extremely noisy), you need to scan for multiple version control systems beyond Git (dvcs-ripper supports SVN, Mercurial, etc.), or, most importantly, you don't have explicit authorization to test the target system. Unauthorized use is illegal in most jurisdictions and ethically indefensible.