EyeWitness: How to Screenshot 10,000 Web Services Without Melting Your Laptop

Hook

If you've ever waited three hours for a reconnaissance tool to screenshot 5,000 URLs only to have it crash at 87% completion with no way to resume, you understand why EyeWitness exists.

Context

Traditional web reconnaissance follows a predictable pattern: run Nmap against a network range, get back hundreds or thousands of HTTP/HTTPS services, then manually browse through them looking for interesting applications. This approach doesn't scale. A penetration test against a large organization might discover 3,000+ web services across various ports and subdomains. Opening each one manually is impractical, and you'll miss patterns that only become visible when you can see everything at once.

Early solutions used tools like httpscreenshot or PhantomJS-based scripts, but these suffered from brittle installations, poor error handling, and no integration with existing scanning workflows. They'd crash halfway through large URL lists, forcing you to start over or manually diff what remained. EyeWitness emerged from Red Siege's penetration testing practice as a purpose-built solution for this exact scenario: consume the output of network scanners, reliably capture visual and header information for every web service discovered, identify low-hanging fruit like default credentials, and generate a navigable report that lets you quickly triage thousands of targets. The tool has evolved to handle modern Linux distribution challenges, particularly Python's PEP 668 externally-managed-environment restrictions that broke countless security tools on Kali Linux 2023 and later.

Technical Insight

[Figure: System architecture (auto-generated). Input Sources (URLs, Nmap XML, Nessus XML) -> URL Normalizer -> validated targets -> Task Queue -> URLs distributed -> Worker Thread Pool -> Screenshot Capture (Selenium + Headless Chrome) -> headers + images -> SQLite Database -> query results -> HTML Report Generator]

EyeWitness's architecture centers on three key components: input normalization, threaded screenshot workers, and SQLite-backed persistence. The input layer accepts text files with URLs, Nmap XML output, or Nessus XML files, parsing them into a normalized queue of targets. This integration point is crucial—it means you can chain nmap -oX scan.xml directly into ./EyeWitness.py -x scan.xml without writing glue scripts to extract URLs.
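To make the normalization step concrete, here is a minimal sketch of turning Nmap XML into a list of URLs. It is not EyeWitness's parser: the function name `urls_from_nmap_xml`, the `HTTPS_HINTS` set, and the rule "any open port whose service name contains http" are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: service names treated as TLS-wrapped; the real tool's rules may differ.
HTTPS_HINTS = {"https", "ssl/http", "https-alt"}

def urls_from_nmap_xml(xml_text):
    """Return a sorted list of URLs for open ports whose service looks like HTTP(S)."""
    root = ET.fromstring(xml_text)
    urls = set()
    for host in root.iter("host"):
        addr_el = host.find("address")
        if addr_el is None:
            continue
        addr = addr_el.get("addr")
        for port in host.iter("port"):
            state = port.find("state")
            service = port.find("service")
            if state is None or state.get("state") != "open" or service is None:
                continue
            name = service.get("name", "")
            if "http" not in name:
                continue
            # Nmap marks TLS-wrapped services with tunnel="ssl".
            tunneled = service.get("tunnel") == "ssl"
            scheme = "https" if tunneled or name in HTTPS_HINTS else "http"
            urls.add(f"{scheme}://{addr}:{port.get('portid')}")
    return sorted(urls)

SAMPLE = """<nmaprun>
  <host>
    <address addr="10.0.0.5" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80"><state state="open"/><service name="http"/></port>
      <port protocol="tcp" portid="8443"><state state="open"/><service name="http" tunnel="ssl"/></port>
      <port protocol="tcp" portid="22"><state state="open"/><service name="ssh"/></port>
      <port protocol="tcp" portid="8080"><state state="closed"/><service name="http-proxy"/></port>
    </ports>
  </host>
</nmaprun>"""

print(urls_from_nmap_xml(SAMPLE))
```

Using a set before sorting deduplicates targets that appear more than once in a scan, which matters when several scan files are merged.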

The screenshot engine uses Selenium WebDriver with headless Chromium, not the simpler but less reliable options like PhantomJS or webkit2png. This choice trades some startup overhead for vastly better JavaScript rendering and modern web standard support. The tool spawns worker threads (defaulting to 2x your CPU core count, capped at 20) that each manage their own WebDriver instance. Here's the core threading pattern:

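# Simplified version of EyeWitness's worker pattern (a sketch, not the project's code)
import queue
import urllib.error
import urllib.request
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def create_driver():
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-dev-shm-usage')
    driver = webdriver.Chrome(options=chrome_options)
    driver.set_page_load_timeout(10)  # set once per driver rather than per URL
    return driver

def fetch_headers(url, timeout=10):
    # Selenium does not expose response headers, so this sketch issues a
    # separate request for them.
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return dict(resp.headers)
    except urllib.error.HTTPError as err:
        return dict(err.headers)  # 4xx/5xx responses still carry headers

def worker(url_queue, results_db):
    # results_db stands in for any object with store() and store_error() methods
    driver = create_driver()
    try:
        while True:
            try:
                url = url_queue.get_nowait()
            except queue.Empty:
                break  # queue drained; checking empty() before get() would race
            try:
                driver.get(url)
                screenshot = driver.get_screenshot_as_png()
                headers = fetch_headers(url)
                results_db.store(url, screenshot, headers)
            except Exception as e:
                results_db.store_error(url, str(e))
            finally:
                url_queue.task_done()  # exactly one task_done() per successful get()
    finally:
        driver.quit()  # always release the browser, even if the loop fails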
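The snippet above shows a single worker but not how the pool is started. The following sketch fills in that step under stated assumptions: `run_workers` and `recording_worker` are hypothetical names, and the stand-in worker only records URLs instead of driving a browser so the example runs without Selenium installed.

```python
import queue
import threading

def run_workers(urls, worker_fn, n_threads=4):
    """Fill a queue with URLs and drain it with n_threads worker threads."""
    url_queue = queue.Queue()
    for url in urls:
        url_queue.put(url)
    threads = [
        threading.Thread(target=worker_fn, args=(url_queue,), daemon=True)
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    url_queue.join()  # returns once task_done() has been called for every URL
    for t in threads:
        t.join()

def recording_worker(results, lock):
    """Build a stand-in worker that appends each URL to a shared list."""
    def _worker(url_queue):
        while True:
            try:
                url = url_queue.get_nowait()
            except queue.Empty:
                return
            try:
                with lock:  # protect the shared list across threads
                    results.append(url)
            finally:
                url_queue.task_done()
    return _worker

results, lock = [], threading.Lock()
targets = [f"http://host{i}" for i in range(50)]
run_workers(targets, recording_worker(results, lock), n_threads=4)
print(len(results))
```

In a real scan each worker would own one WebDriver instance, so `n_threads` is effectively the number of concurrent Chrome processes.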

The adaptive resource management deserves attention. EyeWitness monitors memory usage and will throttle thread spawning if system memory crosses 85% utilization. This prevents the death spiral where spawning 20 Chrome instances exhausts RAM, triggers the OOM killer, and takes down your entire scan. The SQLite database isn't just for final reporting; it's also the resumption mechanism. Every completed URL is written to the database immediately, so if the scan crashes, is interrupted, or has to be killed, rerunning with the same output directory skips already-processed URLs.
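The resume behavior can be illustrated with a small sketch using Python's built-in `sqlite3` module. The table name `results`, its columns, and the helper functions are assumptions made for this example; EyeWitness's actual schema is not shown in this article.

```python
import sqlite3

def open_results_db(path):
    """Open (or create) the results database with a hypothetical one-table schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "url TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    return conn

def mark_done(conn, url, status="ok"):
    # Commit per URL so a crash loses at most the target that was in flight.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO results (url, status) VALUES (?, ?)",
            (url, status),
        )

def pending_urls(conn, all_urls):
    """Return the URLs from all_urls that have no row in the database yet."""
    done = {row[0] for row in conn.execute("SELECT url FROM results")}
    return [u for u in all_urls if u not in done]

conn = open_results_db(":memory:")
targets = ["http://a", "http://b", "http://c"]
mark_done(conn, "http://a")
mark_done(conn, "http://c", "error")
print(pending_urls(conn, targets))
```

Committing after every URL costs some write throughput, but screenshot capture is far slower than a single-row insert, so the trade favors durability.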

The virtual environment isolation addresses a real problem that broke EyeWitness installations across Kali Linux updates. Modern Debian-based systems implement PEP 668, which prevents pip install into system Python to avoid conflicts with apt-managed packages. The setup script now creates an isolated venv:

# From setup.sh
python3 -m venv "${PWD}/Python"
source "${PWD}/Python/bin/activate"
pip install --upgrade pip
pip install -r requirements.txt

# Install Google Chrome system-wide via apt, then unpack ChromeDriver into the venv's bin/
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
apt install ./google-chrome-stable_current_amd64.deb
# Note: this legacy ChromeDriver endpoint only publishes versions up to 114,
# so the driver it returns may not match a current google-chrome-stable.
wget https://chromedriver.storage.googleapis.com/$(curl -s https://chromedriver.storage.googleapis.com/LATEST_RELEASE)/chromedriver_linux64.zip
unzip chromedriver_linux64.zip -d "${PWD}/Python/bin/"

The default credential detection uses signature matching against common default credential patterns in page titles, body content, and HTML source. When it finds a title like "Apache Tomcat/7.0.70" combined with a login form, it attempts common credentials (admin/admin, tomcat/tomcat) and flags successful authentication. This is probabilistic and best-effort—it catches obvious cases but won't brute-force or use sophisticated detection.
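Signature matching of this kind can be sketched in a few lines. The `SIGNATURES` table below is hypothetical (only the Tomcat credentials come from the paragraph above), and the rule that every substring must appear in the page source is an assumption about how such a matcher could work, not a description of EyeWitness's signature format.

```python
# Hypothetical signatures: (substrings that must all appear, credential hint to report).
SIGNATURES = [
    (("Apache Tomcat", "Manager App"), "tomcat/tomcat or admin/admin"),
    (("Example Router", "Login"), "admin/admin"),  # made-up device for illustration
]

def match_default_creds(page_source, signatures=SIGNATURES):
    """Return credential hints for every signature fully contained in the page."""
    lowered = page_source.lower()
    return [
        creds
        for needles, creds in signatures
        if all(needle.lower() in lowered for needle in needles)
    ]

tomcat_page = "<title>Apache Tomcat/7.0.70</title><a href='/manager'>Manager App</a>"
print(match_default_creds(tomcat_page))
print(match_default_creds("<title>IIS Windows Server</title>"))
```

Requiring several substrings per signature reduces false positives from pages that merely mention a product name.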

Report generation outputs an HTML dashboard with thumbnail previews, allowing rapid visual scanning. The "Category" classification attempts to identify technology stacks (Jenkins, Tomcat, Citrix, etc.) by fingerprinting, which helps prioritize targets. If you're hunting for vulnerable Jenkins instances in a network of 2,000 web services, you can filter the report to just show Jenkins targets and ignore the hundreds of IIS default pages.
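The filtering described here amounts to grouping results by category. A minimal sketch, assuming each result is a dict with hypothetical `url` and `category` keys:

```python
from collections import defaultdict

def group_by_category(results):
    """Map each category to the list of URLs assigned to it."""
    groups = defaultdict(list)
    for row in results:
        # Results without a category land in a catch-all bucket.
        groups[row.get("category") or "Uncategorized"].append(row["url"])
    return dict(groups)

scan = [
    {"url": "http://10.0.0.5:8080", "category": "Jenkins"},
    {"url": "http://10.0.0.6", "category": "IIS"},
    {"url": "http://10.0.0.7:8080", "category": "Jenkins"},
    {"url": "http://10.0.0.8", "category": None},
]
grouped = group_by_category(scan)
print(grouped["Jenkins"])
```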

Gotcha

The Docker situation is legitimately frustrating. The repository has a Dockerfile that's marked experimental and frequently broken, which is surprising for a tool released in 2013 with this level of maturity. Containerization would solve the entire virtual environment activation workflow and make EyeWitness trivially portable across systems. As it stands, you need to remember to activate the venv every time: source Python/bin/activate && ./EyeWitness.py .... Forget this step and you'll get cryptic import errors. This friction makes it awkward to integrate into automated pipelines or cron jobs without wrapper scripts.
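One way to reduce this friction is a small wrapper that calls the venv's interpreter directly, since a virtual environment's `python` binary uses that environment's packages without activation. The sketch below assumes the venv lives in `Python/` inside the repository, as in the setup snippet above; `build_command` and `run_scan` are hypothetical helper names.

```python
import subprocess
from pathlib import Path

def build_command(repo_dir, eyewitness_args):
    """Build an argv that runs EyeWitness.py with the venv's interpreter."""
    repo = Path(repo_dir)
    venv_python = repo / "Python" / "bin" / "python"  # assumed venv location
    return [str(venv_python), str(repo / "EyeWitness.py"), *eyewitness_args]

def run_scan(repo_dir, eyewitness_args):
    # check=True raises CalledProcessError on a non-zero exit, which suits cron jobs.
    return subprocess.run(build_command(repo_dir, eyewitness_args), check=True)

print(build_command("/opt/EyeWitness", ["-x", "scan.xml"]))
```

Skipping activation also skips its PATH changes, so anything the script expects to find on PATH (for example a chromedriver unpacked into the venv's bin/) may need PATH set explicitly in the wrapper.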

Performance on massive URL lists (10,000+) requires tuning. The default thread count works well for 500-2,000 URLs but isn't optimal for larger sets. You'll need to experiment with --threads and --timeout values. Setting timeouts too low causes false negatives on slow-loading sites; too high and a handful of hanging sites will bottleneck your entire scan. The tool doesn't implement adaptive timeout adjustment based on response patterns, so you're manually finding the sweet spot for your target environment. Memory consumption with 20 threads running headless Chrome instances can hit 8-10GB on large scans, which limits laptop-based reconnaissance—you'll want a dedicated VM or server for serious use.
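A rough way to choose a starting value for --threads is to bound it by memory as well as CPU. The sketch below is a back-of-the-envelope helper, not part of EyeWitness: the 500 MB per-instance figure is derived from the 8-10GB at 20 threads quoted above, the 2x-cores rule and cap of 20 come from the defaults described earlier, and `pick_thread_count` is a hypothetical name.

```python
import os

def pick_thread_count(mem_budget_mb, per_instance_mb=500, cpu_count=None, hard_cap=20):
    """Return a thread count limited by CPU cores, a hard cap, and a memory budget."""
    cores = cpu_count or os.cpu_count() or 1
    by_cpu = 2 * cores                         # default rule described in the article
    by_mem = mem_budget_mb // per_instance_mb  # headless Chrome instances that fit
    return max(1, min(by_cpu, hard_cap, by_mem))

# A laptop with 8 cores and roughly 4 GB to spare for the scan.
print(pick_thread_count(4096, cpu_count=8))
```

The result is only a starting point; actual per-instance memory depends heavily on the pages being rendered, so it is worth measuring on a small sample of the target list first.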

Verdict

Use if: You're conducting penetration tests, bug bounties, or asset inventory against organizations with hundreds or thousands of web services and need visual reconnaissance integrated with Nmap/Nessus workflows. It's particularly valuable on modern Kali/Debian systems where the virtual environment isolation actually simplifies deployment compared to tools still fighting PEP 668. The resume capability makes it the best choice for unreliable network conditions or multi-day reconnaissance engagements.

Skip if: You need real-time monitoring (it's batch-only), require Docker deployment for pipeline integration (Docker support is incomplete), are doing quick ad-hoc screenshots of fewer than 50 URLs (the venv activation overhead isn't worth it; use gowitness or aquatone instead), or need sophisticated credential brute-forcing beyond default credentials (use dedicated tools like Burp Intruder or Hydra after identifying login pages with EyeWitness).
