Deobfuscating JavaScript with LLMs: How Humanify Separates Structural and Semantic Transformations

Hook

A minified JavaScript file can cost you $0.50 in API fees to deobfuscate—but that might be cheaper than the hours you’d spend manually renaming thousands of variables like ‘a’, ‘b’, and ‘zA3$’.

Context

Every developer has encountered obfuscated JavaScript code. Whether you’re debugging a production build without source maps, analyzing a third-party library, or reverse-engineering legacy code, minified JavaScript is a wall between you and understanding. Traditional prettifiers like js-beautify can restore indentation and whitespace, but they leave you staring at functions named ‘e’, ‘t’, and ‘n’—syntactically valid but semantically opaque.

The challenge isn’t just readability; it’s comprehension. You need to understand what ‘function r(e,t)’ actually does, and renaming it to something meaningful requires reading through the implementation, inferring intent, and manually refactoring. For a large bundle, this is hours or days of cognitive load. LLMs promised to automate this semantic understanding, but early attempts either hallucinated code changes (breaking functionality) or were tightly coupled to a single API provider. Humanify emerged as a production-ready solution that separates concerns: AST transformations handle structural changes with guaranteed correctness, while LLMs provide intelligent suggestions for the only truly subjective part—naming.

Technical Insight

[Figure: System architecture (auto-generated). Phase 1, deterministic: Obfuscated JS Input → Webcrack Unpacker → Unpacked Bundle → Babel AST Transformer (Unminify & Unwrap) → Structurally Clean Code. Phase 2, AI-powered: LLM Provider Selection → OpenAI API (ChatGPT), Gemini API, or local Llama model → Rename Suggestions → AST Renaming → Human-Readable Output.]

Humanify’s architecture is built on a critical insight: not all deobfuscation tasks require AI. The tool uses a two-phase pipeline where Babel handles deterministic transformations at the Abstract Syntax Tree level, then LLMs tackle the non-deterministic problem of semantic naming.

In the first phase, Babel performs structural transformations that are provably correct. This includes unminifying code (expanding single-line statements), unwrapping expressions (extracting nested function calls into intermediate variables), and integrating Webcrack to unpack webpack bundles. These transformations preserve one-to-one semantic equivalence: the code behaves identically before and after. Here’s what that looks like in practice:

// Before: Minified webpack bundle
function(e,t,n){var r=n(3),o=n(7);e.exports=function(e){return r(o(e))}}

// After Babel transformations (before LLM)
function(module, exports, require) {
  var dependency1 = require(3);
  var dependency2 = require(7);
  module.exports = function(input) {
    var processed = dependency2(input);
    return dependency1(processed);
  }
}

Notice that Babel hasn’t guessed at semantics: it has simply unwrapped the nested calls and expanded the structure. The variables are still generic (‘dependency1’, ‘input’), but the structure is now legible.

The second phase sends this structurally-clean code to an LLM with carefully crafted prompts. The LLM analyzes usage patterns, control flow, and context to suggest meaningful names. Critically, the LLM doesn’t rewrite code—it only provides a renaming map. Humanify validates these suggestions and applies them via AST manipulation, ensuring no logic changes:

// After LLM renaming suggestions applied
function(module, exports, require) {
  var validateInput = require(3);
  var sanitizeString = require(7);
  module.exports = function(userInput) {
    var sanitized = sanitizeString(userInput);
    return validateInput(sanitized);
  }
}
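To make the "renaming map, not rewrite" contract concrete, here is a deliberately simplified sketch of applying such a map. Humanify itself does this through Babel's scope-aware AST renaming; this token-boundary version is only an illustration (it is not safe for shadowed variables or `$`-containing names), and the function name and validation rule are assumptions, not Humanify's actual API:

```javascript
// Simplified sketch of applying an LLM-provided rename map.
// Humanify does this with Babel's scope-aware AST renaming; this
// token-boundary version only illustrates the contract and is NOT
// safe for shadowed variables or identifiers containing "$".
function applyRenameMap(source, renameMap) {
  // Reject suggestions that are not plain JS identifiers, so a bad
  // LLM response can never inject code into the output.
  const isValidIdentifier = (name) => /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name);

  let result = source;
  for (const [oldName, newName] of Object.entries(renameMap)) {
    if (!isValidIdentifier(newName)) continue; // skip unsafe suggestions
    // \b suffices for word-character names; real AST renaming needs none
    // of this because it renames bindings, not text.
    const pattern = new RegExp(`\\b${oldName}\\b`, "g");
    result = result.replace(pattern, newName);
  }
  return result;
}

const renamed = applyRenameMap("var r = o(e);", {
  r: "validated",
  o: "sanitize",
  e: "input",
});
console.log(renamed); // "var validated = sanitize(input);"
```

The key property survives even in this toy version: the LLM's output is reduced to a data structure that is validated before touching the code, so a hallucinating model can change names but never logic.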

Humanify supports three LLM backends: OpenAI’s GPT models via API, Google’s Gemini, and local inference using llama.cpp with downloadable models. The local mode is particularly interesting for large codebases. While less accurate than cloud APIs, it eliminates per-token costs and keeps proprietary code off external servers. The tool includes native Apple Silicon GPU support via llama.cpp, making local inference viable on modern Macs.
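Because all three backends answer the same question (identifiers in, name suggestions out), they can sit behind one interface. The sketch below is hypothetical, not Humanify's internal API; the class and method names are invented, and the "model" is a trivial stand-in where a real backend would call OpenAI, Gemini, or llama.cpp:

```javascript
// Hypothetical sketch of a pluggable rename-provider interface.
// The names here are invented for illustration; Humanify's real
// internals differ. The contract is the same for all backends:
// identifiers in, rename map out.
class LocalEchoProvider {
  // Stand-in "model": a real provider would call OpenAI, Gemini,
  // or a local llama.cpp instance here.
  async suggestNames(identifiers) {
    const map = {};
    identifiers.forEach((id, i) => {
      map[id] = `renamed${i + 1}`;
    });
    return map;
  }
}

async function suggestRenames(provider, identifiers) {
  // The provider only returns names; the caller applies them via
  // AST manipulation, keeping correctness independent of the model.
  return provider.suggestNames(identifiers);
}

suggestRenames(new LocalEchoProvider(), ["e", "t", "n"]).then((map) =>
  console.log(map)
);
```

Swapping cloud for local inference then becomes a constructor change rather than a pipeline change, which is what makes the cost/privacy trade-off a runtime decision.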

The prompt engineering is worth examining. Humanify doesn’t just dump code into the LLM—it provides structured context about variable usage, scope, and relationships. For each identifier, it extracts where it’s defined, how it’s called, what it returns, and what operations it performs. This contextual information dramatically improves naming quality compared to naive prompting.
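The kind of per-identifier context described above can be sketched in a few lines. Humanify derives this from Babel's scope analysis; the regex-based version below is a simplified stand-in (the function name and the exact fields are assumptions) that just shows the sort of signal that helps a model pick a name:

```javascript
// Simplified sketch of gathering per-identifier context for a prompt.
// Humanify derives this from Babel's scope analysis; this regex
// version only illustrates the kind of signal that improves naming.
function gatherContext(source, identifier) {
  const lines = source.split("\n");
  const usage = new RegExp(`\\b${identifier}\\b`);
  return {
    identifier,
    // Every line the identifier appears on: definitions, calls, returns.
    occurrences: lines.filter((line) => usage.test(line)),
    // Whether it is ever invoked as a function — a strong naming hint.
    isCalled: new RegExp(`\\b${identifier}\\s*\\(`).test(source),
  };
}

const src = "var r = n(3);\nreturn r(o(e));";
console.log(gatherContext(src, "r"));
// occurrences lists both lines; isCalled is true
```

Feeding the model this structured summary, rather than raw code alone, is what the article means by "carefully crafted prompts": the model sees how an identifier is defined and used, not just that it exists.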

Version 2 of Humanify removed the Python dependency that plagued earlier versions, moving to a pure TypeScript implementation. This eliminated installation friction and made the tool truly cross-platform. The codebase now includes comprehensive tests that verify both the Babel transformations and the LLM integration layer, catching regressions before they ship.

The token cost model is transparent: approximately 2 tokens per input character. For bootstrap.min.js (roughly 120KB), that’s about 240,000 tokens or ~$0.50 with GPT-3.5. Humanify batches requests intelligently and caches results, but for truly massive bundles (multi-megabyte vendor files), local mode becomes economically necessary.
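The article's rule of thumb turns into a one-line estimator. The per-million-token price below is an assumption for illustration; check your provider's current pricing before budgeting:

```javascript
// Quick cost estimate using the article's rule of thumb of ~2 tokens
// per input character. The price per million tokens is an assumption;
// substitute your provider's current pricing.
function estimateCostUSD(fileSizeBytes, pricePerMillionTokens) {
  const TOKENS_PER_CHAR = 2; // heuristic from the article
  const tokens = fileSizeBytes * TOKENS_PER_CHAR;
  return (tokens / 1_000_000) * pricePerMillionTokens;
}

// bootstrap.min.js at ~120 KB with a hypothetical $2/M-token price:
const cost = estimateCostUSD(120_000, 2);
console.log(cost.toFixed(2)); // "0.48"
```

Running this against a multi-megabyte vendor bundle before you start is the easiest way to decide up front whether cloud pricing is acceptable or local mode is the economical choice.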

Gotcha

The biggest limitation is cost predictability with cloud APIs. While $0.50 for a medium file sounds reasonable, a sprawling vendor bundle can easily consume $5-10 in API credits. There’s no way to cap spending mid-run, so you might discover the cost after processing completes. Local mode solves this but introduces accuracy trade-offs—expect more generic names like ‘data’ or ‘result’ instead of the nuanced suggestions cloud models provide.

Performance on CPU-only systems in local mode is genuinely painful. Without GPU acceleration, processing even modest files can take minutes. The documentation warns about this, but the reality is harsher than expected—a 50KB file might take 5-10 minutes on an older Intel laptop. If you’re on hardware without Metal (macOS) or CUDA support, budget significant time or stick to cloud APIs.

Additionally, LLMs are non-deterministic by nature. Running the same file twice can produce different variable names. For some use cases (one-time reverse engineering), this is fine. For others (reproducible builds, diffs in version control), it’s disqualifying. There’s no deterministic mode that locks down the random seed across different API providers.

Verdict

Use if: You’re reverse-engineering minified production code, debugging without source maps, or analyzing third-party libraries where understanding intent matters more than perfect naming. Choose cloud APIs (OpenAI/Gemini) for best results on files under 200KB where $1-2 is acceptable. Choose local mode when processing proprietary code that can’t leave your infrastructure, working with massive bundles where API costs would exceed $10, or when you need unlimited reruns for experimentation.

Skip if: You only need basic prettification (Prettier is free and instant), you require deterministic output for CI/CD or version-controlled diffs (LLM variance will cause churn), you’re on CPU-only hardware without budget for cloud APIs (the experience will be frustrating), or you’re working with heavily obfuscated code using string encryption or control flow flattening (Humanify handles minification, not adversarial obfuscation).

For webpack-specific bundles that don’t need AI renaming, use Webcrack directly—it’s what Humanify uses internally anyway.
