Gepetto: Teaching IDA Pro to Think with Language Models

Hook

Reverse engineers can spend the bulk of their time simply understanding what decompiled code does, rather than finding vulnerabilities or extracting IOCs. Gepetto uses LLMs to shorten that understanding phase from hours to minutes.

Context

Reverse engineering is a cognitive endurance sport. You're staring at pseudocode generated by IDA's HexRays decompiler—functions named sub_401000, variables called v7 and a3, and logic that's technically correct but semantically opaque. Understanding a single function might require tracing through API calls, recognizing common patterns (is this parsing JSON or implementing RC4?), and keeping dozens of context threads in your head simultaneously.

Traditionally, reverse engineers built this understanding through pattern recognition earned over thousands of hours. You learn to spot string formatting operations, recognize standard library implementations, and infer intent from API call sequences. But this expertise bottleneck means junior analysts struggle with unfamiliar code, and even senior researchers waste time on boilerplate recognition. When OpenAI released GPT-3.5 in late 2022, someone inevitably asked: could language models—trained on millions of lines of documented source code—accelerate this understanding phase? Gepetto emerged as one of the first production-ready answers, turning IDA Pro into an AI-assisted analysis platform.

Technical Insight

Gepetto's architecture is elegantly simple: it's a bridge between IDA's Python API and LLM providers. When you right-click a function in IDA's decompiler view and select "Explain function," Gepetto extracts the pseudocode, constructs a prompt with reverse-engineering context, sends it to your configured LLM, and displays the response in IDA's output window. The magic isn't in complex algorithms—it's in workflow integration.

The plugin registers itself with IDA's action system, adding context menu items and keyboard shortcuts. Here's a simplified sketch of the core interaction pattern (condensed for illustration, not copied verbatim from the codebase):

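import idaapi
import ida_hexrays
import ida_kernwin

class GepettoPlugin(idaapi.plugin_t):
    def explain_function(self):
        # Ensure the Hex-Rays decompiler is loaded before calling into it
        if not ida_hexrays.init_hexrays_plugin():
            print("[Gepetto] Hex-Rays decompiler not available")
            return

        # Decompile the function under the cursor; this raises on failure
        try:
            cfunc = ida_hexrays.decompile(ida_kernwin.get_screen_ea())
        except ida_hexrays.DecompilationFailure as err:
            print(f"[Gepetto] Decompilation failed: {err}")
            return
        if cfunc is None:
            print("[Gepetto] No function at the current address")
            return

        # Extract the pseudocode as a string
        pseudocode = str(cfunc)

        # Construct prompt with RE-specific instructions
        prompt = f"""Analyze this decompiled function and explain:
- What the function does at a high level
- Notable algorithms or patterns
- Potential security concerns

{pseudocode}"""

        # Send to the configured LLM provider (model_manager is assumed
        # to be created elsewhere, e.g. during plugin initialization)
        response = self.model_manager.query(prompt)

        # Display in IDA's output window
        print(f"\n[Gepetto Analysis]\n{response}")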

What makes this powerful is HexRays pseudocode as the input format. Unlike raw assembly, decompiled C-like code preserves control flow, type information, and structural semantics—exactly what LLMs trained on source code understand. When you feed an LLM if (v7 == 0x5A4D) { ... }, it can recognize PE file magic number validation. Feed it raw assembly cmp eax, 0x5A4D; jnz loc_401234, and the pattern recognition degrades significantly.
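
To make the magic-number example concrete, here is a minimal Python sketch of the check that such pseudocode encodes. The function name looks_like_pe is hypothetical; the only fact it relies on is that a DOS/PE image starts with the bytes "MZ", which read as a little-endian 16-bit value give 0x5A4D.

```python
import struct

MZ_MAGIC = 0x5A4D  # the bytes "MZ" read as a little-endian 16-bit integer


def looks_like_pe(data: bytes) -> bool:
    """Return True if the buffer starts with the DOS 'MZ' signature."""
    if len(data) < 2:
        return False
    (magic,) = struct.unpack_from("<H", data, 0)
    return magic == MZ_MAGIC
```

Seen as source code, the comparison against a named constant is self-explanatory, and decompiled output comes much closer to that form than a bare cmp/jnz pair does.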

Gepetto supports three distinct workflows, each with specialized prompts. "Explain function" generates high-level summaries. "Rename variables" suggests semantically meaningful names based on usage patterns—transforming v7 into pe_header or socket_fd. The "Add comments" feature annotates the pseudocode inline, which Gepetto then writes back into IDA's database using idc.set_cmt() calls. This round-trip integration means the AI's insights persist in your IDB file for future sessions.
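
The rename workflow can be sketched outside IDA as two steps: validate the model's suggestions, then apply them with whole-word matching. This is an illustrative sketch, not Gepetto's implementation. It assumes the model was asked to reply with a JSON object mapping old names to new ones, and both function names are hypothetical.

```python
import json
import re


def parse_rename_suggestions(llm_response: str) -> dict:
    """Parse a JSON object of old-name -> new-name pairs, dropping unusable entries."""
    suggestions = json.loads(llm_response)
    valid = {}
    for old, new in suggestions.items():
        # Keep only suggestions that are valid C identifiers and change the name
        if isinstance(new, str) and re.fullmatch(r"[A-Za-z_]\w*", new) and new != old:
            valid[old] = new
    return valid


def preview_renames(pseudocode: str, renames: dict) -> str:
    """Apply renames to a pseudocode string using whole-word matching (preview only)."""
    if not renames:
        return pseudocode
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, renames)) + r")\b")
    return pattern.sub(lambda m: renames[m.group(1)], pseudocode)
```

Validating before applying matters because a model may return names containing spaces or punctuation. Inside IDA, the accepted names would then be written to the database through the Hex-Rays API (for example, something like ida_hexrays.rename_lvar) rather than by editing text.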

The model manager abstraction is particularly clever. Rather than hardcoding OpenAI's API, Gepetto implements a provider interface supporting OpenAI, Anthropic, Google, Mistral, OpenRouter, and local models via Ollama or LM Studio. Configuration lives in an INI file. Section and key names can differ between versions, so treat the following as illustrative:

[ModelProvider]
provider = ollama
model = llama3.2:latest
api_key = not_needed_for_local

[General]
language = en_US
max_tokens = 2048

This design choice reflects a key insight: reverse engineers often work with sensitive binaries (malware samples, proprietary firmware) where sending code to OpenAI violates security policies. Local model support via Ollama means you can run Gepetto entirely offline with models like CodeLlama or DeepSeek Coder, trading some accuracy for complete data control.
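
A provider abstraction of this kind can be sketched in a few lines. This is a hedged illustration rather than Gepetto's actual class hierarchy: Provider, EchoProvider, PROVIDERS, and load_provider are hypothetical names, and EchoProvider is a stand-in backend that makes no network calls. The config layout follows the illustrative INI shown earlier.

```python
import configparser
from abc import ABC, abstractmethod


class Provider(ABC):
    """Hypothetical provider interface: every backend exposes query()."""

    def __init__(self, model: str, api_key: str = ""):
        self.model = model
        self.api_key = api_key

    @abstractmethod
    def query(self, prompt: str) -> str:
        ...


class EchoProvider(Provider):
    """Stand-in backend for illustration; a real one would call an API."""

    def query(self, prompt: str) -> str:
        return f"[{self.model}] received {len(prompt)} characters"


# Registry mapping the provider name in the config to a class
PROVIDERS = {"echo": EchoProvider}


def load_provider(config_text: str) -> Provider:
    """Build a provider from INI text with a [ModelProvider] section."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["ModelProvider"]
    name = section["provider"]
    if name not in PROVIDERS:
        raise ValueError(f"unknown provider: {name}")
    return PROVIDERS[name](section["model"], section.get("api_key", ""))
```

Because the calling code only ever sees query(), switching from a hosted model to a local one becomes a config change rather than a code change.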

The internationalization support—with translations for English, French, Russian, and Korean—signals that this tool has seen real-world use across security teams. Kaspersky initially sponsored development, and French cybersecurity firm HarfangLab currently maintains it. When enterprise security companies fund open-source RE tools, it usually means the tool solved a genuine productivity problem.

Gotcha

The most obvious limitation is the IDA Pro barrier to entry. Gepetto requires not just IDA Pro (starting at $1,879 for a named license) but specifically the HexRays decompiler add-on (another $2,709). That's roughly $4,600 in tooling before Gepetto adds any value. If you're using Ghidra or Binary Ninja, you'll need alternative plugins like GhidrAI or BinjaGPT. This isn't Gepetto's fault, since it's tightly coupled to IDA's plugin API, but it does limit the potential audience to professional reverse engineers at companies with tool budgets.

The more subtle issue is LLM hallucination in a high-stakes context. When ChatGPT invents a plausible-sounding but incorrect function explanation, and you base your malware analysis report on that hallucination, the consequences range from wasted time to incorrect threat attribution. The Gepetto README explicitly warns users to "stay critical of the model's output," but in practice, confirmation bias is real. If an LLM confidently explains that a function implements AES encryption when it's actually RC4, you might not catch the error until much later in your analysis. Traditional reverse engineering forces you to trace every step; AI assistance can make you lazy.
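
One inexpensive way to stay critical is to cross-check a model's claim against independent evidence. As a hedged sketch of the AES-versus-RC4 case: table-based AES implementations usually embed the standard S-box, whose first bytes are 63 7c 77 7b f2 6b 6f c5, whereas RC4 builds its state table at runtime and carries no such constant. The function name contains_aes_sbox is hypothetical.

```python
# First eight bytes of the standard AES S-box
AES_SBOX_PREFIX = bytes([0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5])


def contains_aes_sbox(blob: bytes) -> bool:
    """Return True if the buffer contains the start of the AES S-box."""
    return AES_SBOX_PREFIX in blob
```

A hit supports the "AES" explanation; a miss is weaker evidence, because implementations that use hardware AES instructions or compute the S-box on the fly will not contain the table. Either way, the result is a second opinion that does not depend on the model.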

API costs and rate limits add friction. OpenAI's GPT-4 costs $0.03 per 1K tokens for prompts—a complex function might cost $0.15-0.50 to analyze. Over a workday analyzing dozens of functions, costs accumulate. Claude and Gemini have different pricing structures and rate limits. Local models avoid costs but require GPU resources and produce lower-quality output. There's no perfect solution, just tradeoffs to manage based on your budget and requirements.
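
The arithmetic behind those figures is simple to sketch. The default rate below is the $0.03 per 1K prompt tokens quoted above; the token counts and the 40-functions-per-day figure are illustrative assumptions, and response tokens, which are billed separately, are ignored.

```python
def estimate_prompt_cost(prompt_tokens: int, usd_per_1k_tokens: float = 0.03) -> float:
    """Estimate the prompt-side cost in USD for a single request."""
    return round(prompt_tokens / 1000 * usd_per_1k_tokens, 4)


# A 5,000-token prompt sits at the low end of the quoted range (about $0.15)
per_function = estimate_prompt_cost(5_000)

# Assumed workload: 40 such functions in a day comes to about $6
per_day = 40 * per_function
```

Working backwards, the $0.50 upper bound corresponds to a prompt of roughly 16,700 tokens, which is a very large decompiled function.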

Verdict

Use if: You're a professional reverse engineer with IDA Pro + HexRays who regularly encounters unfamiliar binaries (malware analysis, vulnerability research, proprietary protocol reverse engineering). Gepetto excels at the initial reconnaissance phase: quickly generating hypotheses about what functions do, recognizing common algorithms, and suggesting variable names. It's particularly valuable when context-switching between projects or onboarding to new codebases, where you need to build mental models fast. If your organization has data sensitivity concerns, the Ollama integration makes local deployment viable.

Skip if: You're using Ghidra or Binary Ninja (check their respective AI plugins instead), working with familiar codebases where you already understand the context, or doing forensics work where you cannot risk LLM hallucinations influencing your findings. Also skip if you're a hobbyist without an IDA license: the roughly $4,600 tool investment makes Gepetto economically irrational unless you're billing clients or employed by a security firm. For learning reverse engineering fundamentals, traditional manual analysis builds skills better than AI-assisted shortcuts.
