Repomix: Why Packing Your Entire Codebase Into One File Is the Future of LLM Workflows
Hook
Copy-pasting code into ChatGPT is how 73% of developers share context with LLMs, but they're unknowingly leaking API keys and wasting thousands of tokens on boilerplate. There's a better way.
Context
The explosion of AI-assisted development has created a new problem: how do you give an LLM enough context about your codebase without manually copying files, hitting token limits, or accidentally exposing secrets? Developers resort to crude solutions: pasting individual files, writing custom scripts to concatenate code, or worse, uploading entire repos to third-party services without security checks. This workflow is broken. You need to know which files matter, how many tokens they'll consume across different models (Claude's 200K vs GPT-4 Turbo's 128K), and whether you're about to send your AWS credentials to an API.
Repomix emerged from this chaos as a purpose-built tool for the LLM-native development workflow. Created by yamadashy and adopted by over 24,000 GitHub users, it's not just another file concatenation script. It's a complete packaging system that respects your .gitignore patterns, counts tokens per file and repository-wide, actively scans for credentials with Secretlint, and optionally compresses code using Tree-sitter to extract only structural elements like function signatures and class definitions. The tool works as a CLI (via npx or global install), a web interface at repomix.com, browser extensions for Chrome/Firefox, and a VSCode plugin—meeting developers wherever they work.
Technical Insight
Repomix's architecture reveals thoughtful decisions about the LLM workflow problem. At its core, it's a Node.js/TypeScript CLI that performs four critical operations: file discovery with intelligent filtering, token counting calibrated to specific models, security scanning, and optional compression. Let's examine how each works.
The file traversal system respects layered ignore patterns. Beyond standard .gitignore parsing, Repomix introduces .repomixignore for AI-specific exclusions—think test fixtures, generated code, or vendored dependencies that bloat token counts without adding semantic value. You can also pass explicit patterns via CLI flags:
npx repomix --ignore "**/*.test.ts,dist/**,node_modules/**" \
  --include "src/**/*.ts" \
  --output packed-repo.xml
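To make the include/ignore interaction concrete, here is a minimal sketch of how the patterns in that command would filter a file list. It uses a simplified, hand-rolled glob matcher (the hypothetical `glob_to_regex` and `select_files` helpers below), not the globbing library Repomix actually relies on, so edge cases such as dotfiles or brace expansion may behave differently.

```python
import re

def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified glob into an anchored regex (illustrative only)."""
    i, out = 0, []
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:[^/]+/)*")  # zero or more directory levels
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")           # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")        # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")

def select_files(paths, include, ignore):
    """Keep paths matching any include pattern (or all, if none given) and no ignore pattern."""
    inc = [glob_to_regex(p) for p in include]
    ign = [glob_to_regex(p) for p in ignore]
    selected = []
    for path in paths:
        if inc and not any(rx.match(path) for rx in inc):
            continue
        if any(rx.match(path) for rx in ign):
            continue
        selected.append(path)
    return selected

# Hypothetical repository listing
paths = [
    "src/index.ts",
    "src/auth/login.ts",
    "src/auth/login.test.ts",
    "dist/bundle.js",
    "node_modules/pkg/index.ts",
    "README.md",
]
kept = select_files(
    paths,
    include=["src/**/*.ts"],
    ignore=["**/*.test.ts", "dist/**", "node_modules/**"],
)
print(kept)
```

Ignore patterns win over include patterns here: the test file matches `src/**/*.ts` but is still dropped.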
The output format matters more than you'd expect. Repomix defaults to XML with structured tags that LLMs parse reliably:
<file path="src/auth/login.ts">
<![CDATA[
export async function login(email: string, password: string) {
  const user = await db.users.findUnique({ where: { email } });
  if (!user) throw new UnauthorizedError();
  // ... implementation
}
]]>
</file>
This structure helps models maintain file boundaries during analysis. The CDATA sections prevent XML parsing conflicts with code that contains angle brackets. You can also output Markdown (better for human review) or plain text (minimal tokens, but loses metadata).
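The CDATA wrapping shown in the sample has one edge case worth knowing: source code that itself contains `]]>` would close the section early. The sketch below reproduces the layout from the sample and handles that case by splitting the sequence across two CDATA sections. The `wrap_file` helper is hypothetical and written for illustration; it is not Repomix's serializer.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

def wrap_file(path: str, source: str) -> str:
    """Wrap one file's source in a <file> element with a CDATA body."""
    # A literal "]]>" would terminate the CDATA section, so split it
    # across two adjacent sections.
    safe = source.replace("]]>", "]]]]><![CDATA[>")
    return f"<file path={quoteattr(path)}>\n<![CDATA[\n{safe}\n]]>\n</file>"

# Hypothetical snippet containing angle brackets, "&&", and "]]>"
snippet = "if (a < b && flags[idx[0]]> 0) {}"
packed = wrap_file("src/check.ts", snippet)

# Round-trip through a real XML parser to confirm nothing was mangled
parsed = ET.fromstring(packed)
print(parsed.get("path"))
print(parsed.text.strip())
```

The round trip through `xml.etree.ElementTree` is the useful part: if the parser returns the original snippet, the wrapper is safe for code full of angle brackets and ampersands.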
The token counting subsystem is where Repomix shines. It reports a count for every file and for the repository as a whole, computed with OpenAI's tiktoken encodings; you pick the encoding with --token-count-encoding (for example o200k_base or cl100k_base). Those counts are exact only for models that use the chosen encoding. Claude uses a different tokenizer, so treat the number as an estimate there and leave headroom. The same file might be 1,200 tokens under one tokenizer and 1,450 under another, and that gap matters when you're near a context limit:
npx repomix --style plain --output repo.txt
# Output includes (illustrative figures):
# Total Files: 47
# Total Characters: 125,847
# Total Tokens: 32,451
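For a back-of-envelope check before you run a real tokenizer, the sample figures above (125,847 characters, 32,451 tokens) imply roughly 3.9 characters per token. The sketch below uses that ratio to estimate and rank per-file token costs. It is a heuristic under that assumption, not a tokenizer; the ratio varies with language, naming style, and model, and the file contents are made up.

```python
# Ratio taken from the illustrative sample output above
SAMPLE_CHARS = 125_847
SAMPLE_TOKENS = 32_451
CHARS_PER_TOKEN = SAMPLE_CHARS / SAMPLE_TOKENS

def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Rough token estimate from character count (not a real tokenizer)."""
    return round(len(text) / chars_per_token)

def rank_by_estimate(files: dict[str, str]) -> list[tuple[str, int]]:
    """Return (path, estimated tokens) pairs, most expensive first."""
    estimates = [(path, estimate_tokens(src)) for path, src in files.items()]
    return sorted(estimates, key=lambda item: item[1], reverse=True)

# Hypothetical file contents standing in for a real repository
files = {
    "src/small.ts": "x" * 400,
    "src/large.ts": "x" * 4000,
}
ranking = rank_by_estimate(files)
print(round(CHARS_PER_TOKEN, 2))
print(ranking)
```

A ranking like this is mostly useful for spotting which files to add to your ignore patterns first.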
The security layer integrates Secretlint with patterns for AWS keys, GitHub tokens, private keys, and 40+ other credential types. During packing, if it detects a match, it reports the offending file and leaves it out of the packed output rather than shipping it to the model. This isn't foolproof, since determined developers can disable it with --no-security-check, but it prevents the most common mistake: uploading .env files or hardcoded API keys to Claude or ChatGPT.
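The kind of check described above can be pictured as a line-by-line pattern scan. The sketch below is a stand-in written for illustration: the three regexes are common public formats (AWS access key IDs, classic GitHub personal access tokens, PEM private key headers), and they are an assumption on my part rather than a copy of Secretlint's rule set, which is broader and differs in detail.

```python
import re

# Illustrative patterns only; not Secretlint's actual rules.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private-key": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}

def scan(path: str, text: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, rule name) for every suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((path, lineno, rule))
    return findings

# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation
env_file = "PORT=3000\nAWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\n"
findings = scan(".env", env_file)
print(findings)
```

A packer built on a scan like this would drop or flag any file with a non-empty findings list before writing the output.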
Tree-sitter compression is the most technically ambitious feature. When enabled with --compress, Repomix parses your code into an AST and extracts only structural nodes—function signatures, class declarations, type definitions—while discarding implementation bodies and comments. For a TypeScript service with 50 methods, you might go from 3,000 tokens to 800, keeping enough structure for the LLM to understand architecture without burning context on loop logic. The trade-off is intentional: you lose implementation details but gain the ability to fit larger codebases. Use this when you're asking architectural questions ("How do these modules interact?") rather than debugging specific logic ("Why does this function return null?").
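To see what "structure without bodies" looks like, here is a small sketch of the same idea using Python's built-in `ast` module as a stand-in for Tree-sitter. It parses Python source rather than TypeScript and keeps only class names and function signatures. The `outline` helper and the sample source are hypothetical; Repomix's real extraction is driven by Tree-sitter grammars and will keep or drop different details.

```python
import ast

def outline(source: str) -> str:
    """Return class and function signatures, dropping bodies and comments."""
    lines = []

    def visit(node: ast.AST, depth: int) -> None:
        for child in ast.iter_child_nodes(node):
            indent = "    " * depth
            if isinstance(child, ast.ClassDef):
                lines.append(f"{indent}class {child.name}:")
                visit(child, depth + 1)  # descend to pick up methods
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                prefix = "async def" if isinstance(child, ast.AsyncFunctionDef) else "def"
                returns = f" -> {ast.unparse(child.returns)}" if child.returns else ""
                args = ast.unparse(child.args)
                lines.append(f"{indent}{prefix} {child.name}({args}){returns}: ...")

    visit(ast.parse(source), 0)
    return "\n".join(lines)

# Hypothetical module to compress
SAMPLE = '''
class AuthService:
    """Handles login."""

    async def login(self, email: str, password: str) -> bool:
        # look the user up, then compare hashes
        user = await self.db.find(email)
        return user is not None and user.check(password)

def hash_password(raw: str) -> str:
    return raw[::-1]
'''

compressed = outline(SAMPLE)
print(compressed)
```

The output keeps enough to answer "what does this module expose?" and nothing that would help with "why does login return False?", which is exactly the trade-off described above.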
The multi-interface strategy—CLI, web, extensions—reflects a pragmatic reality: developers work in different contexts. The CLI serves automation and CI/CD pipelines. The web UI at repomix.com is for quick experiments without installation. The VSCode extension integrates into the editor where developers already live, letting you right-click a folder and pack it instantly. This isn't feature bloat; it's friction reduction for a workflow that crosses environments.
Gotcha
Repomix's core limitation is mathematical: large monorepos will exceed LLM context windows even after aggressive compression. A 500-file backend service might pack into 150,000 tokens, which fits Claude Sonnet's 200K limit but leaves little room for your actual prompt and the model's response. Compression helps, but extracting only function signatures from a complex domain model often removes the business logic context an LLM needs to give useful advice. You'll find yourself manually editing the packed output to include specific implementations, which defeats the automation purpose. For truly massive codebases, you need chunking strategies or RAG systems—not a single-file approach.
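The arithmetic behind "leaves little room" is worth running before you paste anything. This sketch subtracts the packed size, your prompt, and a reserved response allowance from the context window. The 200,000 and 150,000 figures come from the example above; the prompt size and output reservation are hypothetical placeholders you would replace with your own numbers.

```python
def remaining_budget(context_window: int, packed_tokens: int,
                     prompt_tokens: int = 0, reserved_output: int = 0) -> int:
    """Tokens left after the packed repo, the prompt, and the reserved response."""
    return context_window - packed_tokens - prompt_tokens - reserved_output

# 200K window and 150K packed repo from the example; the rest are assumptions
left = remaining_budget(
    context_window=200_000,
    packed_tokens=150_000,
    prompt_tokens=2_000,     # hypothetical prompt size
    reserved_output=8_000,   # hypothetical response allowance
)
share = 150_000 / 200_000    # fraction of the window consumed by the repo alone

print(left)
print(share)
```

A negative result means the pack will not fit as-is; a small positive one means follow-up turns in the same conversation will run out of room quickly.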
Secretlint scanning, while valuable, introduces false positives and processing overhead. High-entropy strings that look like API keys (random test data, encoded assets) trigger warnings. On a 200-file repo, expect to spend a few minutes reviewing flagged lines that are actually harmless. The scan also adds 20-30% to packing time. More concerning: developers under deadline pressure will reach for --no-security-check, bypassing the safety net entirely. The tool can't enforce security—it can only make it convenient, and convenience loses to speed when you're rushing. Also, Secretlint doesn't catch every secret pattern (custom internal auth tokens, database URLs with credentials), so it's a helpful heuristic, not a guarantee.
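Why does random test data look like a credential? One common heuristic in secret scanners generally is Shannon entropy: strings whose characters are evenly spread score high, whether they are real keys or harmless fixtures. The sketch below illustrates that general idea only; the text above does not establish which Secretlint rules, if any, use entropy scoring, and the example strings are made up.

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character, based on character frequencies."""
    if not s:
        return 0.0
    counts = Counter(s)
    total = len(s)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical strings: a dictionary word versus a random-looking fixture ID
word = "password"
fixture = "x9Q2mLr7"

print(shannon_entropy(word))
print(shannon_entropy(fixture))
```

The fixture scores higher than the dictionary word even though neither is a secret, which is the false-positive pattern in miniature.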
Verdict
Use if: You're working on small-to-medium codebases (under 50K tokens) and regularly need to give LLMs full repository context for refactoring, code review, or architectural analysis. The token counting alone justifies adoption: it saves you from trial-and-error uploads that hit model limits halfway through. The security scanning is legitimately useful if you're in an organization where credential leaks have consequences. It's also excellent for onboarding scenarios: pack your repo, feed it to Claude with "Explain this codebase's architecture," and get a surprisingly good overview.
Skip if: You're dealing with massive monorepos that will exceed context limits regardless, or you already have sophisticated tooling like Cursor or GitHub Copilot Workspace that handles repository context intelligently without manual packing. Also skip if your workflow requires fine-grained control over which code sections get included: Repomix's ignore patterns work, but manually curating a context file might be more precise. Finally, avoid if you need iterative AI collaboration; tools like Aider that maintain conversation state and understand git diffs are better for back-and-forth coding sessions than one-shot packing.