lean-ctx: Building a Context Layer for AI Coding Tools with Rust

Hook

Your AI coding assistant reads entire files to answer simple questions, burning through context windows and dollars. One developer watched their Cursor session consume 2.4M tokens reading minified JSON, build logs, and node_modules—and built a Rust-powered context layer to fix it.

Context

AI coding assistants like Cursor, Claude Code, and GitHub Copilot have transformed how we write code, but they share a fundamental problem: they're context gluttons. Ask Copilot about a function signature, and it might read the entire 5,000-line file. Run a CLI command through Aider, and it feeds the raw output—verbose logs, stack traces, ANSI codes and all—straight into the LLM's context window. The result? Ballooning token costs, degraded performance as context windows fill, and zero visibility into what these tools actually consume.

Developers have worked around this with manual .cursorrules files, custom prompts, and careful file organization. But these are Band-Aids. What's missing is a proper context layer—something that sits between your codebase and the AI, intelligently compressing what gets sent, tracking every token, and enforcing governance policies. lean-ctx is that layer: a Rust binary implementing the Model Context Protocol (MCP) with 59 tools, 10 read modes, and deep integration into shell environments. It's not another AI assistant; it's infrastructure for the ones you already use.

Technical Insight

lean-ctx's architecture centers on interception and transformation. It operates as both an MCP server—exposing tools that AI assistants call for file operations—and as shell hooks that capture CLI output before it reaches the LLM. The elegance is in how it decides what to compress and what to preserve.

The file reading system uses Tree-sitter for AST-aware parsing across 21 languages. Instead of a single read operation, lean-ctx offers 10 modes. The map mode returns only function and class signatures with location information. The signatures mode strips implementation details entirely, returning type information and interfaces. The diff mode shows only changed regions with surrounding context. Here's what the tool returns when you ask for a Python file in map mode versus full mode:

// Map mode output - ~47 tokens for a 500-line file
{
  "type": "map",
  "symbols": [
    {"kind": "class", "name": "TokenTracker", "line": 12, "children": [
      {"kind": "method", "name": "__init__", "line": 13},
      {"kind": "method", "name": "record_usage", "line": 18},
      {"kind": "method", "name": "get_stats", "line": 34}
    ]},
    {"kind": "function", "name": "compress_context", "line": 89},
    {"kind": "function", "name": "verify_proof", "line": 142}
  ]
}

// Full mode - 12,000+ tokens for the same file
// (entire source code)
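
To make the map idea concrete, here is a minimal sketch of how a signatures-only outline could be produced. It swaps in Python's standard `ast` module in place of Tree-sitter, so it handles Python only, and the function name `build_map` and the sample source are hypothetical. The output shape simply mirrors the JSON above; it is not lean-ctx's actual implementation.

```python
import ast
import json


def build_map(source: str) -> dict:
    """Return a signatures-only outline of Python source (assumed 'map' shape)."""
    def describe(node):
        if isinstance(node, ast.ClassDef):
            methods = [describe(child) for child in node.body
                       if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef))]
            return {"kind": "class", "name": node.name, "line": node.lineno,
                    # Functions nested directly in a class are reported as methods.
                    "children": [dict(m, kind="method") for m in methods]}
        return {"kind": "function", "name": node.name, "line": node.lineno}

    tree = ast.parse(source)
    # Only top-level classes and functions are listed; bodies are never copied.
    symbols = [describe(node) for node in tree.body
               if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))]
    return {"type": "map", "symbols": symbols}


# Hypothetical sample file, kept tiny for illustration.
SAMPLE = '''\
class TokenTracker:
    def __init__(self):
        self.total = 0

    def record_usage(self, n):
        self.total += n

def compress_context(text):
    return text.strip()
'''

outline = build_map(SAMPLE)
print(json.dumps(outline, indent=2))
```

The outline grows with the number of symbols rather than the number of lines, which is where the savings over a full read come from.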

When an AI assistant requests read_file("tracker.py") through MCP, lean-ctx's decision engine looks at the query context, recent access patterns, and configured policies to automatically select the appropriate mode. If you're asking "what methods does TokenTracker have?", map mode suffices. If you're debugging a specific function, it switches to focused mode showing just that function with full implementation.
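
The selection logic described above might look something like the following sketch. Everything here is an assumption made for illustration: the function `select_mode`, its parameters, and the keyword heuristics are hypothetical, and the real decision engine also weighs access patterns and configured policies that this toy version ignores.

```python
import re
from typing import Optional


def select_mode(query: str, changed_recently: bool = False,
                known_symbols: Optional[list] = None) -> tuple:
    """Pick a read mode from the query text (illustrative heuristic only).

    Returns (mode, focus_symbol); focus_symbol is None unless mode is 'focused'.
    """
    q = query.lower()
    # A query naming a known symbol alongside a debugging word wants the body.
    for symbol in known_symbols or []:
        if symbol.lower() in q and re.search(r"\b(debug|fix|why|bug|fails?)\b", q):
            return ("focused", symbol)
    # Structural questions only need the outline.
    if re.search(r"\b(what|which|list)\b.*\b(methods?|functions?|classes)\b", q):
        return ("map", None)
    # A file that just changed is best summarized by its changed regions.
    if changed_recently:
        return ("diff", None)
    return ("full", None)


symbols = ["TokenTracker", "record_usage"]
print(select_mode("what methods does TokenTracker have?", known_symbols=symbols))
print(select_mode("why does record_usage fail on negative counts?", known_symbols=symbols))
```

Falling back to a full read when nothing matches is the conservative choice: a wrong guess then costs tokens rather than correctness.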

The shell output compression is where things get surgical. lean-ctx hooks into your shell (bash, zsh, fish) and intercepts command output before it reaches tools like Aider or Claude Code that operate in terminal environments. It maintains 56 pattern modules—essentially compression rules for specific CLI tools. The npm install pattern, for example, strips progress bars, peer dependency warnings, and audit reports, keeping only the final success message and any actual errors. The git log pattern condenses commit metadata while preserving SHAs and messages:

// Raw git log output: ~850 tokens
commit a3f2d9b8c1e4f6a8d9b2c3e4f5a6b7c8d9e0f1a2
Author: Alice Dev <alice@example.com>
Date:   Mon Jan 15 14:32:18 2024 -0800

    feat: add token compression engine
    
    - Implemented Tree-sitter integration
    - Added 10 read modes with AST parsing
    - Built pattern engine for CLI tools
    ...(40 more lines of detail)

commit b4e3c8d9f0a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5
...(hundreds more lines)

// Compressed output: ~120 tokens
Recent commits:
a3f2d9b feat: add token compression engine
b4e3c8d fix: handle edge case in diff mode
c5d4e3f docs: update installation guide
(showing 3 of 47 commits, use --full for complete log)
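
A compression rule of this kind can be sketched in a few lines. The version below is a rough approximation under stated assumptions: it only understands the default `git log` layout, the name `compress_git_log` and its `limit` parameter are hypothetical, and the second commit's author and date are made-up sample data. It is not the pattern module lean-ctx ships.

```python
import re

# Matches the "commit <sha>" header line of default `git log` output.
COMMIT_RE = re.compile(r"^commit ([0-9a-f]{7,40})\b")


def compress_git_log(raw: str, limit: int = 3) -> str:
    """Condense default `git log` output to 'short-sha subject' lines."""
    commits = []  # each entry is [short_sha, subject or None]
    for line in raw.splitlines():
        match = COMMIT_RE.match(line)
        if match:
            commits.append([match.group(1)[:7], None])
        elif (commits and commits[-1][1] is None
              and line.startswith("    ") and line.strip()):
            # The first indented, non-empty line after the headers is the subject.
            commits[-1][1] = line.strip()
    shown = commits[:limit]
    lines = ["Recent commits:"]
    lines += [f"{sha} {subject or '(no message)'}" for sha, subject in shown]
    lines.append(f"(showing {len(shown)} of {len(commits)} commits)")
    return "\n".join(lines)


RAW = """\
commit a3f2d9b8c1e4f6a8d9b2c3e4f5a6b7c8d9e0f1a2
Author: Alice Dev <alice@example.com>
Date:   Mon Jan 15 14:32:18 2024 -0800

    feat: add token compression engine

    - Implemented Tree-sitter integration

commit b4e3c8d9f0a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5
Author: Alice Dev <alice@example.com>
Date:   Mon Jan 15 09:10:02 2024 -0800

    fix: handle edge case in diff mode
"""

compressed = compress_git_log(RAW)
print(compressed)
```

Author, date, and body lines are dropped entirely, while the abbreviated SHA stays so the assistant can still ask for a specific commit.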

The governance system is less common in developer tools but critical for production AI workflows. lean-ctx treats every context exchange as an auditable event. When an AI assistant reads a file or executes a command, lean-ctx generates a cryptographic proof—a hash of what was actually sent to the LLM, signed with a timestamp and budget accounting. You configure profiles with token budgets (e.g., 100K tokens per session), role restrictions (which directories can be accessed), and quality gates (require human approval for certain operations). The 4-level verification engine checks: (1) role permissions, (2) budget availability, (3) content policy compliance, and (4) cryptographic proof validity.
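
The four checks can be pictured as a short pipeline that stops at the first failure. The sketch below is an assumption-laden illustration: `Profile`, `make_proof`, and `verify` are hypothetical names, the content policy is reduced to a list of blocked terms, and the "proof" is a plain SHA-256 digest rather than a real signature.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Profile:
    allowed_dirs: tuple        # role restriction: readable path prefixes
    token_budget: int          # per-session budget
    blocked_terms: tuple = ()  # simplified stand-in for a content policy
    spent: int = 0             # tokens already charged this session


def make_proof(content: str, timestamp: str, tokens: int) -> str:
    """Hash what was sent together with its timestamp and token cost."""
    payload = f"{timestamp}\n{tokens}\n{content}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def verify(profile, path, content, tokens, timestamp, proof):
    """Run the four checks in order and return (ok, reason)."""
    if not any(path.startswith(d) for d in profile.allowed_dirs):
        return (False, "role")
    if profile.spent + tokens > profile.token_budget:
        return (False, "budget")
    if any(term in content for term in profile.blocked_terms):
        return (False, "policy")
    if make_proof(content, timestamp, tokens) != proof:
        return (False, "proof")
    profile.spent += tokens  # charge the budget only after every check passes
    return (True, "ok")


profile = Profile(allowed_dirs=("src/",), token_budget=100, blocked_terms=("SECRET",))
stamp = "2024-01-15T14:32:18Z"
proof = make_proof("def f(): pass", stamp, 40)
print(verify(profile, "src/a.py", "def f(): pass", 40, stamp, proof))
print(verify(profile, "secrets/a.py", "def f(): pass", 40, stamp, proof))
```

Ordering the cheap checks first means a request outside the allowed directories is rejected before any hashing happens.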

The knowledge graph component maintains state across sessions. It's a temporal graph where nodes are facts ("function X calls function Y", "file A imports module B") with validity timestamps. When you refactor code, lean-ctx updates the graph's temporal dimension rather than deleting old facts—preserving the history of your codebase's structure. This powers cross-session recovery: if your AI assistant crashes mid-task, the next session can query "what was I working on?" and get structured context from the graph. The hybrid search combines BM25 full-text search, embedding-based semantic search, and graph proximity (facts near recently accessed nodes rank higher) to surface relevant context without manual specification.
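
The "update the temporal dimension rather than delete" behavior is easiest to see in code. What follows is a minimal sketch, assuming facts are stored as triples with a validity interval. The `Fact` and `TemporalGraph` names, the integer timestamps, and the method names are all hypothetical, and search ranking is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fact:
    subject: str
    relation: str
    obj: str
    valid_from: int
    valid_to: Optional[int] = None  # None means the fact is still valid


class TemporalGraph:
    def __init__(self):
        self.facts = []

    def assert_fact(self, subject, relation, obj, at):
        self.facts.append(Fact(subject, relation, obj, valid_from=at))

    def retire_fact(self, subject, relation, obj, at):
        """Close the validity interval instead of deleting the fact."""
        for fact in self.facts:
            same = (fact.subject, fact.relation, fact.obj) == (subject, relation, obj)
            if same and fact.valid_to is None:
                fact.valid_to = at

    def query(self, subject, relation, at):
        """Objects related to `subject` that were valid at time `at`."""
        return [f.obj for f in self.facts
                if f.subject == subject and f.relation == relation
                and f.valid_from <= at and (f.valid_to is None or at < f.valid_to)]


graph = TemporalGraph()
graph.assert_fact("record_usage", "calls", "compress_context", at=1)
# A refactor at time 5 replaces the callee; the old fact is kept as history.
graph.retire_fact("record_usage", "calls", "compress_context", at=5)
graph.assert_fact("record_usage", "calls", "verify_proof", at=5)

print(graph.query("record_usage", "calls", at=3))
print(graph.query("record_usage", "calls", at=7))
```

Because retired facts stay in the store, a later session can ask what the call structure looked like before a refactor as well as what it looks like now.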

Gotcha

lean-ctx's comprehensive feature set is both its strength and primary limitation. With 59 MCP tools, 95+ compression patterns, and multiple subsystems (knowledge graph, governance, shell hooks, LSP integration), the learning curve is steep. The documentation covers the breadth, but troubleshooting requires understanding which layer is involved—is your compression not working because of a pattern mismatch, a read mode configuration, or shell hook installation? The error messages are improving but still assume familiarity with the architecture.

The local-only architecture is a deliberate choice for privacy and zero cloud dependency, but it means no built-in team collaboration. On a team, each developer maintains their own knowledge graph and context packages, with no synchronization mechanism for shared learned patterns or coordinated governance policies. You can export and import context packages manually, but this isn't a seamless team experience. Organizations that want centralized policy enforcement and shared context across developers will need to build wrapper tooling or wait for future features. The project is also under active development: 1,827 stars and rapid growth point to strong interest, but the pace of change brings potential API instability, and the dashboard is marked beta. Expect some rough edges and breaking changes.

Verdict

Use lean-ctx if you're working on large codebases with AI assistants and your token costs are becoming material, you need audit trails and governance for AI-assisted development (particularly in regulated industries), or you want deep observability into what context your AI tools actually consume. It's especially valuable if you're frustrated by AI assistants reading irrelevant files or generating bloated responses. The tool shines in production AI workflows where context management is a first-class concern.

Skip it if you're just experimenting with AI coding tools, prefer minimal configuration over powerful features, need cloud-based team collaboration for shared knowledge and policies, or work on small projects where token optimization doesn't move the needle. lean-ctx is comprehensive infrastructure: its value scales directly with the complexity and scale of your AI-assisted development workflow.
