Mods: Bringing LLMs into Your Unix Pipeline Before It Disappears

Hook

The most elegant AI CLI tool you'll use is being sunset in March 2026, but its pipeline-first philosophy deserves your attention before it's gone.

Context

For decades, Unix users have composed powerful workflows by chaining small, specialized tools through pipes. Need to count error lines in a log? grep ERROR app.log | wc -l. Want to monitor system resources? ps aux | sort -nrk 3,3 | head -5. This composability is Unix's superpower. But when Large Language Models arrived, the tooling ecosystem fractured. Most AI interfaces were web dashboards, Electron apps, or Python scripts that treated the terminal as an afterthought. You couldn't pipe your access logs to GPT-4 for analysis, or feed compiler errors into Claude for debugging suggestions—at least not without writing glue code.

Mods emerged from Charmbracelet to solve this impedance mismatch. Rather than building yet another chat interface, the team asked: what if AI was just another filter in your pipeline? What if you could cat error.log | mods "explain these errors" as naturally as you'd pipe to grep or awk? The tool reads from stdin, combines it with your prompt, queries an LLM, and writes to stdout. It's deceptively simple, but this Unix-first design unlocks AI integration in places you wouldn't expect: Git hooks, CI/CD scripts, system monitoring, even shell aliases. Though Charmbracelet announced Mods will sunset on March 9, 2026 in favor of their new tool Crush, the architectural lessons here matter for anyone building CLI tools in the AI era.

Technical Insight

Mods' architecture is straightforward: it's a Go binary that orchestrates API calls while respecting Unix conventions. When you run mods "your prompt", it checks for stdin input, combines it with your prompt text, sends the payload to your configured LLM provider, and streams the response back. The magic is in the details.
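The stdin-plus-prompt flow can be imitated in a few lines of shell. This is an illustrative sketch of the pattern, not Mods' actual Go implementation: build_payload is a hypothetical helper, and placing the prompt before the piped content is an assumption made for the example.

```shell
# Illustrative only: mimics the "stdin as context" pattern in plain shell.
# build_payload is a hypothetical helper, not part of Mods.
build_payload() {
  prompt="$1"
  if [ -t 0 ]; then
    # stdin is a terminal: nothing was piped in, so the prompt stands alone
    printf '%s\n' "$prompt"
  else
    # stdin is a pipe or file: emit the prompt, a blank line, then the piped content
    printf '%s\n\n' "$prompt"
    cat
  fi
}
```

Running printf 'disk full\n' | build_payload "explain these errors" prints the prompt, a blank line, and then the piped text, which is roughly the combined payload a filter of this kind would send to a provider.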

The tool supports multiple providers through a unified interface: OpenAI, LocalAI, Cohere, Groq, and Azure OpenAI. Configuration lives in ~/.config/mods/mods.yml on Linux (running mods --settings opens the file wherever it lives on your platform), where you define API keys, default models, and formatting preferences. Here's a practical example of piping a Git diff through Mods for commit message generation:

# Generate a commit message from staged changes
git diff --staged | mods "Write a conventional commit message for these changes"

# Output:
# feat(auth): add JWT token refresh mechanism
#
# - Implement refresh token rotation in authentication middleware
# - Add expiration checking with 5-minute buffer
# - Update tests for token lifecycle scenarios

This works because Mods treats stdin as context, not just input. The LLM receives both the piped content and your prompt, allowing it to analyze, transform, or summarize the data. You can chain this further:

# Find the most memory-intensive processes and get optimization suggestions
ps aux | sort -nrk 4,4 | head -10 | mods "What might be causing high memory usage?"
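The commit-message pipeline from the git diff example also fits in a Git hook. Below is a minimal sketch of a prepare-commit-msg hook body. The function name suggest_commit_message is hypothetical, and the sketch assumes mods is on PATH and already configured with a provider. It fails open: if nothing is staged or the call fails, the commit proceeds with Git's normal template.

```shell
#!/bin/sh
# Sketch for .git/hooks/prepare-commit-msg.
# Git passes the message file as $1 and the message source (if any) as $2.
# suggest_commit_message is a hypothetical name; adapt the prompt to taste.
suggest_commit_message() {
  msg_file="$1"

  # Only act on a plain `git commit` (no -m, merge, amend, or template source).
  [ -z "${2:-}" ] || return 0

  # Nothing staged means nothing to summarize.
  diff="$(git diff --staged)"
  [ -n "$diff" ] || return 0

  # Ask for a message; on any failure, leave Git's template untouched.
  suggestion="$(printf '%s\n' "$diff" |
    mods --raw "Write a conventional commit message for these changes")" || return 0
  [ -n "$suggestion" ] || return 0

  # Put the suggestion first and keep the original template below it.
  { printf '%s\n' "$suggestion"; cat "$msg_file"; } > "$msg_file.tmp" &&
    mv "$msg_file.tmp" "$msg_file"
}

# In the real hook file, finish with: suggest_commit_message "$@"
```

Because the suggestion is only prepended to the message file, it shows up in your editor as a draft you can accept, edit, or delete before the commit is recorded.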

Conversation persistence uses SHA-1 identifiers, borrowing from Git's design. When you start a conversation, Mods generates a hash like a3f5b2c. You can continue that conversation later with mods --continue a3f5b2c "follow-up question". Conversations are stored as JSON files in ~/.local/share/mods/conversations/, making them greppable and version-controllable. Unlike cloud-based chat interfaces, your history is local, plaintext, and yours.
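Titles make these threads easier to reuse than raw hashes. The wrappers below are a small sketch: start_thread and continue_thread are hypothetical helper names, while --title and --continue are the Mods flags for naming a conversation and resuming one by title or SHA-1. The kubectl command and pod name in the example session are placeholders.

```shell
# Hypothetical helpers around Mods' conversation flags.

# start_thread TITLE PROMPT: begin a conversation saved under TITLE.
start_thread() {
  mods --title "$1" "$2"
}

# continue_thread TITLE_OR_SHA PROMPT: resume a saved conversation.
continue_thread() {
  mods --continue "$1" "$2"
}

# Example session (requires a configured mods install):
#   kubectl describe pod api-0 | start_thread k8s-debug "why is this pod restarting?"
#   continue_thread k8s-debug "how would I raise the memory limit?"
```

Both helpers pass stdin straight through, so they can sit at the end of a pipeline just like a bare mods call.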

The roles system deserves special attention. In ~/.config/mods/mods.yml, you define reusable system prompts:

# Each role is a list of system messages sent before your prompt
roles:
  shell-expert:
    - You are a Unix shell expert.
    - Provide concise, safe commands.
    - Explain dangerous operations.
    - Always include error handling.
  code-reviewer:
    - You are a senior code reviewer.
    - Focus on bugs, security issues, and performance problems.
    - Be specific about line numbers.

Then invoke them with mods --role shell-expert "how do I recursively delete empty directories?". This transforms Mods from a generic AI wrapper into a specialized tool for different contexts. You could create roles for SQL optimization, AWS troubleshooting, or Kubernetes debugging—whatever matches your workflow.

The formatting system handles rendering automatically. Mods renders Markdown in LLM responses for the terminal using Charmbracelet's Glamour library, including syntax-highlighted code blocks. You can disable this with --raw for machine-readable output, or use --format to ask the model to answer in Markdown. This makes it pipeline-friendly in both human and script contexts:

# Human-readable with colors
cat main.go | mods "find potential race conditions"

# Machine-readable for further processing
cat main.go | mods --raw "list function names" | sort | uniq
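Even with --raw, models often wrap their answers in Markdown code fences, which can break downstream parsing. A small filter can drop those fence lines before other tools see the output. This is a sketch under that assumption; strip_fences is a hypothetical helper, not a Mods feature.

```shell
# strip_fences: remove Markdown fence lines from stdin, pass everything else through.
strip_fences() {
  sed '/^[[:space:]]*```/d'
}

# Example (requires a configured mods install):
#   mods --raw "list function names, one per line" < main.go | strip_fences | sort -u
```

The filter only deletes the fence lines themselves, so any prose the model adds around the code still reaches the next stage; a stricter prompt is the usual way to suppress that.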

Because it's written in Go, Mods compiles to a single static binary with no runtime dependencies. Installation is brew install mods or downloading a release. Compare this to Python-based alternatives requiring virtual environments, dependency resolution, and version management. For shell integration, that simplicity matters—you can drop Mods into Docker images, CI runners, or remote servers without dependency hell.

Gotcha

The elephant in the room: Mods is being sunset on March 9, 2026. Charmbracelet is moving development to Crush, its next-generation tool, which offers a non-interactive mode through crush run. Mods will keep working after that date (it's open source, after all), but you won't get bug fixes, security updates, or new LLM provider support from the original team. If you're building production automation around Mods, you need a migration plan. Crush aims to maintain similar pipeline semantics, but you'll still need to rewrite configurations and potentially adjust scripting logic.

Token limits and API costs can surprise you. When you pipe large files through Mods, that content goes to the LLM provider. A 50KB log file might consume 15,000 tokens, costing real money on services like GPT-4. Mods has no built-in cost tracking or usage warnings. Its one guard is a client-side cap on input size (max-input-chars in the config, lifted with --no-limit), and that cap limits characters, not spend. You can still rack up API bills by repeatedly piping verbose outputs through expensive models. Always check your provider's pricing and consider setting up billing alerts. For large-scale usage, LocalAI with self-hosted models makes more sense, but setup complexity increases dramatically.
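A cheap guard is to measure and cap input before it reaches a paid API. The sketch below uses the common rule of thumb of roughly four bytes per token for English text, which is only an approximation since real tokenizers vary by model and content. The names estimate_tokens and cap_input are hypothetical helpers, and app.log is a placeholder.

```shell
# estimate_tokens: rough token count for stdin, assuming ~4 bytes per token.
estimate_tokens() {
  bytes=$(wc -c)
  echo $(( (bytes + 3) / 4 ))   # integer division, rounded up
}

# cap_input [MAX_BYTES]: keep only the last MAX_BYTES of stdin (default 16000),
# on the assumption that the newest log lines are the relevant ones.
# Note: the cut is byte-based, so the first kept line may be partial.
cap_input() {
  tail -c "${1:-16000}"
}

# Example (requires a configured mods install):
#   estimate_tokens < app.log
#   cap_input 16000 < app.log | mods "explain these errors"
```

With this heuristic a 50,000-byte file comes out at 12,500 tokens. Dense log lines full of timestamps and IDs often tokenize less efficiently than prose, so treat the figure as a rough estimate rather than a quote.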

The conversation history system, while elegant, lacks advanced features like branching, merging, or searching across conversations. You get linear conversation threads identified by SHA-1 hashes, but no way to search "what did I ask about Kubernetes last month?" without manually grepping JSON files. The --list flag shows recent conversations, but organization becomes unwieldy over time. Third-party tools like fzf can help, but you'll need to build that integration yourself.

Verdict

Use Mods if: you want to experiment with AI-powered shell workflows right now, you value Unix pipeline composability over feature completeness, you're comfortable with a tool that has an expiration date, or you need a lightweight way to integrate multiple LLM providers into existing scripts without heavy dependencies. It's perfect for one-off log analysis, Git workflow automation, or learning how AI can augment command-line work. Just go in with eyes open about the 2026 sunset and plan accordingly.

Skip if: you need long-term support guarantees for production systems (migrate to Crush instead), you want advanced conversation management with search and branching, you prefer GUI interfaces, or you need agent frameworks with tool use and complex orchestration.

Mods is a focused pipeline tool, not a comprehensive AI platform, and its days are numbered: use it as inspiration or a stopgap, not a foundation.
