Building Safety Rails for AI Coding Assistants with Claude Code Hooks

Hook

Your AI coding assistant just tried to run rm -rf / after misinterpreting your refactoring request. Wouldn't it be nice if something caught that before execution?

Context

AI coding assistants like Claude Code, GitHub Copilot, and Cursor have fundamentally changed how developers write code. They're remarkably good at generating boilerplate, refactoring logic, and even architecting solutions. But there's an uncomfortable truth: these tools can execute commands with the same permissions you have. When Claude Code interprets "clean up the build artifacts" as "delete everything in the current directory," or accidentally echoes your AWS credentials into a log file, you need more than an undo button.

Traditional safety mechanisms don't map well to AI assistants. Pre-commit hooks run too late—after the AI has already modified files. IDE extensions can't intercept command execution. And the AI providers themselves can't anticipate every dangerous scenario in your specific codebase. The claude-code-hooks repository tackles this gap by providing an interception layer that sits between Claude Code's decision-making and actual execution, using the hook system that Anthropic built directly into their coding assistant.

Technical Insight

The architecture of claude-code-hooks is elegantly simple: it's a collection of Node.js scripts that communicate with Claude Code through JSON over stdin/stdout. When Claude Code is about to execute a tool—whether that's running a bash command, reading a file, or editing code—it can invoke a hook script first. The hook receives the action details, applies whatever logic you've defined, and returns a decision: allow, block, or modify.

Here's what a basic PreToolUse hook looks like in practice:

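const readline = require('readline');

// Collect the whole JSON payload from stdin before deciding.
const rl = readline.createInterface({ input: process.stdin });

let inputData = '';
rl.on('line', (line) => { inputData += line; });

rl.on('close', () => {
  let payload;
  try {
    payload = JSON.parse(inputData);
  } catch (err) {
    // A crash here would fail open, so report a block decision explicitly.
    console.log(JSON.stringify({
      action: 'block',
      message: 'Hook received malformed JSON'
    }));
    return;
  }

  // Fall back to safe defaults so missing properties cannot throw.
  const tool = payload ? payload.tool : undefined;
  const params = (payload && payload.params) || {};
  const command = typeof params.command === 'string' ? params.command : '';

  // Block dangerous rm commands
  if (tool === 'Bash' && command.includes('rm -rf /')) {
    console.log(JSON.stringify({
      action: 'block',
      message: 'Prevented potentially destructive rm command'
    }));
    return;
  }

  // Allow everything else
  console.log(JSON.stringify({ action: 'allow' }));
});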

This pattern—read JSON from stdin, process it, write JSON to stdout—makes hooks language-agnostic. While this repository uses JavaScript, you could write hooks in Python, Go, or even Bash scripts. The contract is simple: Claude Code serializes the tool invocation details, your hook script decides what to do, and Claude Code respects that decision.

The real sophistication comes in how the existing hooks implement safety checking. The security-hook.js, for instance, uses a multi-tiered threat model. It categorizes dangers into severity levels: CRITICAL (commands that could destroy data or expose credentials), HIGH (operations that could compromise security), and STRICT (anything requiring manual review). Here's the pattern matching logic for detecting secret exposure:

const SECRET_PATTERNS = [
  /AWS_SECRET_ACCESS_KEY/i,
  /PRIVATE_KEY/i,
  /-----BEGIN (RSA|OPENSSH) PRIVATE KEY-----/,
  /ghp_[a-zA-Z0-9]{36}/, // GitHub personal access tokens
  /sk-[a-zA-Z0-9]{48}/, // OpenAI API keys
];

function checkForSecrets(content, filepath) {
  for (const pattern of SECRET_PATTERNS) {
    if (pattern.test(content)) {
      return {
        action: 'block',
        severity: 'CRITICAL',
        message: `Prevented ${filepath} operation: content contains potential secrets`,
        pattern: pattern.toString()
      };
    }
  }
  return null;
}
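
The severity tiers described above could be wired up roughly as follows. This is a minimal sketch, not the repository's actual rule set: the `RULES` table, its three patterns, and the `classify` function are hypothetical, and the "first matching rule wins" ordering is an assumption made for illustration.

```javascript
// Hypothetical rule table: tier names come from the article, patterns are illustrative only.
// Rules are listed from most to least severe; the first match wins (an assumption).
const RULES = [
  { severity: 'CRITICAL', pattern: /\brm\s+-rf\b/, reason: 'recursive force delete' },
  { severity: 'HIGH', pattern: /\bchmod\s+777\b/, reason: 'world-writable permissions' },
  { severity: 'STRICT', pattern: /\bcurl\b.*\|\s*(ba)?sh\b/, reason: 'piping a download into a shell' },
];

// Return the first matching rule's severity and reason, or null when nothing matches.
function classify(command) {
  if (typeof command !== 'string') {
    return null;
  }
  for (const rule of RULES) {
    if (rule.pattern.test(command)) {
      return { severity: rule.severity, reason: rule.reason };
    }
  }
  return null;
}
```

A hook could then map each tier to a response of its choosing, for example blocking CRITICAL outright while surfacing STRICT matches for manual review.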

The hook system supports multiple lifecycle stages beyond just PreToolUse. PostToolUse hooks run after execution, allowing you to audit what actually happened, send notifications, or validate outputs. The notification-hook.js demonstrates this by integrating with Slack, Discord, or custom webhooks whenever Claude Code performs significant actions:

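// PostToolUse notification example.
// sendSlackNotification is assumed to be an async helper defined elsewhere in the hook.
async function notifyOnEnvWrite(tool, params) {
  // Guard against a missing or non-string path before inspecting it.
  const path = params && typeof params.path === 'string' ? params.path : '';
  if (tool === 'Write' && path.endsWith('.env')) {
    await sendSlackNotification({
      text: `⚠️ Claude Code modified .env file`,
      details: {
        file: path,
        timestamp: new Date().toISOString(),
        user: process.env.USER
      }
    });
  }
}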

What makes this repository particularly valuable as a learning resource is the event-logger utility. Before writing custom hooks, you need to understand what data Claude Code actually sends. The event logger captures real payloads so you can see exactly what properties are available for each tool type. This is crucial because the hook API isn't extensively documented—the logger becomes your specification.
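
To make the idea concrete, a payload logger can be as small as the sketch below. It is not the repository's event-logger: `formatEvent`, `logEvent`, the `loggedAt` field, and the JSON Lines file layout are all assumptions chosen for illustration.

```javascript
const fs = require('node:fs');

// Wrap a payload with a timestamp and serialize it as one JSON Lines record.
// The { loggedAt, payload } shape is a hypothetical format, not the repository's.
function formatEvent(payload, now = new Date()) {
  return JSON.stringify({ loggedAt: now.toISOString(), payload }) + '\n';
}

// Append one record to the log file, creating the file if it does not exist yet.
function logEvent(payload, logPath) {
  fs.appendFileSync(logPath, formatEvent(payload));
}
```

Calling `logEvent(payload, 'hook-events.jsonl')` from a hook that otherwise allows everything would leave one line per tool invocation, which can then be inspected to see which properties each tool type actually carries.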

The test suite reveals another architectural decision worth noting: hooks are designed to be fast and fail-safe. Each hook has a timeout (typically 5 seconds), and if a hook crashes or hangs, Claude Code assumes "allow" rather than blocking your workflow. This means your safety logic needs to be defensive—validate inputs, handle edge cases, and never assume the payload structure. The 262 passing tests cover scenarios like malformed JSON, missing properties, and edge cases in pattern matching that would cause naive implementations to crash and fail open.
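
One way to keep a crash from silently turning into "allow" is to wrap the decision logic so that every failure path produces an explicit block. The sketch below assumes the `action` values used in this article's examples; `safeDecide` and `VALID_ACTIONS` are hypothetical names, not part of the repository.

```javascript
// Decision values taken from the article's description of the contract (an assumption).
const VALID_ACTIONS = new Set(['allow', 'block', 'modify']);

// Run decideFn defensively: bad payloads, thrown errors, and malformed results all block.
function safeDecide(decideFn, payload) {
  try {
    if (payload === null || typeof payload !== 'object') {
      return { action: 'block', message: 'Payload is not an object' };
    }
    const result = decideFn(payload);
    if (!result || !VALID_ACTIONS.has(result.action)) {
      return { action: 'block', message: 'Hook logic returned an unrecognized decision' };
    }
    return result;
  } catch (err) {
    return { action: 'block', message: `Hook logic threw: ${err.message}` };
  }
}
```

Failing closed trades convenience for safety: a buggy rule will interrupt the workflow, but it will do so visibly rather than letting a dangerous command through.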

Gotcha

The fundamental limitation is that hooks are only as smart as their pattern matching. If you write a hook that blocks commands containing "rm -rf", Claude Code could potentially generate r''m -rf, or split the command across a variable as in X=rm; $X -rf, and neither contains the literal string. While Claude Code isn't actively trying to evade your hooks (unlike a malicious actor), the probabilistic nature of LLM outputs means dangerous commands can be phrased in unexpected ways. You're playing an asymmetric game: your security patterns need to catch every possible dangerous variant, while the AI only needs to generate one pattern you didn't anticipate.
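
The gap is easy to demonstrate. The sketch below compares a naive substring check with a version that normalizes the command first; `naiveMatch`, `normalize`, and `normalizedMatch` are hypothetical helpers written for this illustration, and the normalization rules are deliberately simplistic.

```javascript
// Naive check: literal substring match only.
function naiveMatch(command) {
  return command.includes('rm -rf');
}

// Remove quote characters and backslashes, then collapse runs of whitespace.
// This undoes a few trivial obfuscations; it is not a shell parser.
function normalize(command) {
  return command
    .replace(/['"\\]/g, '')
    .replace(/\s+/g, ' ')
    .trim();
}

// Same substring check, applied to the normalized command.
function normalizedMatch(command) {
  return normalize(command).includes('rm -rf');
}
```

Normalization catches the quoted and extra-whitespace variants that the naive check misses, but a command assembled through variables, such as `X=rm; $X -rf build`, still passes both checks because the dangerous string only exists after the shell expands it.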

The repository's current collection is also quite limited—just four production hooks despite significant community interest (373 stars). This suggests you'll be writing custom hooks for most real-world scenarios. The documentation is minimal beyond code comments, so expect to spend time with the event logger understanding payload structures. There's also no standardized hook distribution mechanism; you're copying JavaScript files into your project and maintaining them yourself. Updates to Claude Code's hook API could break your hooks without warning, and since this is a community project rather than officially supported by Anthropic, you're on your own for compatibility issues.

Verdict

Use if: You're actively using Claude Code in a team environment where AI mistakes could have real consequences: exposing secrets, deleting production data, or making unauthorized API calls. The security-hook alone justifies adoption as an insurance policy against expensive accidents. Also use this if you're building your own AI coding tools and want to understand how event-driven safety mechanisms work; the patterns here are applicable beyond Claude Code.

Skip if: You're using other AI assistants like Cursor or GitHub Copilot (hooks won't work outside Claude Code), or if you're a solo developer willing to carefully review every AI action manually. Also skip if you need enterprise-grade policy enforcement with audit trails and compliance certifications, since this is a community tool without those guarantees. Consider alternatives like aider with its built-in git safety or GitHub Copilot Enterprise if you need vendor support.
