Building Safety Rails for AI Code Assistants: Inside claude-code-hooks
Hook
Within 48 hours of deploying Claude Code at a fintech startup, an AI assistant nearly executed rm -rf / on a production server while ‘cleaning up temporary files.’ The only thing that stopped it? A 50-line Node.js script intercepting the command before execution.
Context
AI coding assistants have crossed the threshold from autocomplete toys to autonomous agents that can read, write, and execute code across your entire filesystem. Claude Code, Anthropic’s AI-powered development tool, can spawn bash commands, modify files, and navigate codebases—all while you review pull requests or grab coffee. This power creates a terrifying paradox: the more useful these tools become, the more catastrophic their mistakes.
Traditional safety mechanisms don’t translate well to AI assistants. Git hooks fire too late (after files are already modified). IDE save-guards only catch syntax errors. Containerization breaks the assistant’s ability to access your actual development environment. What’s needed is middleware—a way to intercept AI actions before they touch your system, evaluate them against your policies, and either approve, block, or modify them. This is the gap claude-code-hooks fills: a collection of executable scripts that sit between Claude’s decision-making and your filesystem, acting as programmable guardrails for AI-driven development.
Technical Insight
Claude Code implements a hook system modeled after Git hooks but designed for AI interaction patterns. When Claude decides to execute a bash command or modify a file, it can optionally invoke user-defined executables at specific lifecycle points: pre-tool-use, post-tool-use, and notification. These hooks receive JSON payloads via stdin describing what Claude intends to do, and respond via stdout with approval, denial, or modifications.
The architecture is deliberately simple. Here’s how a basic security hook works:
```javascript
// security-check.js
const readline = require('readline');
const rl = readline.createInterface({ input: process.stdin });

let buffer = '';
rl.on('line', (line) => { buffer += line; });
rl.on('close', () => {
  const event = JSON.parse(buffer);

  // Claude is about to run a bash command
  if (event.tool === 'Bash' && event.phase === 'pre-tool-use') {
    const command = event.parameters.command;

    // Check against dangerous patterns
    const criticalPatterns = [
      /rm\s+-rf\s+\/[^/]*/,  // Recursive delete from root
      /:\s*\(\s*\)\s*\{/,    // Fork bombs
      />\s*\/dev\/sda/,      // Direct disk writes
    ];

    for (const pattern of criticalPatterns) {
      if (pattern.test(command)) {
        console.log(JSON.stringify({
          action: 'block',
          reason: `Blocked dangerous command: ${command}`
        }));
        process.exit(0);
      }
    }
  }

  // Approve the operation
  console.log(JSON.stringify({ action: 'allow' }));
});
```
This hook configuration lives in .claude/settings.json, where you map hooks to specific tools and phases:
```json
{
  "hooks": {
    "pre-tool-use": {
      "Bash": "node .claude/hooks/security-check.js",
      "Write": "node .claude/hooks/file-guard.js"
    },
    "post-tool-use": {
      "Write": "node .claude/hooks/auto-stage.js"
    }
  }
}
```
The repository includes three production-ready hooks that demonstrate different interception patterns. The security-check.js hook implements a three-tier threat model (critical/high/strict) that blocks progressively more operations. Critical mode only stops catastrophic commands like recursive deletions and fork bombs. High mode adds protections for package manager operations and network commands. Strict mode becomes paranoid, blocking sudo, environment modifications, and process management.
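The tier logic can be sketched as a cumulative pattern table, where each level enforces its own patterns plus everything below it. The patterns here are illustrative stand-ins, not the repository's actual lists:

```javascript
// Sketch of a tiered threat model: stricter levels inherit all
// patterns from the levels below them. Patterns are hypothetical.
const TIERS = {
  critical: [
    /rm\s+-rf\s+\/[^/]*/,          // recursive delete from root
    /:\s*\(\s*\)\s*\{/,            // fork bomb definition
  ],
  high: [
    /\b(npm|pip|gem)\s+install\b/, // package manager operations
    /\bcurl\b.*\|\s*(ba)?sh/,      // piping downloads into a shell
  ],
  strict: [
    /\bsudo\b/,                    // privilege escalation
    /\bexport\s+\w+=/,             // environment modification
    /\bkill(all)?\b/,              // process management
  ],
};

// Every tier at or below the configured level is enforced.
function activeTiers(level) {
  const order = ['critical', 'high', 'strict'];
  return order.slice(0, order.indexOf(level) + 1);
}

function evaluate(command, level) {
  for (const tier of activeTiers(level)) {
    for (const pattern of TIERS[tier]) {
      if (pattern.test(command)) {
        return { action: 'block', tier, reason: `matched ${tier} pattern` };
      }
    }
  }
  return { action: 'allow' };
}
```

The cumulative design means raising the level never loosens a check: a command blocked at critical stays blocked at strict.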
The file-guard.js hook protects sensitive files by matching paths against configurable patterns. It prevents Claude from reading .env files, SSH keys, or credentials stored in config directories. This matters because Claude’s context window becomes a potential exfiltration vector: if the assistant reads your AWS credentials to ‘help configure deployment,’ those secrets now live on Anthropic’s servers.
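A minimal sketch of that decision, assuming deny patterns like those just described and a hypothetical file_path parameter name (the real file-guard.js loads its patterns from configuration):

```javascript
// Hypothetical deny list; the real hook reads patterns from config.
const DENY_PATTERNS = [
  /(^|\/)\.env(\..*)?$/, // .env, .env.local, etc.
  /(^|\/)\.ssh\//,       // anything under an .ssh directory
  /credentials/i,        // credential files in config directories
];

function guardFile(event) {
  // Only inspect events that carry a file path (parameter name assumed).
  const path = event.parameters && event.parameters.file_path;
  if (!path) return { action: 'allow' };
  for (const pattern of DENY_PATTERNS) {
    if (pattern.test(path)) {
      return { action: 'block', reason: `Protected file: ${path}` };
    }
  }
  return { action: 'allow' };
}
```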
The auto-stage.js hook demonstrates post-operation automation. After Claude modifies files, it automatically stages them with git, adding commit message suggestions based on the changes. This creates a reviewable audit trail of AI modifications without manual git add commands.
What makes this architecture elegant is its statelessness. Hooks don’t maintain databases or persistent connections. Each invocation is independent, receiving everything needed in the input JSON. This means hooks are trivial to test, debug, and compose. The repository includes 262 tests covering edge cases like empty commands, path traversal attempts, and Unicode obfuscation:
```javascript
// Example test case; runHook is a test helper that feeds the event
// JSON to the hook process and parses its stdout response.
test('blocks rm -rf with path traversal', () => {
  const event = {
    tool: 'Bash',
    phase: 'pre-tool-use',
    parameters: { command: 'rm -rf /../../../' }
  };
  const result = runHook(event);
  expect(result.action).toBe('block');
  expect(result.reason).toContain('dangerous command');
});
```
The event-logger utility (event-logger.py) is especially clever for hook developers. It acts as a passthrough hook that logs all events to a file while allowing operations to proceed. This lets you observe exactly what payloads Claude generates for different actions, making custom hook development feel less like archaeology.
Gotcha
The hook system’s process-spawning model introduces measurable latency. Every intercepted operation requires Node.js or Python interpreter startup, JSON parsing, pattern matching, and IPC communication. In practice, this adds 50-200ms per operation. For a refactoring session where Claude modifies 30 files, you’re adding 1.5-6 seconds of overhead. This isn’t catastrophic, but it’s perceptible—especially during rapid iteration.
More concerning is the fundamental limitation of heuristic security. The hooks analyze command strings and file paths as text, which means sufficiently obfuscated attacks bypass detection. A command like bash -c $(echo 'cm0gLXJmIC8=' | base64 -d) might slip through naive pattern matching since the decoded payload doesn’t exist in the analyzed string. The repository’s security hooks implement some obfuscation detection (checking for base64, eval, and encoded characters), but this becomes an arms race. You’re not protecting against adversarial AI or compromised Claude instances—you’re preventing accidents and catching obvious mistakes. Treat these hooks as seatbelts, not firewalls.
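Obfuscation heuristics of the kind described above might be sketched like this; the patterns are illustrative, not the repository's actual checks:

```javascript
// Hypothetical hints that a command string is hiding its real payload.
const OBFUSCATION_HINTS = [
  /base64\s+(-d|--decode)/, // decoding an embedded payload
  /\beval\b/,               // evaluating constructed strings
  /\\x[0-9a-fA-F]{2}/,      // hex-escaped characters
  /\$\(.*\)/,               // command substitution hiding intent
];

function looksObfuscated(command) {
  return OBFUSCATION_HINTS.some((p) => p.test(command));
}
```

Flagging rather than blocking on these hints is a reasonable middle ground, since command substitution also appears in plenty of legitimate shell usage.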
The hook system also can’t access Claude’s reasoning or modify its context. If Claude is fundamentally confused about your codebase structure, hooks can only reject its actions, not fix its understanding. You’ll end up in loops where Claude attempts an operation, gets blocked, tries a slightly different approach, gets blocked again, and burns through API calls without progress. There’s no mechanism for hooks to inject corrective information back into Claude’s decision-making process.
Verdict
Use if: You’re running Claude Code with access to production systems, repositories containing secrets, or any environment where accidental destructive operations would cause real damage. The security and file-guard hooks provide essential protection against AI-induced disasters for minimal setup cost. Also use if you’re building team workflows around Claude and need audit trails, notifications, or automated git staging—the hook system is the right extension point for this automation.

Skip if: You’re working in fully sandboxed development environments where Claude literally cannot harm anything important. The latency overhead isn’t worth it for exploratory coding in disposable containers or test VMs. Also skip if you need deep integration with Claude’s reasoning or want to modify how it plans operations—hooks only see the final tool invocations, not the thinking behind them. For those cases, you’d need to work at the API level or wait for Anthropic to expose deeper extension points.