gf: How a 200-Line Go Wrapper Turned grep Into a Security Research Power Tool

Hook

Security researchers who audit hundreds of codebases don't memorize complex regex patterns—they built a tiny Go wrapper that stores them as JSON files and made grep 10x more productive.

Context

Anyone who's done security auditing or extensive code reviews knows the pain: you find yourself typing the same monstrous grep commands over and over. grep -r -E '(system|exec|passthru|shell_exec)\s*\(' . --include='*.php' to find potential PHP RCE vectors. grep -HnrE '['"]\s*(https?://|//)[^'"]*' for hardcoded URLs. Each pattern is a landmine of regex escaping, flag combinations, and typos that can silently return incomplete results.

The traditional solutions were inadequate. Shell aliases break down when you need different variations of the same pattern. Copying from a personal wiki means constantly context-switching. IDE-integrated search is powerful but tied to a specific editor and project context. TomNomNom (Tom Hudson), a security researcher known for his bug bounty tools, built gf to solve his own workflow friction: a wrapper that turns grep patterns into version-controlled, shareable JSON files with memorable names. Instead of memorizing regex, you type gf aws-keys or gf php-sinks and let the tool handle the complexity. What started as a personal productivity hack became essential infrastructure for security researchers worldwide.

Technical Insight

System architecture — auto-generated

The genius of gf is in what it doesn't do. At its core, it's remarkably simple: read a JSON file, construct a command, execute it. The entire main package is under 200 lines of Go code. When you run gf jwt, the tool looks for ~/.gf/jwt.json, parses the pattern definition, and spawns grep (or whatever search engine you've configured) with the specified flags and patterns.

Here's what a pattern file looks like:

{
    "flags": "-HnrE",
    "patterns": [
        "eyJ[A-Za-z0-9_-]*\\.[A-Za-z0-9_-]*\\.[A-Za-z0-9_-]*"
    ]
}

This JWT pattern demonstrates the format: flags contains the grep options (H for filename, n for line numbers, r for recursive, E for extended regex), and patterns is an array of regex strings. When you run gf jwt in a directory, gf transforms this into grep -HnrE 'eyJ[A-Za-z0-9_-]*\.[A-Za-z0-9_-]*\.[A-Za-z0-9_-]*' . and executes it.

The power comes from composability and engine swapping. You can specify multiple patterns in a single file, and gf will run grep once per pattern. More importantly, you can override the default engine:

{
    "engine": "ag",
    "flags": "--php",
    "patterns": [
        "\\$_(GET|POST|COOKIE|REQUEST|SERVER)",
        "file_get_contents\\s*\\(\\s*\\$",
        "eval\\s*\\("
    ]
}

This pattern uses The Silver Searcher (ag) instead of grep, leveraging ag's built-in file type filtering with the --php flag. The architecture is deliberately engine-agnostic—as long as your search tool accepts similar command-line arguments, gf can orchestrate it. Security researchers have created pattern files for ripgrep, ack, and even custom search tools.

The workflow integration is where gf shines in practice. Combined with shell completion (gf provides scripts for bash and zsh), you can tab-complete pattern names. Create a pattern on the fly with gf -save api-endpoints, and gf captures the last command as a reusable pattern. This transforms grep from a one-off search into a curated library of institutional knowledge.

For security work, the real value is standardization. A team can maintain a shared repository of patterns for common vulnerability classes:

{
    "flags": "-HnriE",
    "patterns": [
        "(sql|mysql|pg)_?query\\s*\\([^,]*\\$",
        "(SELECT|INSERT|UPDATE|DELETE).*\\$",
        "\\$.*WHERE.*\\$",
        "mysqli?_.*\\(.*\\$.*\\)"
    ]
}

This SQL injection pattern file (saved as sqli.json) looks for common PHP database query patterns where user input might be concatenated. Run gf sqli across a legacy codebase and you'll surface potential injection points in seconds. The JSON format means you can version control these patterns alongside your security playbooks, continuously refining them as you discover new vulnerability patterns. Bug bounty hunters share pattern collections on GitHub—entire taxonomies of XSS vectors, SSRF indicators, and secrets patterns that represent hundreds of hours of research compressed into JSON files.

The implementation is intentionally minimal. The Go code simply reads environment variables for the patterns directory, scans for available JSON files, parses the requested pattern, and uses exec.Command to spawn the search process. There's no daemon, no index, no background processing. This simplicity is architectural—gf is designed to stay out of your way and leverage the decades of optimization in grep and its modern alternatives.

Gotcha

The minimalism that makes gf elegant also defines its boundaries. Pattern files have no parameter support, which means you can't create a generic "find function calls" pattern and specify the function name at search time. If you need to search for both exec() and system() calls separately, you either put both in one pattern (matching everything) or maintain two separate pattern files. There's no pattern composition or inheritance, so similar patterns require duplication.

The ecosystem is fragmented. While several developers have shared pattern collections on GitHub, there's no canonical repository or package manager for community patterns. You're manually copying JSON files from various sources, with no versioning or dependency management. The pattern format also lacks documentation fields—six months later, you might not remember why you created a specific pattern or what edge cases it covers. You end up maintaining a separate README to document your pattern library.

Engine compatibility can be surprising. Since gf just shells out to whatever binary you specify, there's no validation that your flags are compatible with the engine. Switch a pattern from grep to ripgrep and your -E flag might not work as expected. The error messages come directly from the underlying tool, which can be cryptic if you've abstracted away which engine is actually running. This is the price of engine-agnostic design—gf treats all search tools as black boxes.

Verdict

Use if: You regularly audit codebases for security issues, perform code reviews with recurring pattern searches, or find yourself scrolling through shell history looking for "that one grep command." It's essential for bug bounty hunters, penetration testers, and security teams who need to run standardized searches across multiple projects. The ability to version control and share pattern libraries makes it invaluable for teams that want to codify their security knowledge. Skip if: You only occasionally grep, work primarily within an IDE with powerful search features, or need parameterized patterns with conditional logic—in those cases, you'll spend more time working around gf's simplicity than benefiting from it. Also skip if you're already heavily invested in tools like semgrep that provide AST-aware pattern matching; gf's text-based approach is fundamentally different and won't replace structured code analysis.

gf: How a 200-Line Go Wrapper Turned grep Into a Security Research Power Tool

gf: How a 200-Line Go Wrapper Turned grep Into a Security Research Power Tool

Hook

Context

Technical Insight

Gotcha

Verdict

// KNOWLEDGE GRAPH

// CODEBASE INTELLIGENCE

Best for

Skip when

[ SIMILAR REPOS ]

gf: How a 200-Line Go Wrapper Turned grep Into a Security Research Power Tool

Hook

Context

Technical Insight

Gotcha

Verdict

// KNOWLEDGE GRAPH

// RELATED

Open Interpreter: Running GPT-4 with Root Access to Your Machine

Accomplish: Why Wrapping OpenCode Instead of Building an Agent Runtime Was the Right Bet

NVIDIA Cosmos: A Case Study in Strategic Repository Deprecation

How Ripgrep Makes Searching 10x Faster Than Grep: A Deep Dive Into Rust-Powered Text Search

Open Interpreter: Running GPT-4 with Root Access to Your Machine

Accomplish: Why Wrapping OpenCode Instead of Building an Agent Runtime Was the Right Bet

NVIDIA Cosmos: A Case Study in Strategic Repository Deprecation

// CODEBASE INTELLIGENCE

Best for

Skip when

[ SIMILAR REPOS ]