
gf: How a 200-Line Go Wrapper Solved grep's Muscle Memory Problem


Hook

Security researchers who found themselves regularly running grep -HnrE '(\$_(POST|GET|COOKIE|REQUEST|SERVER|FILES)|php://(input|stdin))' * now have a tool that means never typing that pattern again.

Context

Anyone who’s spent time auditing codebases, analyzing logs, or hunting for vulnerabilities knows the grep power user’s problem: the more complex your patterns get, the harder they are to type accurately. A misplaced escape character, a forgotten flag, or a typo in a long regex can silently return zero results, leaving no way to tell whether nothing matched or the pattern was simply mistyped.

Tom Hudson (tomnomnom) built gf to solve this specific friction point. Rather than maintaining a text file of patterns to copy-paste, building elaborate bash aliases, or relying on shell history search, gf introduces a simple concept: give your grep patterns memorable names, store them as JSON files, and invoke them like commands. The tool emerged from workflows involving code auditing, analyzing output from meg, and generally dealing with large amounts of data where the same patterns get searched repeatedly.

Technical Insight

[Figure: System architecture (auto-generated). CLI invocation (gf pattern-name) -> pattern parser reads and parses the pattern JSON files in ~/.gf/*.json -> command builder constructs the grep command from flags + pattern -> grep/ag is executed with those flags and pattern -> results are written to stdout. In save mode (--save flag), gf writes a new pattern JSON file instead.]

gf’s architecture is deliberately minimal—it’s a wrapper, not a replacement. At its core, the tool reads JSON pattern files from ~/.gf/, constructs a grep command with the specified flags and patterns, and executes it against your target files or stdin.

In its simplest form, a pattern file is a JSON object with two fields:

{
    "flags": "-HnrE",
    "pattern": "(\\$_(POST|GET|COOKIE|REQUEST|SERVER|FILES)|php://(input|stdin))"
}

When you run gf php-sources, the tool reads ~/.gf/php-sources.json, extracts the flags and pattern, and executes the equivalent of grep -HnrE '(\$_(POST|GET|COOKIE|REQUEST|SERVER|FILES)|php://(input|stdin))' with any additional arguments you provide. The pattern name becomes the interface; the complex regex stays hidden in version-controlled JSON.
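
The lookup-and-build step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not gf's actual Go source: build_command is a hypothetical helper, the pattern directory is passed in explicitly rather than hard-coded to ~/.gf/, and the sketch only assembles the argument list without executing grep.

```python
import json
import shlex
import tempfile
from pathlib import Path


def build_command(pattern_dir, name, extra_args=()):
    """Return the argv list for a named pattern (hypothetical helper)."""
    # The pattern name maps directly onto a file name: <dir>/<name>.json
    spec = json.loads((Path(pattern_dir) / f"{name}.json").read_text())
    argv = ["grep"]
    if spec.get("flags"):
        argv.append(spec["flags"])
    argv.append(spec["pattern"])
    # Anything else given on the command line (files, directories) goes last.
    argv.extend(extra_args)
    return argv


if __name__ == "__main__":
    # Use a throwaway directory so the sketch never touches a real ~/.gf/.
    with tempfile.TemporaryDirectory() as d:
        (Path(d) / "php-sources.json").write_text(json.dumps({
            "flags": "-HnrE",
            "pattern": r"(\$_(POST|GET|COOKIE|REQUEST|SERVER|FILES)|php://(input|stdin))",
        }))
        # shlex.join quotes the regex so the printed line is safe to paste.
        print(shlex.join(build_command(d, "php-sources", ["src/"])))
```

Passing the result to subprocess.run would complete the wrapper. The point of the sketch is that the name-to-command mapping is the whole trick.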

For patterns that logically group multiple searches, gf supports a patterns array instead of a single pattern string:

{
    "flags": "-HnrE",
    "patterns": [
        "\\$_(POST|GET|COOKIE|REQUEST|SERVER|FILES)",
        "php://(input|stdin)"
    ]
}

gf joins the entries with | into a single parenthesized alternation and runs one search, so each alternative stays readable on its own line instead of being buried in one increasingly complex OR clause.
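
A patterns array expresses a logical OR: a line is reported if any entry matches. The sketch below uses Python's re module as a rough stand-in for grep -E (the two regex dialects differ in places) to show that one parenthesized alternation selects the same lines as checking each entry separately. The names combine and matching_lines are illustrative only.

```python
import re

patterns = [
    r"\$_(POST|GET|COOKIE|REQUEST|SERVER|FILES)",
    r"php://(input|stdin)",
]


def combine(patterns):
    """Join several regexes into one parenthesized alternation."""
    return "(" + "|".join(patterns) + ")"


def matching_lines(lines, regex):
    """Return the lines in which the regex matches anywhere."""
    return [line for line in lines if re.search(regex, line)]


lines = [
    "$id = $_GET['id'];",
    "$body = file_get_contents('php://input');",
    "echo 'hello';",
]

# One search with the combined pattern...
combined = matching_lines(lines, combine(patterns))
# ...versus testing every entry on its own and taking the union.
separate = [line for line in lines if any(re.search(p, line) for p in patterns)]

print(combined == separate)
```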

The command-line interface supports saving patterns without manually editing JSON files:

gf -save php-serialized -HnrE '(a:[0-9]+:{|O:[0-9]+:"|s:[0-9]+:")'

This creates ~/.gf/php-serialized.json with the provided flags and pattern, eliminating the friction of file creation during active audit sessions.
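
What a save operation has to do is small enough to sketch. The Python below is an assumption-laden illustration rather than gf's implementation: save_pattern is a hypothetical name, the directory is a parameter, and refusing to overwrite an existing file is a choice made for this sketch.

```python
import json
import tempfile
from pathlib import Path


def save_pattern(pattern_dir, name, flags, pattern):
    """Write <name>.json into pattern_dir and return its path (hypothetical helper)."""
    directory = Path(pattern_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{name}.json"
    # Mode "x" raises FileExistsError instead of clobbering an existing pattern.
    with open(path, "x", encoding="utf-8") as fh:
        # json.dump handles the escaping of quotes and backslashes in the regex.
        json.dump({"flags": flags, "pattern": pattern}, fh, indent=4)
        fh.write("\n")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = save_pattern(d, "php-serialized", "-HnrE",
                            r'(a:[0-9]+:{|O:[0-9]+:"|s:[0-9]+:")')
        print(path.read_text())
```

Letting a JSON encoder do the escaping is the main practical benefit over hand-editing: the double quotes in this pattern would otherwise need manual backslashes.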

Perhaps the most valuable architectural decision is engine pluggability. While grep is the default, you can specify alternative search tools like the silver searcher (ag) by adding an engine field:

{
  "engine": "ag",
  "flags": "-Hanr",
  "pattern": "([^A-Z0-9]|^)(AKIA|A3T|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)[A-Z0-9]{12,}"
}

This means you can optimize specific patterns for speed (the README notes ag is “way faster” on large codebases) without changing your mental model or workflow. The pattern name stays the same; only the execution engine changes. Note that different engines require different flags—ag doesn’t need grep’s -E flag for extended regex, so pattern files need engine-specific flag sets.
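
Engine selection amounts to choosing the first element of the command. A minimal sketch, assuming the engine field simply names the binary and falls back to grep when absent; resolve_engine and build_argv are hypothetical names, and the shortened pattern is for illustration only.

```python
import shutil


def resolve_engine(spec, default="grep"):
    """Pick the search binary named by the pattern file, or the default."""
    return spec.get("engine") or default


def build_argv(spec, targets=()):
    """Assemble the command; the flags are passed through untouched."""
    return [resolve_engine(spec), spec["flags"], spec["pattern"], *targets]


# Same regex, two engines: note that the flag sets differ, not just the binary.
grep_spec = {"flags": "-HnrE", "pattern": "AKIA[A-Z0-9]{12,}"}
ag_spec = {"engine": "ag", "flags": "-Hanr", "pattern": "AKIA[A-Z0-9]{12,}"}

if __name__ == "__main__":
    for spec in (grep_spec, ag_spec):
        argv = build_argv(spec, ["."])
        # shutil.which returns None when the binary is not on PATH.
        status = "on PATH" if shutil.which(argv[0]) else "not found on PATH"
        print(argv, status)
```

Nothing in this sketch translates flags between engines, which is why each pattern file has to carry flags that suit its own engine.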

The shell completion scripts deserve attention because they transform gf from a convenience into a cognitive aid. Instead of remembering pattern names, you type gf <tab> and see your available patterns. This turns complex regex into a browsable menu of search capabilities. The completion scripts work by reading filenames from ~/.gf/ and stripping the .json extension, making pattern discovery a native shell operation.
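
The filename-to-name mapping behind completion is a one-liner. The Python sketch below only mirrors the idea described above (list the directory, drop the .json suffix); the real completion scripts are shell, and pattern_names is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def pattern_names(pattern_dir):
    """Return the sorted pattern names found in pattern_dir."""
    # Path.stem drops the final suffix, so "php-sources.json" -> "php-sources".
    return sorted(p.stem for p in Path(pattern_dir).glob("*.json"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for filename in ("php-sources.json", "aws-keys.json", "notes.txt"):
            (Path(d) / filename).write_text("{}")
        # Only the .json files count as patterns; notes.txt is ignored.
        print(pattern_names(d))
```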

Because patterns are plain JSON files in a directory, you can put ~/.gf/ under version control and share a pattern library across teams or projects. With no proprietary format or database involved, patterns are as portable as the files themselves.

Gotcha

gf’s minimalism is both its strength and its constraint. Installation via go get works if you have Go configured (on recent Go releases, where go get no longer installs binaries, the equivalent is go install github.com/tomnomnom/gf@latest), but getting the completion scripts working requires manually sourcing bash or zsh files in your shell config. For bash, you need to add source ~/path/to/gf-completion.bash to your .bashrc. For zsh, you may need to enable autocompletion first with autoload -U compaudit && compinit, then source gf-completion.zsh. The README also notes that oh-my-zsh users may find gf aliased to git fetch, which means either removing that alias (unalias gf) or giving the gf binary a different name. This manual setup adds documentation and onboarding overhead for teams adopting gf.

Pattern management is entirely filesystem-based. Want to see all your patterns? List files in ~/.gf/. Want to edit one? Open the JSON file in your editor. Want to validate pattern syntax? Run it and see if it works. This works well for straightforward use cases but provides no built-in pattern organization beyond the filesystem itself.
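
Teams sharing a pattern library could add a small check of their own in front of "run it and see". The sketch below is not part of gf. check_pattern_file is a hypothetical helper that only verifies the JSON structure described in this article, and it deliberately does not try to validate the regex itself, since grep, ag, and Python all use different dialects.

```python
import json
import tempfile
from pathlib import Path


def check_pattern_file(path):
    """Return a list of structural problems found in one pattern file."""
    try:
        spec = json.loads(Path(path).read_text())
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(spec, dict):
        return ["top level is not a JSON object"]
    problems = []
    # A usable file needs either a single pattern or a non-empty patterns list.
    if not spec.get("pattern") and not spec.get("patterns"):
        problems.append("no pattern or patterns field")
    if "flags" in spec and not isinstance(spec["flags"], str):
        problems.append("flags is not a string")
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        good = Path(d) / "good.json"
        good.write_text('{"flags": "-HnrE", "pattern": "TODO|FIXME"}')
        bad = Path(d) / "bad.json"
        bad.write_text('{"flags": "-HnrE"}')
        for path in (good, bad):
            print(path.name, check_pattern_file(path))
```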

Engine-specific flag requirements create a maintenance consideration. If you switch from grep to ag for performance, you need to adjust flags accordingly. The README example shows removing the -E flag when switching to ag, noting that “different engines use different flags” and that flags must be updated “in order for ag to successfully run.” This creates a coupling between pattern definitions and execution engines that requires awareness when switching tools.

Verdict

Use gf if you’re a security researcher, penetration tester, or anyone who repeatedly searches codebases for the same complex patterns. It excels when you have a library of patterns you invoke regularly: SQL injection indicators, credential patterns, debug code, framework-specific vulnerabilities. The tool pays for itself when you can type gf php-sources instead of reconstructing a complex grep command from memory. It’s particularly valuable when you can maintain a shared pattern library that multiple team members can use.

Skip gf if you’re mostly using simple grep patterns without complex flags, if you prefer IDE-based search tools with GUI pattern builders, or if you’re not comfortable with manual shell configuration, since the completion scripts require direct editing of .bashrc or .zshrc, which may not suit all workflows.
