EarlyBird: American Express's Pattern-Based Secret Scanner for Self-Hosted Security

Hook

American Express open-sourced its internal secret scanner: a self-hosted tool that runs on your own machine and keeps all of its configuration in your home directory.

Context

Hardcoded credentials are the software equivalent of hiding your house key under the doormat. The MITRE Corporation catalogs this weakness as CWE-798, and it has been behind many breaches, from leaked AWS keys to API tokens committed to public GitHub repositories. Traditional defenses rely on manual code review or on third-party SaaS platforms that scan your proprietary code on their infrastructure. American Express needed something different: a self-hosted tool that could run locally, integrate into pre-commit workflows, and give security teams complete control over detection rules.

EarlyBird emerged from this requirement. Written in Go, it scans source code repositories for cleartext passwords, PII, outdated cryptographic methods, and key files, using a pattern-matching approach with configurable modules and labels. It can analyze local directories and remote git repositories, or run as a REST API server. It represents a fundamentally different philosophy: security scanning as infrastructure you control, not a service you subscribe to.

Technical Insight

[Figure: System architecture (auto-generated). Sources: Local Directory, Remote Git Repo, Pre-commit Hook, REST API. Components: CLI Entry Point, Config Loader (~/.go-earlybird), File Scanner, Rule Engine (Modules + Labels), Pattern Matcher, Results Processor, Output Report. Edge labels: trigger; scan path/git URL; rules, ignores, FP lists; file content; CWE patterns; findings; filtered results.]

EarlyBird’s architecture centers on two core concepts: modules and labels. Modules define what to look for: specific patterns, file types, or code constructs. Labels categorize findings by severity and type. When you run the install script, it creates a .go-earlybird directory in your home folder containing all configuration files. Because rules, false positive lists, and ignore patterns live alongside your user profile rather than inside any one repository, they are shared across projects and persist across sessions.
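To make the module and label split concrete, here is a minimal Python sketch of how the two concepts could be modeled. It is not EarlyBird's code or configuration schema: the class names, field names, the keyword-based labeling, and the sample pattern are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    """One detection pattern inside a module (hypothetical shape)."""
    pattern: str
    caption: str
    severity: str


@dataclass
class Module:
    """A named group of rules, e.g. everything that inspects file content."""
    name: str
    rules: list[Rule] = field(default_factory=list)


@dataclass
class Label:
    """Tags a finding when any of its keywords appears in the matched line."""
    name: str
    keywords: tuple[str, ...]


def labels_for(line: str, labels: list[Label]) -> list[str]:
    """Return the names of all labels whose keywords occur in the line."""
    lowered = line.lower()
    return [lab.name for lab in labels if any(k in lowered for k in lab.keywords)]


# Illustrative data only; none of this is shipped with the tool.
content = Module(
    name="content",
    rules=[Rule(r"(?i)password\s*=\s*['\"][^'\"]+['\"]", "Hardcoded password", "high")],
)
labels = [Label("database", ("jdbc", "oracle")), Label("cloud", ("aws", "azure"))]

line = 'jdbc_password = "hunter2"'
hits = [r.caption for r in content.rules if re.search(r.pattern, line)]
tags = labels_for(line, labels)
```

In this sketch the module decides whether a line is a finding at all, and the labels only add context to a finding that already exists, which mirrors the division of responsibilities described above.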

The scanning workflow is straightforward. For a local directory scan:

go-earlybird --path=/path/to/your/project

For remote repositories:

go-earlybird --git=https://github.com/americanexpress/earlybird

The tool processes files against its rule engine, matching patterns defined in module configurations. Each module targets specific CWEs—the README documents coverage of CWE-798 (hardcoded credentials), CWE-259 (hardcoded passwords), CWE-321 (hardcoded cryptographic keys), CWE-257 (recoverable password storage), CWE-312 (cleartext storage), CWE-327 (broken cryptography), CWE-338 (weak PRNG), CWE-615 (information exposure through comments), CWE-546 (suspicious comments), and CWE-521 (weak password requirements).
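The matching step can be pictured as a loop over lines and rules. The Python sketch below is an assumption-laden illustration, not the tool's rule engine: the two regular expressions, the `Finding` tuple, and the `scan_text` helper are invented here, and the real patterns live in the tool's module configuration.

```python
import re
from typing import NamedTuple


class Finding(NamedTuple):
    line_no: int
    cwe: str
    caption: str


# Illustrative rules only; each one is tagged with the CWE it is meant to catch.
RULES = [
    ("CWE-798", "Possible hardcoded credential",
     re.compile(r"(?i)\b(api_?key|secret|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]")),
    ("CWE-327", "Weak hash algorithm",
     re.compile(r"(?i)\b(md5|sha1)\s*\(")),
]


def scan_text(text: str) -> list[Finding]:
    """Check every line against every rule and record each match."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for cwe, caption, pattern in RULES:
            if pattern.search(line):
                findings.append(Finding(line_no, cwe, caption))
    return findings


sample = 'api_key = "AKIAIOSFODNN7EXAMPLE"\ndigest = md5(data)\nname = "alice"\n'
results = scan_text(sample)
```

Here `results` holds one finding per offending line, each carrying the CWE identifier of the rule that fired, so a report can be grouped by weakness category.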

The extensibility model is where EarlyBird differentiates itself. You can create custom modules tailored to your organization’s patterns. If your company uses a proprietary API key format, you write a module that matches that specific pattern. If you’re bound by GDPR and need to catch European phone numbers or social security equivalents, you extend the PII detection labels. The documentation explicitly includes guides for creating modules and labels, acknowledging that generic rules may miss organization-specific secrets.
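As an illustration of what a rule for a proprietary key format might involve, the following Python sketch loads a rule from a JSON definition and applies it. The JSON field names, the `acme_live_...` key format, and the `find_custom_keys` helper are all hypothetical; EarlyBird's actual module schema is described in its own guides and may look quite different.

```python
import json
import re

# Hypothetical rule definition; the field names are not EarlyBird's schema.
# The raw string keeps the JSON escapes intact, so "\\b" decodes to the regex \b.
CUSTOM_RULE_JSON = r"""
{
  "caption": "ACME internal API key",
  "pattern": "\\bacme_(live|test)_[0-9a-f]{32}\\b",
  "severity": "critical"
}
"""

rule = json.loads(CUSTOM_RULE_JSON)
matcher = re.compile(rule["pattern"])


def find_custom_keys(line: str) -> list[str]:
    """Return every substring of the line that matches the custom key format."""
    return [m.group(0) for m in matcher.finditer(line)]


key = "acme_live_" + "0123456789abcdef" * 2  # prefix plus 32 hex characters
found = find_custom_keys(f'client = Client("{key}")')
missed = find_custom_keys("acme_live_shortvalue")
```

Anchoring the pattern on a fixed prefix and an exact length keeps it narrow, which is the main reason an organization-specific rule tends to produce fewer false positives than a generic keyword match.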

False positive management is built into the architecture. The tool maintains false positive lists in the configuration directory, allowing teams to mark patterns that trigger rules but aren’t actually vulnerabilities. This matters in real codebases, where identifiers like password_hash or api_key_validator contain sensitive keywords without exposing secrets. The ignore functionality extends to line-level and file-level exclusions, which suits test fixtures or documentation that intentionally shows example credentials.
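The filtering stage can be sketched as a post-processing pass over raw findings. This Python example is a simplified assumption of how such suppression might work, not the tool's implementation: the false positive patterns, the ignore globs, and the `is_suppressed` helper are made up for the example.

```python
import fnmatch
import re
from typing import NamedTuple


class Finding(NamedTuple):
    path: str
    line_no: int
    text: str


# Hypothetical entries; a real deployment would load these from its config files.
FALSE_POSITIVE_PATTERNS = [re.compile(r"password_hash"), re.compile(r"api_key_validator")]
IGNORED_FILES = ["tests/fixtures/*", "docs/*.md"]


def is_suppressed(finding: Finding) -> bool:
    """True when the finding sits in an ignored file or matches a known false positive."""
    if any(fnmatch.fnmatch(finding.path, glob) for glob in IGNORED_FILES):
        return True
    return any(p.search(finding.text) for p in FALSE_POSITIVE_PATTERNS)


raw = [
    Finding("app/auth.py", 12, "password_hash = bcrypt.hashpw(pw, salt)"),
    Finding("tests/fixtures/creds.yaml", 3, 'password: "example-only"'),
    Finding("app/settings.py", 40, 'password = "s3cr3t-prod-value"'),
]
kept = [f for f in raw if not is_suppressed(f)]
```

Only the finding in `app/settings.py` survives: the first is dropped by a false positive pattern and the second by a file-level ignore.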

Deployment flexibility spans multiple modes. As a CLI tool, it fits into CI/CD pipelines with simple shell commands. As a pre-commit hook, it is meant to catch secrets before they enter version control, and the documentation provides explicit hook setup instructions. The README also mentions REST API functionality, suggesting it can operate as a service.
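For the pre-commit mode, a wrapper script could look roughly like the Python sketch below. Only the `--path` flag is taken from the usage shown earlier. The `hook_exit_code` helper is hypothetical, and the idea that the scanner signals findings through a non-zero exit status is an assumption that should be checked against the project's hook setup instructions.

```python
import subprocess
import sys

# --path appears in the usage shown earlier; treating a non-zero exit status
# as "findings present" is an assumption to verify against the hook docs.
SCAN_COMMAND = ["go-earlybird", "--path=."]


def hook_exit_code(command: list[str]) -> int:
    """Run the scanner and return the status the hook should exit with."""
    try:
        completed = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        sys.stderr.write(f"scanner not found: {command[0]}\n")
        return 1  # fail closed so a missing scanner cannot hide secrets
    if completed.returncode != 0:
        # Surface the scanner's own report so the developer sees what was flagged.
        sys.stderr.write(completed.stdout + completed.stderr)
        return 1
    return 0


# In an actual .git/hooks/pre-commit script the last line would be:
#     sys.exit(hook_exit_code(SCAN_COMMAND))
```

Git aborts a commit when the pre-commit hook exits with a non-zero status, so returning 1 here is what would block the commit.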

Gotcha

EarlyBird is pattern-matching based: it relies on predefined rules and regular expressions to identify secrets. This approach is precise when the pattern is known, but it can miss obfuscated or novel secret formats. The GitHub repository shows 758 stars, indicating a smaller community than some alternatives in the space. The repository documentation doesn’t indicate extensive third-party integrations or plugins, so teams may need to build custom integrations for specific workflows. Organizations should evaluate whether the self-hosted, configuration-driven approach fits their security scanning requirements and the resources they have for customization.
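The obfuscation limitation is easy to demonstrate with any regex-based check. The Python sketch below uses an invented password pattern, not one of EarlyBird's rules, and shows the same secret written three ways.

```python
import re

# Illustrative pattern: a password keyword assigned a quoted literal.
PASSWORD_RULE = re.compile(r"(?i)password\s*=\s*['\"][^'\"]+['\"]")

plain = 'password = "hunter2"'
encoded = 'password = base64.b64decode("aHVudGVyMg==").decode()'
joined = 'password = "".join(["hun", "ter2"])'

# One boolean per variant: does the rule flag this line?
detected = [bool(PASSWORD_RULE.search(s)) for s in (plain, encoded, joined)]
```

Only the plain assignment is flagged. The encoded and joined variants carry the same secret but no longer have the shape the pattern expects, which is the trade-off any purely pattern-based scanner accepts.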

Verdict

Use EarlyBird if you require self-hosted secret scanning with no external dependencies, need fine-grained control over detection rules through custom modules, or work in regulated environments where code cannot be sent to third-party services. It suits organizations that have Go expertise, want to tailor the tool to proprietary secret formats, and can invest in writing custom modules. The American Express backing lends credibility, and the CWE-aligned rules map findings to industry-standard weakness categories.

Consider alternatives if you need a richer ecosystem with extensive pre-built integrations, or if you prefer a solution with a larger community. EarlyBird focuses on source code, which makes it a good fit for repository scanning but less suitable for other scanning contexts.
