Building a CLI URL Safety Scanner with Google Safe Browsing API in Go

Hook

Google's Safe Browsing API protects over 5 billion devices daily, yet most developers have never integrated it directly. This small Go tool shows you exactly how simple—and how limited—a basic implementation can be.

Context

URL reputation checking is a foundational security control that most developers take for granted. When you see Chrome's red warning page before visiting a phishing site, you're experiencing Google Safe Browsing in action. But what if you need to perform these checks programmatically—in your CI/CD pipeline, as part of a content moderation system, or within a security scanning tool?

The google_safe_browsing_cheker repository tackles this exact need: providing a command-line interface to Google's Safe Browsing API v4. While browser integration is seamless, building custom tooling requires choosing between the Lookup and Update APIs, managing API keys, handling rate limits, and parsing threat type responses. This tool strips away the complexity to create a minimal viable implementation: essentially a proof of concept that demonstrates the core integration pattern without production-grade features like caching, batch processing, or sophisticated error handling.

Technical Insight

At its core, the Safe Browsing API v4 uses a lookup protocol where clients submit URLs and receive threat classifications. The API categorizes threats into types like MALWARE, SOCIAL_ENGINEERING, UNWANTED_SOFTWARE, and POTENTIALLY_HARMFUL_APPLICATION. A basic Go implementation needs three components: API client initialization, request formatting, and response parsing.
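To make the request formatting and response parsing steps concrete, here is a minimal sketch of the JSON shapes used by the v4 Lookup endpoint (`threatMatches:find`). The struct and function names, the client ID, and the sample response are illustrative assumptions, not code from the repository, and the sketch makes no network call.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Body for POST https://safebrowsing.googleapis.com/v4/threatMatches:find?key=API_KEY
type lookupRequest struct {
	Client     clientInfo `json:"client"`
	ThreatInfo threatInfo `json:"threatInfo"`
}

type clientInfo struct {
	ClientID      string `json:"clientId"`
	ClientVersion string `json:"clientVersion"`
}

type threatInfo struct {
	ThreatTypes      []string      `json:"threatTypes"`
	PlatformTypes    []string      `json:"platformTypes"`
	ThreatEntryTypes []string      `json:"threatEntryTypes"`
	ThreatEntries    []threatEntry `json:"threatEntries"`
}

type threatEntry struct {
	URL string `json:"url"`
}

// The "matches" field is absent when no URL matched any list.
type lookupResponse struct {
	Matches []threatMatch `json:"matches"`
}

type threatMatch struct {
	ThreatType      string      `json:"threatType"`
	PlatformType    string      `json:"platformType"`
	ThreatEntryType string      `json:"threatEntryType"`
	Threat          threatEntry `json:"threat"`
}

// buildLookupRequest encodes a request covering the four threat types
// named above. The clientID value is chosen by the caller.
func buildLookupRequest(clientID string, urls ...string) ([]byte, error) {
	entries := make([]threatEntry, 0, len(urls))
	for _, u := range urls {
		entries = append(entries, threatEntry{URL: u})
	}
	return json.Marshal(lookupRequest{
		Client: clientInfo{ClientID: clientID, ClientVersion: "0.1.0"},
		ThreatInfo: threatInfo{
			ThreatTypes:      []string{"MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE", "POTENTIALLY_HARMFUL_APPLICATION"},
			PlatformTypes:    []string{"ANY_PLATFORM"},
			ThreatEntryTypes: []string{"URL"},
			ThreatEntries:    entries,
		},
	})
}

// parseLookupResponse decodes a response body into its matches.
func parseLookupResponse(body []byte) ([]threatMatch, error) {
	var resp lookupResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	return resp.Matches, nil
}

func main() {
	body, err := buildLookupRequest("demo-client", "http://example.com/")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	// Hand-written sample response, for illustration only.
	sample := []byte(`{"matches":[{"threatType":"MALWARE","platformType":"ANY_PLATFORM","threatEntryType":"URL","threat":{"url":"http://bad.example/"}}]}`)
	matches, err := parseLookupResponse(sample)
	if err != nil {
		panic(err)
	}
	for _, m := range matches {
		fmt.Printf("%s flagged as %s\n", m.Threat.URL, m.ThreatType)
	}
}
```

An empty JSON object (`{}`) is the normal "nothing matched" response, which is why the parser treats a missing `matches` field as zero threats rather than an error.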

A common integration pattern in Go is to use the github.com/google/safebrowsing package, which takes your API key, maintains a local threat database through the Update API, and exposes a simple lookup call. Here's what a minimal Safe Browsing check looks like:

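package main

import (
    "fmt"
    "log"
    "os"

    safebrowsing "github.com/google/safebrowsing"
)

// checkURL looks up a single URL and prints any threats found.
func checkURL(apiKey string, urlToCheck string) error {
    // Initialize the Safe Browsing client. DBPath is where the library
    // persists its local threat database between runs.
    sb, err := safebrowsing.NewSafeBrowser(safebrowsing.Config{
        APIKey: apiKey,
        ID:     "your-client-id",
        DBPath: "/tmp/safebrowsing.db",
    })
    if err != nil {
        return fmt.Errorf("failed to create client: %w", err)
    }
    defer sb.Close()

    // LookupURLs returns one slice of threats per input URL.
    threats, err := sb.LookupURLs([]string{urlToCheck})
    if err != nil {
        return fmt.Errorf("lookup failed: %w", err)
    }

    // An empty slice means the URL matched none of the threat lists.
    if len(threats[0]) == 0 {
        fmt.Println("No known threats found")
        return nil
    }
    for _, threat := range threats[0] {
        fmt.Printf("THREAT DETECTED: %s\n", threat.ThreatType)
        fmt.Printf("Platform: %s\n", threat.PlatformType)
        fmt.Printf("Threat Entry: %s\n", threat.ThreatEntryType)
    }
    return nil
}

func main() {
    if len(os.Args) != 2 {
        log.Fatalf("usage: %s <url>", os.Args[0])
    }
    // SAFE_BROWSING_API_KEY is an illustrative variable name,
    // not one the repository documents.
    if err := checkURL(os.Getenv("SAFE_BROWSING_API_KEY"), os.Args[1]); err != nil {
        log.Fatal(err)
    }
}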

The architecture reveals several important decisions. First, the tool must decide between the Lookup API (real-time checks, simpler but higher quota consumption) and the Update API (download threat lists locally, more complex but more efficient for bulk operations). Most lightweight CLI tools like this one favor the Lookup API for simplicity, accepting the quota limitations.

Second, there's the question of state management. The Safe Browsing protocol supports local database caching to minimize API calls—the client downloads partial hash lists and only queries the full API when local hash prefixes match. This dramatically reduces bandwidth and API quota usage, but requires persistent storage and periodic updates. A minimal CLI tool often skips this optimization, making fresh API calls for every check. This works for ad-hoc testing but becomes problematic at scale.
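The local prefix check described above can be sketched in a few lines. This is a simplified illustration, assuming 4-byte SHA-256 prefixes (the protocol allows 4 to 32 bytes) and an in-memory set in place of a persisted, periodically updated database. The type and function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// prefixLen is the hash prefix length in bytes; 4 is the shortest the
// protocol allows and is assumed here for simplicity.
const prefixLen = 4

// prefixSet stands in for the locally stored threat list.
type prefixSet map[[prefixLen]byte]struct{}

// hashPrefix returns the leading bytes of the SHA-256 hash of a
// canonicalized host/path expression such as "malware.example/".
func hashPrefix(expr string) [prefixLen]byte {
	sum := sha256.Sum256([]byte(expr))
	var p [prefixLen]byte
	copy(p[:], sum[:prefixLen])
	return p
}

// add records the prefix of expr in the local set.
func (s prefixSet) add(expr string) {
	s[hashPrefix(expr)] = struct{}{}
}

// needsFullHashCheck reports whether the prefix of expr is in the local
// set. A hit is only a candidate: the client must still request the full
// hashes from the API to confirm. A miss needs no network call at all.
func (s prefixSet) needsFullHashCheck(expr string) bool {
	_, ok := s[hashPrefix(expr)]
	return ok
}

func main() {
	local := prefixSet{}
	local.add("malware.example/") // stand-in for a downloaded list entry

	for _, expr := range []string{"malware.example/", "benign.example/"} {
		fmt.Printf("%-18s needs full-hash request: %v\n", expr, local.needsFullHashCheck(expr))
	}
}
```

The quota saving comes from the miss path: the vast majority of URLs never match a local prefix, so they never touch the API.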

Third, URL normalization matters more than you'd expect. The Safe Browsing API expects canonicalized URLs following specific rules: lowercasing the hostname, removing tab/CR/LF characters, resolving relative paths, and stripping fragments. A production implementation must handle edge cases like IDN domains, percent-encoding, and redirect chains. Many simple wrappers overlook these details, leading to false negatives when malicious URLs exploit parsing inconsistencies.
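As a rough illustration of the simpler rules, the sketch below strips tab/CR/LF characters, drops the fragment, lowercases the host, and resolves `/./` and `/../` path segments using only the standard library. It is deliberately incomplete: repeated percent-decoding, IDN handling, and unusual IP address forms are not covered, and the function name `canonicalize` is an assumption for this example.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// canonicalize applies a subset of the Safe Browsing canonicalization
// rules. It expects an absolute URL with a scheme.
func canonicalize(raw string) (string, error) {
	// Remove tab, CR, and LF characters anywhere in the input.
	cleaned := strings.NewReplacer("\t", "", "\r", "", "\n", "").Replace(raw)

	u, err := url.Parse(strings.TrimSpace(cleaned))
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", raw, err)
	}
	if u.Host == "" {
		return "", fmt.Errorf("no host in %q (is the scheme missing?)", raw)
	}

	// Fragments are never sent to the server, so they are dropped.
	u.Fragment, u.RawFragment = "", ""
	u.Host = strings.ToLower(u.Host)

	// Resolve "." and ".." segments and collapse repeated slashes.
	p := path.Clean(u.Path)
	if p == "." { // path.Clean returns "." for an empty path
		p = "/"
	}
	// path.Clean drops a trailing slash, which is significant here.
	if strings.HasSuffix(u.Path, "/") && p != "/" {
		p += "/"
	}
	u.Path, u.RawPath = p, ""

	return u.String(), nil
}

func main() {
	inputs := []string{
		"http://EXAMPLE.com/a/./b/../c\t#frag",
		"http://Example.COM",
		"http://example.com/dir//sub/",
	}
	for _, in := range inputs {
		out, err := canonicalize(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %q\n", in, out)
	}
}
```

Two URLs that differ only in these details should canonicalize to the same string, so a check against one form also covers the other.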

The CLI interface itself is typically straightforward: accept a URL as a command-line argument, perform the check, and output results to stdout. The challenge lies in error handling and rate limiting. Google enforces per-project usage quotas on the Safe Browsing API, which is offered for non-commercial use only (the Web Risk API is the commercial counterpart). The current limits for your key are listed in the Google Cloud console, and a robust tool needs exponential backoff retry logic when it hits them:

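import (
    "fmt"
    "strings"
    "time"

    safebrowsing "github.com/google/safebrowsing"
)

// isRateLimitError is a hypothetical helper, not part of the safebrowsing
// package. This sketch assumes rate limiting surfaces as an HTTP 429 in the
// error text; adapt it to the errors your client actually returns.
func isRateLimitError(err error) bool {
    return strings.Contains(err.Error(), "429")
}

// lookupWithRetry retries rate-limited lookups with exponential backoff
// (1s, 2s, 4s, ...). Any other error is returned immediately.
func lookupWithRetry(sb *safebrowsing.SafeBrowser, target string, maxRetries int) ([]safebrowsing.URLThreat, error) {
    var err error
    for attempt := 0; attempt < maxRetries; attempt++ {
        var threats [][]safebrowsing.URLThreat
        threats, err = sb.LookupURLs([]string{target})
        if err == nil {
            return threats[0], nil
        }

        // Only rate-limit errors are worth retrying.
        if !isRateLimitError(err) {
            return nil, err
        }

        // Skip the sleep after the final attempt.
        if attempt < maxRetries-1 {
            time.Sleep(time.Second << attempt)
        }
    }

    return nil, fmt.Errorf("max retries exceeded: %w", err)
}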

Without examining the actual source code of this specific repository, it's difficult to confirm which of these patterns it implements. The lack of documentation is the first red flag—production-grade security tools need clear instructions about API key setup, quota management, and output interpretation. The single-star rating suggests minimal community review, which is concerning for a security tool where edge cases and false negatives have real consequences.

Gotcha

The biggest limitation of minimal Safe Browsing implementations is their inability to handle the API's complexity at scale. Google's quota system is unforgiving: exceed your limits and requests are rejected until the quota resets. Without local caching, every URL check consumes quota, making bulk scanning impractical. If you're checking large volumes of URLs daily, you can exhaust your quota quickly.
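Even without the full Update API machinery, a small verdict cache keeps repeated checks of the same URL from consuming quota. The sketch below is one possible shape, assuming a fixed TTL for simplicity (Lookup API responses carry a per-match `cacheDuration` that a fuller implementation would honor). The `verdictCache` type and the `lookup` callback are hypothetical names for this example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cachedVerdict holds the threat types found for a URL and when the
// entry stops being trusted.
type cachedVerdict struct {
	threats []string
	expires time.Time
}

// verdictCache is an in-memory cache keyed by URL. It is safe for
// concurrent use, though two goroutines may both miss on the same URL.
type verdictCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // replaceable clock
	entries map[string]cachedVerdict
}

func newVerdictCache(ttl time.Duration) *verdictCache {
	return &verdictCache{ttl: ttl, now: time.Now, entries: make(map[string]cachedVerdict)}
}

// check returns the cached verdict for target if it is still fresh, and
// otherwise calls lookup (the real API call) and stores the result.
func (c *verdictCache) check(target string, lookup func(string) ([]string, error)) ([]string, error) {
	c.mu.Lock()
	v, ok := c.entries[target]
	c.mu.Unlock()
	if ok && c.now().Before(v.expires) {
		return v.threats, nil
	}

	threats, err := lookup(target)
	if err != nil {
		return nil, err // errors are not cached
	}

	c.mu.Lock()
	c.entries[target] = cachedVerdict{threats: threats, expires: c.now().Add(c.ttl)}
	c.mu.Unlock()
	return threats, nil
}

func main() {
	apiCalls := 0
	// fakeLookup stands in for a real Safe Browsing request.
	fakeLookup := func(string) ([]string, error) {
		apiCalls++
		return nil, nil
	}

	cache := newVerdictCache(5 * time.Minute)
	for i := 0; i < 3; i++ {
		if _, err := cache.check("http://example.com/", fakeLookup); err != nil {
			panic(err)
		}
	}
	fmt.Println("checks: 3, API calls made:", apiCalls)
}
```

A cache like this trades freshness for quota: a URL added to a threat list during the TTL window will be reported as clean until the entry expires, so the TTL should stay short.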

Another critical issue is the lack of threat context. The API returns threat types (MALWARE, SOCIAL_ENGINEERING, etc.) but doesn't provide detailed intelligence about WHY a URL is flagged, when it was added to the list, or what specific payload it's serving. For security operations, this limited context makes triage difficult. You know a URL is bad, but not whether it's a fresh C2 server or a years-old phishing page.

Finally, there's the false negative problem. Safe Browsing databases are continuously updated, but there's always a window between when a malicious site goes live and when Google indexes it. Sophisticated attackers use short-lived domains, rotating infrastructure faster than threat feeds update. Additionally, the API only checks against Google's lists—it doesn't scan URL content, execute JavaScript, or analyze page behavior. A tool like this gives you a binary safe/unsafe verdict without the nuance that real threat intelligence requires. For anything beyond basic URL filtering, you need supplementary scanning with sandboxing, content analysis, and correlation with other threat feeds.

Verdict

Use if: You need a quick proof-of-concept for Safe Browsing API integration, you're learning Go and want a simple external API project to study, or you're performing one-off checks of a handful of URLs and don't want to use the web interface. It's also useful as a starting point to understand the API before building something more robust.

Skip if: You're building production security tooling, need to scan more than a few dozen URLs daily, require detailed threat intelligence beyond binary safe/unsafe classification, or need reliability guarantees with proper error handling and retry logic. The lack of documentation and minimal community validation make this unsuitable for anything where security accuracy matters. Instead, use Google's official client libraries, a mature alternative like gosb, or a comprehensive threat intelligence platform like VirusTotal that aggregates multiple sources.
