How Minefield Uses Roaring Bitmaps to Query 10,000 SBOMs in Seconds

Hook

Most SBOM tools query dependency graphs like it's 2005—traversing nodes one edge at a time. Minefield treats your entire supply chain as compressed bitmaps, answering "which 5,000 packages depend on log4j?" before you finish typing the query.

Context

Software Bills of Materials (SBOMs) have exploded from compliance checkbox to critical infrastructure. Every container image, every binary artifact, every cloud deployment now ships with a manifest listing its thousands of transitive dependencies. Security teams need to answer urgent questions: Which of our 10,000 services use the vulnerable crypto library? What packages do our top 50 applications have in common? Which dependency appears most frequently across our entire portfolio?

Traditional graph databases handle these queries by walking edges—following pointers from node to node, recursively expanding dependency trees. This works fine for dozens of SBOMs but becomes prohibitively slow at scale. Air-gapped environments face additional constraints: no cloud APIs, no external graph services, no reliance on vendor infrastructure. Minefield emerged from this gap, applying a technique borrowed from search engines and analytics databases: representing graph relationships as compressed bitmaps, where complex traversals become blazing-fast bitwise operations.

Technical Insight

Minefield's core insight is that SBOM queries are fundamentally set operations disguised as graph traversals. When you ask "which packages depend on library X," you're really asking for a set: every package whose dependencies include X. When you ask what two applications have in common, you're asking for the intersection of their two dependency sets. Roaring bitmaps excel at exactly this workload.
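To make the "queries are set operations" idea concrete, here is a minimal sketch that uses an uncompressed bitset as a stand-in for a roaring bitmap. The `Bitset` type, the `dependents` reverse index, and `SharedDependents` are hypothetical names for illustration, not Minefield's code; a real roaring bitmap adds compression on top of the same idea.

```go
package main

// Bitset is an uncompressed stand-in for a roaring bitmap: bit i is set
// when package ID i belongs to the set.
type Bitset []uint64

// Add sets the bit for id, growing the backing slice if needed.
func (b *Bitset) Add(id uint32) {
	word := int(id / 64)
	for len(*b) <= word {
		*b = append(*b, 0)
	}
	(*b)[word] |= 1 << (id % 64)
}

// And returns the intersection of a and b, one 64-bit word at a time.
func And(a, b Bitset) Bitset {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	out := make(Bitset, n)
	for i := 0; i < n; i++ {
		out[i] = a[i] & b[i]
	}
	return out
}

// IDs lists the members of the set in ascending order.
func (b Bitset) IDs() []uint32 {
	var ids []uint32
	for w, word := range b {
		for bit := 0; bit < 64; bit++ {
			if word&(1<<uint(bit)) != 0 {
				ids = append(ids, uint32(w*64+bit))
			}
		}
	}
	return ids
}

// SharedDependents answers "which packages depend on both x and y?"
// using a hypothetical reverse index from package ID to its dependents.
func SharedDependents(dependents map[uint32]Bitset, x, y uint32) []uint32 {
	return And(dependents[x], dependents[y]).IDs()
}
```

The query never walks an edge: it fetches two precomputed sets and combines them with a single pass of word-wise AND.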

Roaring bitmaps compress sets of 32-bit integers by splitting the ID space into chunks of 65,536 values, keyed by the upper 16 bits of each ID, and choosing the most compact encoding per chunk: sorted arrays for sparse data, plain bitmaps for dense data, run-length encoding for long runs of consecutive integers. This hybrid approach delivers both compact storage and fast operations. In Minefield's architecture, each package gets a unique integer ID, and dependencies are stored as bitmap sets. If package 42 depends on packages [1, 5, 99, 100, 101], that relationship fits in a single small container holding five 16-bit values, a few tens of bytes including headers, rather than a pointer-linked adjacency list.
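The per-chunk encoding choice can be sketched in a few lines. This is a simplified illustration, not the roaring library's code: the function names are hypothetical, and the byte counts are the approximate payload sizes commonly described for the format (2 bytes per value for an array, a fixed 8 KiB for a bitmap, 2 bytes plus 4 per run for run-length encoding).

```go
package main

// SplitID breaks a 32-bit package ID into the container key (upper 16 bits)
// and the value stored inside that container (lower 16 bits).
func SplitID(id uint32) (key, low uint16) {
	return uint16(id >> 16), uint16(id)
}

// CountRuns counts maximal runs of consecutive integers in a sorted,
// duplicate-free slice of values.
func CountRuns(sorted []uint16) int {
	runs := 0
	for i, v := range sorted {
		if i == 0 || v != sorted[i-1]+1 {
			runs++
		}
	}
	return runs
}

// CheapestContainer picks the smallest encoding for one chunk based on
// approximate payload sizes (an assumption of this sketch).
func CheapestContainer(cardinality, runs int) string {
	arrayBytes := 2 * cardinality
	bitmapBytes := 8192
	runBytes := 2 + 4*runs
	switch {
	case runBytes < arrayBytes && runBytes < bitmapBytes:
		return "run"
	case arrayBytes <= bitmapBytes:
		return "array"
	default:
		return "bitmap"
	}
}
```

For the example set [1, 5, 99, 100, 101], `CountRuns` finds 3 runs and `CheapestContainer(5, 3)` returns "array": five values cost about 10 bytes as an array versus 14 as runs.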

The ingestion pipeline converts standard SBOM formats (CycloneDX, SPDX) into this bitmap representation. Here's how a simplified query might look in Minefield's architecture:

// Assumes imports "fmt" and "github.com/RoaringBitmap/roaring".
// The cache methods are simplified for illustration.

// Find all packages that depend on "log4j"
log4jID := cache.GetPackageID("log4j")
dependents := cache.GetDependents(log4jID)

// Internally, this is a bitmap retrieval - O(1) lookup
// dependents is a *roaring.Bitmap containing package IDs

// Now find common dependencies between two packages
pkgA := cache.GetDependencies(packageA_ID)
pkgB := cache.GetDependencies(packageB_ID)
common := roaring.And(pkgA, pkgB)

// Bitmap intersection runs in microseconds
// even with millions of dependencies.
// A Bitmap is walked through an iterator, not directly.
it := common.Iterator()
for it.HasNext() {
    depID := it.Next()
    fmt.Println(cache.GetPackageName(depID))
}

The real power emerges with transitive queries. Finding all transitive dependencies traditionally requires recursive graph traversal: O(V+E) work spent chasing pointers one node and one edge at a time. With bitmaps, you expand a whole level of the graph per iteration by computing bitmap unions:

func GetTransitiveDependencies(pkgID uint32, cache *Cache) *roaring.Bitmap {
    result := roaring.New()
    frontier := cache.GetDependencies(pkgID)
    
    for !frontier.IsEmpty() {
        // Add current frontier to results
        result.Or(frontier)
        
        // Find dependencies of all frontier packages
        nextFrontier := roaring.New()
        iter := frontier.Iterator()
        for iter.HasNext() {
            depID := iter.Next()
            nextFrontier.Or(cache.GetDependencies(depID))
        }
        
        // Remove already-seen packages to avoid cycles
        nextFrontier.AndNot(result)
        frontier = nextFrontier
    }
    
    return result
}

This breadth-first expansion handles each dependency level as a single set, merging and deduplicating it with bulk bitmap operations rather than one edge at a time. That makes the benchmark claim of 10,000 SBOMs cached in 30 seconds plausible: bitmap unions and intersections process whole machine words per instruction, and optimized roaring implementations can use SIMD instructions where the CPU supports them.

Minefield's architecture separates ingestion from querying. The server maintains an in-memory cache of bitmap indices, persisted to disk between sessions. When you ingest new SBOMs, the cache rebuilds its indices—a one-time cost that enables unlimited subsequent queries. The query DSL exposes operations like "dependency depth," "shared components," and "most depended upon," all implemented as bitmap primitives.
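The ingest-then-query split can be pictured with a small index builder. This is a sketch under stated assumptions, not Minefield's implementation: the `Index` type and its methods are hypothetical, SBOM parsing is reduced to a package name plus a list of dependency names, and plain Go maps stand in for the roaring bitmaps a real cache would hold.

```go
package main

// Index is a hypothetical in-memory cache: names are interned to dense
// integer IDs, and edges are stored in both directions so that
// "dependencies of X" and "dependents of X" are both direct lookups.
type Index struct {
	ids          map[string]uint32
	names        []string
	Dependencies map[uint32]map[uint32]struct{}
	Dependents   map[uint32]map[uint32]struct{}
}

// NewIndex returns an empty index.
func NewIndex() *Index {
	return &Index{
		ids:          map[string]uint32{},
		Dependencies: map[uint32]map[uint32]struct{}{},
		Dependents:   map[uint32]map[uint32]struct{}{},
	}
}

// ID returns the integer ID for name, assigning the next free one
// the first time the name is seen.
func (ix *Index) ID(name string) uint32 {
	if id, ok := ix.ids[name]; ok {
		return id
	}
	id := uint32(len(ix.names))
	ix.ids[name] = id
	ix.names = append(ix.names, name)
	return id
}

// Name maps an ID back to its package name.
func (ix *Index) Name(id uint32) string {
	return ix.names[id]
}

// Ingest records that pkg depends on every entry in deps, updating the
// forward and reverse index together.
func (ix *Index) Ingest(pkg string, deps []string) {
	p := ix.ID(pkg)
	for _, d := range deps {
		dep := ix.ID(d)
		addEdge(ix.Dependencies, p, dep)
		addEdge(ix.Dependents, dep, p)
	}
}

// addEdge inserts to into the set stored under from.
func addEdge(m map[uint32]map[uint32]struct{}, from, to uint32) {
	if m[from] == nil {
		m[from] = map[uint32]struct{}{}
	}
	m[from][to] = struct{}{}
}
```

Paying the cost of building the reverse index at ingest time is what lets a query such as "who depends on log4j?" become a single lookup afterwards.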

The air-gap design means zero network calls during operation. Everything—SBOM parsing, bitmap indexing, query execution—runs locally. For regulated industries, defense contractors, or critical infrastructure operators who can't route dependency data through external services, this locality is non-negotiable. The bitmap approach also enables surprising queries: "Show me all packages appearing in exactly 5 of my SBOMs" becomes a bitmap cardinality filter, computed in milliseconds across millions of components.
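The "exactly 5 SBOMs" query can be sketched as a cardinality filter. The layout here is an assumption for illustration: each package maps to a bitset in which bit i is set when SBOM i contains that package, and a plain `[]uint64` stands in for a roaring bitmap. `InExactly` and `Cardinality` are hypothetical names.

```go
package main

import (
	"math/bits"
	"sort"
)

// Cardinality counts the set bits across all words, which is the number
// of SBOMs the package appears in.
func Cardinality(words []uint64) int {
	n := 0
	for _, w := range words {
		n += bits.OnesCount64(w)
	}
	return n
}

// InExactly returns, in sorted order, the packages whose membership
// bitset has exactly n bits set.
func InExactly(membership map[string][]uint64, n int) []string {
	var out []string
	for name, words := range membership {
		if Cardinality(words) == n {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}
```

Each package costs one population count per word, so the filter never inspects individual SBOMs.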

Gotcha

The elephant in the room: Minefield is archived and unmaintained. The last commit was months ago, issues are closed without resolution, and no roadmap exists. For production use, this is disqualifying. Security tools require active maintenance—SBOM formats evolve, new vulnerability databases emerge, and bugs in dependency resolution can have compliance implications. Using abandoned software for supply chain security is ironic at best, negligent at worst.

Even setting aside maintenance concerns, the architecture has operational friction. Cache rebuilding after every SBOM ingestion means you can't continuously stream updates—you ingest in batches, rebuild indices, then query. For CI/CD pipelines generating hundreds of SBOMs daily, this batch-oriented workflow creates latency. The project also lacks observability: no metrics export, no query performance logging, no debugging tools for understanding why a particular SBOM failed to parse. You're flying blind when things break, and with no community support, you're debugging alone. The bitmap approach also consumes memory proportional to package cardinality—10,000 SBOMs with unique packages could require gigabytes of RAM for the in-memory cache, limiting deployment options on resource-constrained systems.

Verdict

Use if: You're researching bitmap-based graph architectures, building air-gapped SBOM tooling and need a reference implementation, or prototyping offline dependency analysis where performance matters more than production-readiness. The codebase demonstrates how to apply roaring bitmaps to supply chain problems effectively, and for experimental environments or academic research, the archived status is acceptable.

Skip if: You need production-grade SBOM analysis, require ongoing security updates, expect community support or bug fixes, or operate in regulated environments requiring vendor-backed tooling. The archived repository disqualifies Minefield for any serious use. Instead, adopt Dependency-Track for comprehensive SBOM management, GUAC for graph-based supply chain intelligence with active OpenSSF backing, or build on Syft/Grype for a maintained open-source foundation. Minefield's bitmap innovation deserves attention, but its unmaintained state demands you look elsewhere for actual deployment.