Inside Cucumber's Polyglot Architecture: How One Team Coordinates 20+ Repositories Across 15 Languages

Hook

The cucumber/common repository has over 3,000 stars but contains almost no code. Instead, it's the nerve center for one of the most ambitious polyglot open-source projects in existence: maintaining feature-identical BDD testing tools across 15+ programming languages.

Context

Before cucumber/common existed, the Cucumber ecosystem faced a coordination nightmare. With separate implementations in Ruby, Java, JavaScript, Go, and a dozen other languages, bug reports and feature requests were scattered across repositories. A developer finding inconsistent behavior between cucumber-js and cucumber-jvm had nowhere to file an issue that affected both. Worse, architectural decisions that should have been coordinated—like how to represent test results or parse Gherkin syntax—were made in isolation, leading to diverging implementations.

The problem wasn't unique to Cucumber. Any framework attempting true polyglot support faces the same challenge: how do you maintain conceptual consistency across radically different runtime environments while keeping each implementation idiomatic to its language? Most projects solve this by picking a primary language and treating others as second-class citizens. Cucumber took a different approach: create a governance structure that treats all language implementations as equals, with cucumber/common serving as the coordination point for cross-cutting concerns, architectural decisions, and the shared protocols that enable interoperability.

Technical Insight

The cucumber/common repository reveals a sophisticated approach to managing complexity in polyglot projects. Rather than containing implementation code, it functions as a meta-layer coordinating several focused libraries, each replicated across multiple languages. The architecture centers on three key strategies: message-based protocols, language-agnostic specifications, and centralized issue triage.

At the heart of this architecture is the Cucumber Messages protocol—a JSON-based schema that defines how all Cucumber components communicate. Instead of each language implementation creating its own internal data structures, they all serialize to and from a common message format. Here's what a snippet of a Cucumber Message looks like:

{
  "testStepFinished": {
    "testStepId": "step-123",
    "testCaseStartedId": "case-456",
    "testStepResult": {
      "status": "PASSED",
      "duration": {
        "seconds": 0,
        "nanos": 245000000
      }
    },
    "timestamp": {
      "seconds": 1638360000,
      "nanos": 0
    }
  }
}

This protocol-first approach means a formatter written in Ruby can consume test results from cucumber-jvm, or a reporter in Go can process output from cucumber-js. Tooling can therefore be written once and work across the entire ecosystem: a CI dashboard doesn't need separate Java, JavaScript, and Ruby integrations; it only needs to consume Cucumber Messages.
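To make the "write once" claim concrete, here is a minimal sketch of a language-neutral consumer. It assumes the producer emits one JSON message per line and reuses only the field names from the sample above; the `summarize_statuses` helper and the `SAMPLE_STREAM` data are hypothetical and not part of any Cucumber library.

```python
import json
from collections import Counter
from typing import Iterable

def summarize_statuses(lines: Iterable[str]) -> Counter:
    """Count step statuses in a stream of JSON messages (one message per line)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        envelope = json.loads(line)
        finished = envelope.get("testStepFinished")
        if finished is None:
            continue  # skip message types this consumer does not handle
        counts[finished["testStepResult"]["status"]] += 1
    return counts

# Hypothetical sample data; the producer could be any Cucumber implementation.
SAMPLE_STREAM = [
    '{"testStepFinished": {"testStepId": "step-123", "testCaseStartedId": "case-456",'
    ' "testStepResult": {"status": "PASSED", "duration": {"seconds": 0, "nanos": 245000000}},'
    ' "timestamp": {"seconds": 1638360000, "nanos": 0}}}',
    '{"testStepFinished": {"testStepId": "step-124", "testCaseStartedId": "case-456",'
    ' "testStepResult": {"status": "FAILED", "duration": {"seconds": 0, "nanos": 1000000}},'
    ' "timestamp": {"seconds": 1638360001, "nanos": 0}}}',
    '{"someOtherMessage": {"id": "ignored-by-this-consumer"}}',
]

if __name__ == "__main__":
    print(summarize_statuses(SAMPLE_STREAM))
```

The consumer never asks which language produced the stream, which is the property the protocol is meant to provide.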

The second architectural pillar is language-agnostic specifications stored as executable tests. The Gherkin parser, for instance, has a single canonical test suite: Gherkin inputs paired with expected AST outputs defined in the message format. Each language implementation must pass this identical suite. A simplified illustration of the idea:

Feature: Gherkin Parser
  Scenario: Parse a simple feature
    Given a Gherkin document:
      """
      Feature: Hello
        Scenario: World
          Given a step
      """
    When it is parsed
    Then the AST should match:
      """
      {"feature": {"name": "Hello", "children": [...]}}
      """

This test-driven coordination ensures behavioral consistency without dictating implementation details. One implementation could rely on a parser generator while another uses a hand-rolled parser, but both must produce identical output for identical input.
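The shared-suite idea can be sketched as a small golden-file runner. Everything here is an assumption made for illustration: `toy_parse` is a stand-in parser that only recognizes `Feature:` and `Scenario:` lines, and the expected JSON shape is deliberately simplified rather than the real Gherkin AST schema.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical golden cases: (Gherkin source, expected output as JSON text).
GOLDEN_CASES: List[Tuple[str, str]] = [
    (
        "Feature: Hello\n  Scenario: World\n    Given a step\n",
        '{"feature": {"name": "Hello", "scenarios": ["World"]}}',
    ),
]

def toy_parse(source: str) -> Dict:
    """Stand-in parser for the demo; extracts only feature and scenario names."""
    feature: Dict = {"name": "", "scenarios": []}
    for raw_line in source.splitlines():
        line = raw_line.strip()
        if line.startswith("Feature:"):
            feature["name"] = line[len("Feature:"):].strip()
        elif line.startswith("Scenario:"):
            feature["scenarios"].append(line[len("Scenario:"):].strip())
    return {"feature": feature}

def run_conformance(
    parse: Callable[[str], Dict], cases: List[Tuple[str, str]]
) -> List[str]:
    """Run every golden case through `parse` and return a list of mismatches."""
    failures: List[str] = []
    for index, (source, expected_json) in enumerate(cases):
        expected = json.loads(expected_json)
        actual = parse(source)
        if actual != expected:
            failures.append(f"case {index}: expected {expected!r}, got {actual!r}")
    return failures

if __name__ == "__main__":
    print(run_conformance(toy_parse, GOLDEN_CASES) or "all cases passed")
```

Because the runner only compares serialized output, the same golden cases could be checked against a parser written in any language, as long as it emits the agreed format.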

The third strategy is sophisticated issue triage. The cucumber/common repository serves as a first-contact point where maintainers can tag issues with labels like "affects: gherkin", "affects: cucumber-expressions", or "affects: multiple". Issues are then migrated to specific component repositories when appropriate, or remain in common when they represent architectural decisions affecting the whole ecosystem. This prevents duplicate issues across repos and ensures cross-cutting concerns get visibility from the entire maintainer team.

The tag-expressions library exemplifies how this coordination enables sophisticated features. It provides a boolean query language for selecting tests based on tags—essentially a mini programming language that needed identical semantics across all Cucumber implementations. Rather than having each language team independently interpret requirements like "how should @mobile and not @slow be evaluated?", the specification was defined once in cucumber/common, implemented with shared conformance tests, and rolled out consistently.
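To show why a shared definition matters, here is a minimal sketch of an evaluator for expressions such as `@mobile and not @slow`. It is not the tag-expressions library or its API; the `evaluate` function is hypothetical, and the precedence it assumes (`not` binds tighter than `and`, which binds tighter than `or`, with parentheses for grouping) is exactly the kind of rule a shared specification has to pin down.

```python
import re
from typing import List, Optional, Set

# Tokens are parentheses or runs of characters that are not whitespace or parentheses.
TOKEN_PATTERN = re.compile(r"\(|\)|[^\s()]+")

def evaluate(expression: str, tags: Set[str]) -> bool:
    """Evaluate a boolean tag expression against the tags present on a test."""
    tokens: List[str] = TOKEN_PATTERN.findall(expression)
    position = 0

    def peek() -> Optional[str]:
        return tokens[position] if position < len(tokens) else None

    def advance() -> str:
        nonlocal position
        token = tokens[position]
        position += 1
        return token

    def parse_or() -> bool:
        result = parse_and()
        while peek() == "or":
            advance()
            right = parse_and()  # always parse the operand, then combine
            result = result or right
        return result

    def parse_and() -> bool:
        result = parse_not()
        while peek() == "and":
            advance()
            right = parse_not()
            result = result and right
        return result

    def parse_not() -> bool:
        if peek() == "not":
            advance()
            return not parse_not()
        return parse_atom()

    def parse_atom() -> bool:
        token = peek()
        if token is None:
            raise ValueError("unexpected end of expression")
        advance()
        if token == "(":
            result = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            advance()
            return result
        if token in (")", "and", "or"):
            raise ValueError(f"unexpected token {token!r}")
        return token in tags  # a plain tag is true when the test carries it

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected trailing token {peek()!r}")
    return result

if __name__ == "__main__":
    print(evaluate("@mobile and not @slow", {"@mobile"}))
```

If one port treated `@a or @b and @c` as `(@a or @b) and @c`, the same command line would select different tests in different languages, which is the inconsistency the shared conformance tests are meant to catch.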

This architecture pattern—centralized coordination with decentralized implementation—could be applied to any polyglot framework. The key insight is treating protocol design as a first-class architectural concern, not an afterthought. By investing in message schemas and language-agnostic tests upfront, the Cucumber team created a forcing function for consistency that scales across languages and teams.

Gotcha

The biggest limitation of cucumber/common is that it's frequently misunderstood. New users searching for "cucumber github" often land here expecting documentation, getting-started guides, or implementation code. Instead, they find a repository that's primarily organizational infrastructure. The absence of language tags and the minimal README create a confusing first impression for newcomers. If you're trying to actually use Cucumber in your project, this is the wrong repository: you need cucumber-js, cucumber-jvm, cucumber-ruby, or whichever language-specific implementation matches your stack.

More subtly, this coordination model introduces overhead that small projects might not need. The message protocol adds serialization costs, the centralized issue tracking requires maintainer time for triage, and the conformance testing suite means changes to core libraries require updates across 15+ language implementations. For a framework with Cucumber's scale and adoption, these tradeoffs make sense. But if you're building a new polyglot tool, consider whether you actually need this level of coordination or if a primary implementation with community-maintained ports would suffice. The cucumber/common approach is powerful but heavyweight—it's optimized for mature ecosystems where consistency across languages is mission-critical, not for early-stage projects still finding product-market fit.

Verdict

Use if: You're contributing to the Cucumber ecosystem and have an issue that affects multiple components (like proposing a new Gherkin keyword that impacts parsers, formatters, and documentation), you're studying polyglot architecture patterns for your own multi-language framework, or you're building tooling that needs to understand Cucumber's cross-language protocols and want to track architectural discussions.

Skip if: You're trying to use Cucumber in a project (head directly to your language-specific implementation), you want to contribute code rather than coordinate issues (go to the specific component repo like gherkin or cucumber-expressions), or you're researching BDD frameworks and want to evaluate Cucumber's features (check the main documentation sites instead).

This repository is infrastructure for maintainers and ecosystem contributors, not end users.
