Inside Siderolabs' RFC Process: How Infrastructure Teams Document Breaking Changes

Hook

High-profile RFC repositories tend to collect hundreds of stars and dozens of active proposals. Siderolabs' has three stars, yet it serves as a quiet example of governance-as-code for infrastructure teams who ship breaking changes to production Kubernetes clusters.

Context

Request for Comments (RFC) processes originated in 1969 with ARPANET, and modern software teams have since adapted the format for internal decision-making. When you're building an immutable operating system like Talos Linux that manages production Kubernetes clusters, you can't afford to ship breaking API changes or architectural pivots without documentation. Unlike application code, where you can iterate rapidly, infrastructure decisions can have multi-year lifespans and affect thousands of nodes at once.

Siderolabs maintains this RFC repository as a lightweight governance layer for its Talos Linux ecosystem. While heavyweight projects like Kubernetes use KEPs (Kubernetes Enhancement Proposals) with extensive templates and tracking systems, Siderolabs opts for a minimalist approach: markdown files in a GitHub repository, reviewed through pull requests. This reflects a pragmatic tension in infrastructure engineering: you need enough process to prevent catastrophic mistakes, but not so much that you can't ship improvements at a reasonable velocity.

Technical Insight

The siderolabs/rfcs repository follows a file-based architecture where each proposal exists as a standalone markdown document. Based on common RFC patterns, the structure likely includes a front matter section with metadata (RFC number, status, authors) followed by sections covering motivation, proposal details, alternatives considered, and implementation phases. Pull requests serve as the review mechanism, turning GitHub's existing collaboration features into a decision-making workflow.

A typical RFC submission workflow would look like this:

# Clone the repository
git clone https://github.com/siderolabs/rfcs.git
cd rfcs

# Create a branch for the proposal before changing anything
git checkout -b rfc-0042-new-feature

# Create a new RFC following the numbering convention
# (template.md and the rfcs/ directory are illustrative, not confirmed paths)
cp template.md rfcs/0042-new-feature.md

# Edit your proposal
vim rfcs/0042-new-feature.md

# Commit and push
git add rfcs/0042-new-feature.md
git commit -m "RFC 0042: Proposal for new feature"
git push origin rfc-0042-new-feature

# Open pull request for team review
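
Picking the next number by hand is easy to get wrong when several proposals are in flight. As a minimal sketch, assuming the hypothetical `NNNN-short-title.md` naming used in the workflow above (the repository's real convention is not confirmed here), a small helper could compute it. The names `RFC_FILENAME` and `next_rfc_number` are invented for illustration.

```python
import re
from pathlib import Path

# Assumed naming convention: four digits, a dash, a lowercase slug, ".md".
RFC_FILENAME = re.compile(r"^(\d{4})-[a-z0-9-]+\.md$")


def next_rfc_number(filenames):
    """Return the next free RFC number, zero-padded to four digits."""
    used = []
    for name in filenames:
        match = RFC_FILENAME.match(name)
        if match:
            used.append(int(match.group(1)))
    # Files that do not follow the convention (README.md, etc.) are ignored.
    highest = max(used, default=0)
    return f"{highest + 1:04d}"


def next_rfc_number_in(directory):
    """Same as next_rfc_number, reading file names from a directory."""
    return next_rfc_number(path.name for path in Path(directory).iterdir())
```

Calling `next_rfc_number_in("rfcs")` from the repository root would then suggest the prefix for a new file, under the same assumption about the directory layout.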

The RFC document itself would follow a structure optimized for architectural decision capture:

# RFC 0042: Network Policy Implementation

- Status: Draft
- Authors: @engineering-lead
- Date: 2024-01-15

## Summary
Implement native network policy enforcement in Talos Linux
without requiring Calico or Cilium as external dependencies.

## Motivation
Users deploying edge Kubernetes clusters need minimal
container footprints. Current network policy solutions add
300MB+ to node images and increase attack surface.

## Proposal
Integrate eBPF-based filtering directly into machined,
leveraging kernel 6.x capabilities for packet filtering.

### Implementation Phases
1. eBPF program loader in machined (Sprint 1-2)
2. API extensions for policy CRDs (Sprint 3)
3. CNI plugin modifications (Sprint 4)

## Alternatives Considered
- Continuing to require external CNI plugins
- Implementing using iptables (rejected due to performance)
- Using nftables (incompatible with immutable filesystem)
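
One benefit of a predictable header is that it can be read by tooling. The sketch below parses the title and the `- Key: value` lines from a document shaped like the example above. It assumes that exact layout, which is this article's illustration rather than a confirmed Siderolabs format, and `parse_rfc_header` is a hypothetical helper name.

```python
import re

# Matches bullet lines such as "- Status: Draft" from the example header.
METADATA_LINE = re.compile(r"^-\s*(?P<key>[A-Za-z ]+):\s*(?P<value>.+)$")


def parse_rfc_header(text):
    """Return (title, metadata) for an RFC laid out like the example."""
    title = None
    metadata = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("## "):
            break  # the metadata block ends at the first section heading
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
            continue
        match = METADATA_LINE.match(line)
        if match:
            key = match.group("key").strip().lower()
            metadata[key] = match.group("value").strip()
    return title, metadata
```

Stopping at the first `## ` heading keeps bullet lists in later sections, such as the alternatives list, from being mistaken for metadata.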

What makes this approach effective for infrastructure teams is the permanence of the decision record. Unlike Slack discussions or Confluence documents that become stale, RFCs in version control remain accessible indefinitely. When an engineer encounters confusing code in Talos Linux three years later, they can git blame back to an RFC that explains why a particular design choice was made—including the alternatives that were rejected and the constraints that existed at the time.
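
Tracing code back to a proposal only works if commits mention the RFC they implement. As a rough sketch, assuming commit messages reference proposals in a form like `RFC 0042` (the style used in the commit example earlier, not a documented Siderolabs rule), the numbers can be pulled out of a message with a regular expression. `rfc_numbers_in` is a hypothetical helper.

```python
import re

# Assumed convention: messages mention "RFC 0042", "rfc-42", or similar.
RFC_REFERENCE = re.compile(r"\bRFC[ -]?(\d{1,4})\b", re.IGNORECASE)


def rfc_numbers_in(message):
    """Return the RFC numbers mentioned in a commit message, zero-padded."""
    seen = []
    for raw in RFC_REFERENCE.findall(message):
        number = f"{int(raw):04d}"
        if number not in seen:  # keep first-mention order, drop repeats
            seen.append(number)
    return seen
```

The input could come from `git show -s --format=%B <commit>`, using the commit hash that `git blame` reports for the confusing line.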

The lightweight nature also reduces friction for internal proposals. Siderolabs likely uses this for mid-sized changes that are too significant for a simple pull request but don't warrant formal specification documents. For example, changing the default container runtime, modifying the upgrade protocol, or altering the machine configuration API would all benefit from RFC treatment. The low star count (3) suggests this is primarily a working repository for the core team rather than a community participation mechanism, which is appropriate for a company shipping infrastructure that prioritizes stability over democratic decision-making.

The repository also serves as institutional memory. When team members leave or new engineers join, RFCs provide context that README files and inline comments cannot capture. They document the problem space, stakeholder concerns, performance considerations, and security implications in a structured format. This becomes critical for infrastructure projects where changes have cascading effects across distributed systems.

Gotcha

The primary limitation of the siderolabs/rfcs repository is its opacity to external contributors. With no README, no contribution guidelines, and no visible RFC template, community members attempting to propose changes face significant friction. You'd need to reverse-engineer the expected format from existing RFCs—assuming any are publicly visible. This isn't necessarily a flaw if the repository is intentionally internal-facing, but it creates a barrier for the open-source community that might want to participate in Talos Linux's evolution.

The minimal activity (3 stars, limited discoverability) also suggests that important decisions might be happening elsewhere—in private discussions, company Slack channels, or email threads. If RFCs are created retroactively to document decisions already made, they lose much of their value as collaboration tools. The repository becomes documentation theater rather than genuine governance. Additionally, without clear acceptance criteria or status tracking, it's unclear when an RFC transitions from proposal to accepted design to implemented feature. This ambiguity makes it difficult for external developers to know which RFCs represent the current architectural direction versus abandoned ideas.
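
To make the status ambiguity concrete, here is a minimal sketch of what an explicit lifecycle could look like. The status names and allowed transitions are entirely hypothetical; as far as this article can tell, the repository documents none.

```python
# Hypothetical lifecycle: each status maps to the statuses it may move to.
ALLOWED_TRANSITIONS = {
    "draft": {"accepted", "rejected", "withdrawn"},
    "accepted": {"implemented", "superseded"},
    "implemented": {"superseded"},
    "rejected": set(),
    "withdrawn": set(),
    "superseded": set(),
}


def can_transition(current, target):
    """Return True if an RFC may move from `current` to `target`."""
    allowed = ALLOWED_TRANSITIONS.get(current.lower(), set())
    return target.lower() in allowed
```

A check like this could run in CI against the `Status` line of a changed RFC, so a reader can tell an accepted design from an abandoned idea.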

Verdict

Use if: You're a Siderolabs employee or core contributor who needs to propose architectural changes to Talos Linux, or you're researching how minimal governance models work for infrastructure projects. The repository provides value as a study in lightweight decision documentation for small, focused engineering teams.

Skip if: You're looking for a template to build your own RFC process (look at Rust RFCs instead), you want to understand current Talos Linux architecture (consult the main documentation), or you're hoping to participate in community-driven design discussions.

The repository's value is in its existence as a pattern, not as a tool you'll directly interact with unless you're inside Siderolabs' engineering organization.
