Building Plugin Systems That Don't Suck: External Agents and the Process-Based Protocol Pattern

Hook

Go's plugin system is so broken that one of the more extensible Go CLIs on GitHub doesn't use Go plugins at all: it spawns processes and exchanges JSON over stdin and stdout.

Context

The Entire CLI is a Git wrapper that adds time-travel semantics to your repository: checkpoint at will, experiment freely, rewind when things go sideways. It's designed for the AI-assisted development workflow where you're constantly trying speculative changes suggested by Claude, Cursor, or Copilot. But here's the problem: AI coding agents are proliferating faster than any single CLI can integrate them. Each agent has its own interface, its own session management, its own way of tracking what changed and why. Entire CLI needed a way to support arbitrary agents without becoming a monolithic integration nightmare.

The traditional solution would be a plugin system: load shared libraries at runtime, call exported functions, done. But Go's plugin package requires the host and every plugin to be built with the same toolchain version and identical dependency versions, so plugins break across Go releases, and it is only supported on Linux, FreeBSD, and macOS. Even if you solve those problems, you've locked every plugin author into Go and coupled their release cycle to yours. The external-agents repository takes a radically different approach: agents are standalone binaries discovered via PATH that communicate through a documented JSON protocol over stdio. It's the Unix philosophy applied to AI tooling: small programs that do one thing well, composed through a universal interface.

Technical Insight

The protocol is brutally simple: agents are executables named entire-agent-* that accept subcommands and respond with line-delimited JSON. Discovery happens by scanning PATH when a repository's .entire.yml sets external_agents: true. No registration, no manifests, no plugin directories—just convention and filesystem scanning.
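To make the discovery convention concrete, here is a rough sketch of what PATH scanning could look like. It is not taken from the Entire CLI source; the function names agentName and discoverAgents are hypothetical, and the executable-bit check assumes a Unix-like system.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// agentPrefix is the naming convention described above.
const agentPrefix = "entire-agent-"

// agentName extracts the agent name from a file name such as
// "entire-agent-kiro" (or "entire-agent-kiro.exe"). It reports false
// when the file does not follow the convention. Hypothetical helper.
func agentName(file string) (string, bool) {
	if !strings.HasPrefix(file, agentPrefix) {
		return "", false
	}
	name := strings.TrimSuffix(strings.TrimPrefix(file, agentPrefix), ".exe")
	return name, name != ""
}

// discoverAgents walks every directory in pathEnv and returns a map from
// agent name to the first matching executable. Earlier PATH entries win,
// which mirrors how a shell resolves commands.
func discoverAgents(pathEnv string) map[string]string {
	found := make(map[string]string)
	for _, dir := range filepath.SplitList(pathEnv) {
		entries, err := os.ReadDir(dir)
		if err != nil {
			continue // missing or unreadable PATH entries are skipped
		}
		for _, e := range entries {
			name, ok := agentName(e.Name())
			if !ok || e.IsDir() {
				continue
			}
			info, err := e.Info()
			if err != nil || info.Mode()&0o111 == 0 {
				continue // no executable bit set (Unix-only check)
			}
			if _, seen := found[name]; !seen {
				found[name] = filepath.Join(dir, e.Name())
			}
		}
	}
	return found
}

func main() {
	for name, path := range discoverAgents(os.Getenv("PATH")) {
		fmt.Printf("%s -> %s\n", name, path)
	}
}
```

Letting the first PATH entry win is a design choice in this sketch, not something the article's source confirms; a real host might instead warn about duplicates.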

Here's what a minimal agent implementation looks like:

package main

import (
    "bufio"
    "encoding/json"
    "fmt"
    "os"
)

type Request struct {
    Command string          `json:"command"`
    Params  json.RawMessage `json:"params"`
}

type Response struct {
    Status string      `json:"status"`
    Data   interface{} `json:"data,omitempty"`
    Error  string      `json:"error,omitempty"`
}

func main() {
    if len(os.Args) < 2 {
        fmt.Fprintln(os.Stderr, "usage: entire-agent-example <subcommand>")
        os.Exit(1)
    }

    scanner := bufio.NewScanner(os.Stdin)
    encoder := json.NewEncoder(os.Stdout)

    for scanner.Scan() {
        var req Request
        if err := json.Unmarshal(scanner.Bytes(), &req); err != nil {
            encoder.Encode(Response{Status: "error", Error: err.Error()})
            continue
        }

        switch os.Args[1] {
        case "start":
            // Initialize agent session, return session ID
            encoder.Encode(Response{Status: "ok", Data: map[string]string{"session_id": "abc123"}})
        case "commit":
            // Capture what the agent did before Entire commits
            encoder.Encode(Response{Status: "ok"})
        case "transcript":
            // Return conversation history or change log
            encoder.Encode(Response{Status: "ok", Data: []string{"Added user auth", "Fixed SQL injection"}})
        default:
            encoder.Encode(Response{Status: "error", Error: "unknown subcommand"})
        }
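    }

    // Scan returns false on EOF and on read errors; report the latter.
    if err := scanner.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "reading stdin:", err)
        os.Exit(1)
    }
}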

This architecture has profound implications. First, it's language-agnostic: agents can be written in Python, Rust, even shell scripts. The Kiro agent in this repo is Go, but nothing prevents a TypeScript implementation that wraps Cursor's API. Second, versioning is decoupled. Agents ship independently and update on their own cadence, and breaking changes are confined to the JSON schema at the protocol boundary. Third, failure isolation comes for free: an agent crash can't take down the Entire CLI process, and misbehaving agents can be killed without cleanup coordination.
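The failure-isolation point is easiest to see from the host's side. The following is a minimal sketch of how a host process might call an agent with a deadline, assuming the status/data/error envelope from the agent example above. The callAgent function and the entire-agent-example binary name are illustrative, not the Entire CLI's actual code.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"os/exec"
	"time"
)

// Response mirrors the envelope used in the agent example above.
type Response struct {
	Status string          `json:"status"`
	Data   json.RawMessage `json:"data,omitempty"`
	Error  string          `json:"error,omitempty"`
}

// callAgent runs one subcommand of an agent binary, writes a single JSON
// request line to its stdin, and decodes the first JSON value from stdout.
// If the agent does not finish within timeout, the child process is killed.
func callAgent(timeout time.Duration, binary, subcommand string, request any) (Response, error) {
	var resp Response
	payload, err := json.Marshal(request)
	if err != nil {
		return resp, fmt.Errorf("encode request: %w", err)
	}

	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, binary, subcommand)
	cmd.Stdin = bytes.NewReader(append(payload, '\n'))
	var stdout bytes.Buffer
	cmd.Stdout = &stdout

	// A crash, a non-zero exit, or a timeout all surface here as an
	// ordinary error value; the host process itself is unaffected.
	if err := cmd.Run(); err != nil {
		return resp, fmt.Errorf("agent %s %s: %w", binary, subcommand, err)
	}
	if err := json.NewDecoder(&stdout).Decode(&resp); err != nil {
		return resp, fmt.Errorf("decode response: %w", err)
	}
	if resp.Status != "ok" {
		return resp, fmt.Errorf("agent reported error: %s", resp.Error)
	}
	return resp, nil
}

func main() {
	// "entire-agent-example" is a placeholder name; it must be on PATH.
	resp, err := callAgent(5*time.Second, "entire-agent-example", "start",
		map[string]string{"command": "start"})
	if err != nil {
		fmt.Println("agent call failed, host keeps running:", err)
		return
	}
	fmt.Println("agent replied:", string(resp.Data))
}
```

Because the agent lives in its own process, the worst case for the host is an error return, which is the property a shared-library plugin cannot offer.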

The testing strategy is where this design really shines. The repository includes three layers: protocol compliance tests that validate any agent speaks correct JSON regardless of implementation, agent-specific unit tests for business logic, and end-to-end lifecycle tests that orchestrate real binaries in temporary Git repos. The E2E harness is particularly clever—it scans agents/*/mise.toml at test time, builds each agent binary, injects them into PATH, then runs shared scenarios:

// Simplified from e2e/lifecycle_test.go
func TestAgentLifecycle(t *testing.T) {
    agents := discoverAgents(t) // Finds all agents/*/mise.toml
    
    for _, agent := range agents {
        t.Run(agent.Name, func(t *testing.T) {
            // Build agent binary
            binary := buildAgent(t, agent)
            
            // Create isolated test environment
            tmpDir := t.TempDir()
            setupGitRepo(t, tmpDir)
            
            // Inject agent into PATH
            env := appendPath(os.Environ(), filepath.Dir(binary))
            
            // Run Entire CLI with external_agents enabled
            cmd := exec.Command("entire", "checkpoint", "--agent", agent.Name)
            cmd.Dir = tmpDir
            cmd.Env = env
            
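            // Include the CLI output in the failure message; an unused
            // variable would otherwise be a compile error in Go
            output, err := cmd.CombinedOutput()
            require.NoError(t, err, "entire checkpoint failed:\n%s", output)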
            
            // Verify agent hooks were called
            assertTranscriptCaptured(t, tmpDir, agent.Name)
        })
    }
}

This setup means adding a new agent requires only implementing the protocol and dropping a mise.toml in agents/newagent/—CI automatically discovers it, builds it, and runs the full test suite. No manual test registration, no build system changes, no CI configuration updates.
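The manifest-driven discovery step is small enough to sketch. The version below is an assumption about what a helper like discoverAgents might do internally, using only the agents/*/mise.toml layout mentioned above; listAgents and agentFromManifest are hypothetical names.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// agentFromManifest maps a manifest path such as "agents/kiro/mise.toml"
// to the agent name "kiro" by taking the manifest's parent directory.
func agentFromManifest(manifest string) string {
	return filepath.Base(filepath.Dir(manifest))
}

// listAgents returns the name of every directory under root/agents that
// contains a mise.toml. Directories without a manifest are ignored.
func listAgents(root string) ([]string, error) {
	pattern := filepath.Join(root, "agents", "*", "mise.toml")
	manifests, err := filepath.Glob(pattern)
	if err != nil {
		return nil, fmt.Errorf("bad glob pattern %q: %w", pattern, err)
	}
	names := make([]string, 0, len(manifests))
	for _, m := range manifests {
		names = append(names, agentFromManifest(m))
	}
	return names, nil
}

func main() {
	names, err := listAgents(".")
	if err != nil {
		fmt.Println("discovery failed:", err)
		return
	}
	fmt.Println("agents found:", names)
}
```

Using the filesystem as the registry is what removes the manual registration step: a new directory with a manifest is enough for the test loop to pick it up.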

The protocol itself defines lifecycle hooks that mirror Git operations: start when beginning an agent session, commit before Entire checkpoints your work (so the agent can inject metadata about what it changed), stop when the session ends, and transcript for extracting conversation history or change explanations. The transcript capability is the killer feature—Entire CLI can aggregate logs from multiple agents, making it trivial to ask "what did Claude suggest in this branch?" or "show me all changes Cursor made today" without each tool needing custom integration.
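As an illustration of the aggregation idea, here is a sketch that merges transcript responses from several agents into one attributed list. It assumes each agent returns the status/data envelope with a list of strings, as in the example agent earlier; the real protocol's transcript schema may differ, and the Entry type and aggregate function are invented for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Response mirrors the envelope from the example agent.
type Response struct {
	Status string          `json:"status"`
	Data   json.RawMessage `json:"data,omitempty"`
	Error  string          `json:"error,omitempty"`
}

// Entry ties one transcript line to the agent that produced it.
type Entry struct {
	Agent string
	Text  string
}

// aggregate takes raw transcript responses keyed by agent name and
// flattens them into one list ordered by agent name. Agents that report
// an error status or return no data are skipped rather than failing the
// whole run.
func aggregate(raw map[string][]byte) ([]Entry, error) {
	names := make([]string, 0, len(raw))
	for name := range raw {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for stable output

	var entries []Entry
	for _, name := range names {
		var resp Response
		if err := json.Unmarshal(raw[name], &resp); err != nil {
			return nil, fmt.Errorf("agent %s: %w", name, err)
		}
		if resp.Status != "ok" || len(resp.Data) == 0 {
			continue
		}
		var lines []string
		if err := json.Unmarshal(resp.Data, &lines); err != nil {
			return nil, fmt.Errorf("agent %s transcript: %w", name, err)
		}
		for _, line := range lines {
			entries = append(entries, Entry{Agent: name, Text: line})
		}
	}
	return entries, nil
}

func main() {
	// Sample payloads standing in for real agent output.
	entries, err := aggregate(map[string][]byte{
		"kiro": []byte(`{"status":"ok","data":["Added user auth","Fixed SQL injection"]}`),
		"amp":  []byte(`{"status":"ok","data":["Refactored parser"]}`),
	})
	if err != nil {
		fmt.Println("aggregate failed:", err)
		return
	}
	for _, e := range entries {
		fmt.Printf("%s: %s\n", e.Agent, e.Text)
	}
}
```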

Gotcha

The JSON-over-stdio protocol is simple, but that simplicity has costs. There's no multiplexing: if an agent needs to handle concurrent requests, it must manage that internally or spawn multiple processes. There's no cancellation mechanism: if the Entire CLI process dies, agents are orphaned and must detect broken pipes themselves. Error recovery is binary: if parsing fails or the agent hangs, the only option is to kill the process and start over. Long-running agents that maintain complex state (like a persistent language server connection) have to serialize everything through JSON on every request, which is slow and imposes a request-response model even when streaming would be more natural.

The bigger issue is adoption friction. This architecture assumes you're already using Entire CLI, which itself is a niche tool for developers who want Git-based time travel. Then you need to be using an AI coding agent that lacks native Entire integration. Then you need that agent's maintainer to implement the external-agents protocol, or you need to write and maintain a wrapper yourself. The repository includes only two agent implementations (Kiro and Amp), and with 16 GitHub stars, the ecosystem isn't exactly thriving. The protocol also lacks version negotiation—the spec assumes both sides speak the same JSON schema, so breaking changes require flag-day coordination across the Entire CLI and all agent implementations. Finally, PATH-based discovery with no checksum validation means anyone who can drop a binary named entire-agent-malicious in your PATH can intercept all agent traffic. For a tool designed to capture AI coding sessions, that's a non-trivial supply chain risk.
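One way a host could narrow the PATH-hijacking risk is to pin each agent binary to a known SHA-256 digest before running it. Nothing in the source suggests the Entire CLI does this; the sketch below is a hypothetical mitigation, and the allowlist path and digest in main are placeholders.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// checksumMatches reports whether data hashes to the expected SHA-256
// digest, given as lowercase hex.
func checksumMatches(data []byte, wantHex string) bool {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]) == wantHex
}

// verifyAgent reads the binary at path and compares it with a pinned
// digest. A caller would refuse to spawn the agent on any error.
func verifyAgent(path, wantHex string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return fmt.Errorf("read agent binary: %w", err)
	}
	if !checksumMatches(data, wantHex) {
		return fmt.Errorf("checksum mismatch for %s", path)
	}
	return nil
}

func main() {
	// Hypothetical allowlist: path and digest are placeholders.
	pinned := map[string]string{
		"/usr/local/bin/entire-agent-example": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
	}
	for path, want := range pinned {
		if err := verifyAgent(path, want); err != nil {
			fmt.Println("refusing to run:", err)
			continue
		}
		fmt.Println("verified:", path)
	}
}
```

This check still leaves a window between verification and execution, and it trades away some of the zero-configuration appeal of PATH discovery, so treat it as a starting point rather than a complete defense.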

Verdict

Use if: You're already invested in Entire CLI's checkpoint workflow and work with multiple AI coding agents that need unified audit trails or cross-tool analytics. The process-based architecture is genuinely elegant for building language-agnostic plugin systems, and if you're designing similar extensibility mechanisms, this repository is an excellent reference implementation. The three-tier testing strategy alone is worth studying: it's a masterclass in testing integration points without compile-time coupling.

Skip if: You're not using Entire CLI or only work with a single AI agent that has built-in version control (like Cursor or Aider). The protocol overhead isn't worth it for single-tool workflows, and you'll get better UX from native integrations. Also skip if you need high-frequency agent interactions or streaming responses: the JSON-over-stdio design assumes coarse-grained operations like checkpoint/commit hooks, not fine-grained request-response cycles.
