Gas Town: Git-Backed Persistence for Multi-Agent AI Workflows
Hook
Your AI coding agent just crashed halfway through a complex refactoring. When it restarts, it has zero memory of what it was doing, why, or what it already changed. Gas Town treats this amnesia as an architecture problem, not an operational nuisance.
Context
The explosion of AI coding assistants like GitHub Copilot, Claude, and GPT-4 has created a new orchestration challenge: how do you coordinate multiple AI agents working on the same codebase without them stepping on each other? More critically, how do you prevent catastrophic context loss when an agent crashes, times out, or hits rate limits?
Most agent frameworks treat persistence as an afterthought: session state lives in memory, conversation history gets truncated, and crash recovery means starting from scratch. This works fine for single-shot tasks ("generate a function that parses JSON"), but it falls apart for complex, multi-day workflows where agents need to understand what their predecessors attempted, why certain approaches failed, and what constraints were discovered along the way.
Gas Town takes the opposite approach: it treats Git as the source of truth for agent state, using worktrees for isolated workspaces and structured ledgers for decision history. Every agent action (code changes, task handoffs, error escalations) gets persisted to disk in a queryable format. When an agent crashes, its replacement can literally read its predecessor's notes.
Technical Insight
Gas Town's architecture centers on a three-layer hierarchy that maps cleanly to physical resources. At the top, the Mayor is an AI coordinator that routes work and manages capacity. Below that, Rigs are project containers: each wraps a Git repository and provisions isolated workspaces. Inside each Rig, Polecats are ephemeral worker agents that execute tasks. They work in Hooks, Git worktree-based storage units that persist across agent lifecycles.
The persistence mechanism is where things get interesting. When a Polecat starts work, it doesn't clone the repo again. It creates a Git worktree: a separate working directory, checked out on its own branch, that shares the main repository's object store instead of copying it. This means multiple agents can work on different branches simultaneously without duplicating the entire repo history. When a Polecat finishes (or crashes), its workspace remains intact in the worktree. The next agent assigned to that task can inspect the exact state: uncommitted changes, build artifacts, even the agent's internal notes stored as structured JSONL logs.
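To make the worktree step concrete, here is a minimal Go sketch of how a worker might provision its workspace by shelling out to `git worktree add`. The `worktrees/<task>` layout, the `agent/<task>` branch naming, and the function names are assumptions made for illustration; this is not Gas Town's actual code.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// worktreeArgs builds the arguments for `git worktree add -b <branch> <path>`,
// which creates a new branch and checks it out in its own directory while
// reusing the main repository's object store.
func worktreeArgs(root, task string) []string {
	path := filepath.Join(root, "worktrees", task) // hypothetical directory layout
	branch := "agent/" + task                      // hypothetical branch naming scheme
	return []string{"worktree", "add", "-b", branch, path}
}

// addWorktree runs the command from inside the repository at root and
// includes git's own output in the error if the command fails.
func addWorktree(root, task string) error {
	cmd := exec.Command("git", worktreeArgs(root, task)...)
	cmd.Dir = root
	if out, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("git worktree add: %w: %s", err, out)
	}
	return nil
}
```

Calling `addWorktree("/path/to/repo", "auth-refactor")` would leave a checkout at `worktrees/auth-refactor` that survives the process that created it, which is the property the crash-recovery story depends on.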
Here's what a typical Molecule (task template) looks like in TOML:
[molecule]
name = "refactor-auth"
description = "Extract authentication logic into middleware"
timeout = "2h"
[convoy]
mode = "mountain" # autonomous execution without human approval
bundle_size = 3 # batch up to 3 related commits
[hooks]
workspace = "worktrees/auth-refactor"
ledger = "beads/auth-tasks"
[escalation]
p0_threshold = "compilation_failure"
p1_threshold = "test_coverage_drop > 5%"
deacon_notify = true # alert human on P0
When a Polecat picks up this Molecule, it reads the ledger (stored in Beads, a Git-backed issue tracker built on Dolt) to understand prior attempts. The ledger isn't just a list of tasks—it's a structured log of decisions, trade-offs, and failures. If a previous agent tried to use a JWT library that failed due to version conflicts, that context is queryable. The current agent can run a Seance query (Gas Town's term for querying historical agent sessions) to ask: "Why did the last three attempts at this task fail?"
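The "why did the last three attempts fail?" question can be sketched as a filter over a JSONL ledger (one JSON object per line). The `LedgerEntry` fields and the `recentFailures` helper below are hypothetical; they show the shape of such a query, not Beads' real schema or the Seance interface.

```go
package main

import (
	"encoding/json"
	"strings"
)

// LedgerEntry is a hypothetical record shape, not Beads' actual schema.
type LedgerEntry struct {
	Task    string `json:"task"`
	Agent   string `json:"agent"`
	Outcome string `json:"outcome"` // "success" or "failure" in this sketch
	Reason  string `json:"reason"`
}

// recentFailures scans JSONL text and returns the recorded reasons for the
// last n failed attempts at the given task, oldest first.
func recentFailures(jsonl, task string, n int) ([]string, error) {
	var reasons []string
	for _, line := range strings.Split(jsonl, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue // tolerate blank lines between records
		}
		var e LedgerEntry
		if err := json.Unmarshal([]byte(line), &e); err != nil {
			return nil, err
		}
		if e.Task == task && e.Outcome == "failure" {
			reasons = append(reasons, e.Reason)
		}
	}
	// Keep only the n most recent failures.
	if n >= 0 && len(reasons) > n {
		reasons = reasons[len(reasons)-n:]
	}
	return reasons, nil
}
```

A new agent could call `recentFailures(ledgerText, "auth-refactor", 3)` before choosing an approach, so a recorded "JWT library version conflict" steers it away from repeating the same attempt.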
The monitoring system uses a severity-based escalation model. Witness is the base monitoring layer that logs all agent actions. When something goes wrong, it classifies the issue (P0 for build breaks, P1 for test failures, P2 for warnings) and routes it to Deacon (human alerts) or Dogs (automated retry logic). This prevents agents from silently failing or getting stuck in retry loops.
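The escalation model above can be sketched as a classify-then-route pair. The `Event` shape, the rule that P0 goes to a human while P1 goes to automated retry, and the `maxRetries` cap are all assumptions chosen to illustrate the idea; the source only states that issues are routed to Deacon or Dogs by severity.

```go
package main

// Severity mirrors the P0/P1/P2 levels described in the text.
type Severity int

const (
	P0 Severity = iota // build breaks
	P1                 // test failures
	P2                 // warnings
)

// Event is a hypothetical summary of what the monitoring layer observed.
type Event struct {
	BuildBroken bool
	TestsFailed bool
}

// maxRetries is an assumed cap that stops automated retries from looping.
const maxRetries = 3

// classify picks the most severe level that applies to the event.
func classify(e Event) Severity {
	switch {
	case e.BuildBroken:
		return P0
	case e.TestsFailed:
		return P1
	default:
		return P2
	}
}

// route names the handler for an issue. Assumed policy: P0 always alerts a
// human, P1 is retried automatically until the retry cap is reached and then
// alerts a human, and P2 is only logged.
func route(s Severity, attempts int) string {
	switch {
	case s == P0:
		return "deacon"
	case s == P1 && attempts < maxRetries:
		return "dogs"
	case s == P1:
		return "deacon"
	default:
		return "log"
	}
}
```

The retry cap is the piece that keeps an agent from failing silently forever: once automated retries are exhausted, the issue is promoted to a human alert.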
For cross-instance coordination, Gas Town uses Wasteland, a federated work queue backed by DoltHub (Dolt's cloud service). Multiple Gas Town instances can push and pull work from a shared ledger, enabling distributed teams to run their own agent fleets while coordinating on shared deliverables. The Refinery component acts as a Bors-style merge queue processor—it batches commits from multiple agents, runs CI, and merges only when all checks pass.
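A merge queue that "merges only when all checks pass" can be reduced to a small batching function. The sketch below is a simplification under stated assumptions: `Commit`, `processBatch`, and the `ci` callback are hypothetical names, and the CI run is modeled as a function that returns pass or fail for the whole batch.

```go
package main

// Commit is a hypothetical stand-in for a change proposed by an agent.
type Commit struct {
	ID    string
	Agent string
}

// processBatch takes up to batchSize commits from the front of the queue and
// runs ci on them as a unit. If CI passes, the whole batch is returned as
// merged; if it fails, nothing is merged and the batch is returned as
// rejected. The remaining queue is returned in both cases.
func processBatch(queue []Commit, batchSize int, ci func([]Commit) bool) (merged, rejected, rest []Commit) {
	if batchSize <= 0 || len(queue) == 0 {
		return nil, nil, queue
	}
	if batchSize > len(queue) {
		batchSize = len(queue)
	}
	batch := queue[:batchSize]
	rest = queue[batchSize:]
	if ci(batch) {
		return batch, nil, rest
	}
	return nil, batch, rest
}
```

Bors-style queues commonly go one step further and split a failing batch to isolate the offending commit; that bisection is left out here to keep the sketch short.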
Here's a simplified example of how state persists across agent crashes:
// Polecat picks up work from the ledger.
// beads, hooks, State, and executeStep are illustrative names, not a published API.
task := beads.PullTask("auth-refactor")
workspace := hooks.LoadWorkspace(task.HookPath)
// Check if a previous agent left notes
var state State // zero value means no prior progress
if session := workspace.LoadSession(); session != nil {
    log.Printf("Previous agent stopped at: %s", session.LastCheckpoint)
    log.Printf("Reason: %s", session.TerminationReason)
    // Resume from checkpoint instead of starting over
    state = session.State
}
// Do work, persist progress incrementally
for _, step := range task.Steps {
    if state.Done(step.ID) {
        continue // already completed by a previous agent
    }
    state = executeStep(step, state)
    workspace.SaveCheckpoint(step.ID, state)
}
// If the agent crashes, the checkpoints written so far let the next agent
// pick up mid-stream without re-executing completed steps
The Scheduler component manages capacity by tracking how many Polecats are active per Rig and enforcing limits (default is 20-30 concurrent agents). This prevents resource exhaustion and ensures agents don't compete for locks on the same files. Work gets queued if all slots are full, and the Mayor redistributes tasks based on Rig availability.
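The capacity check described above maps naturally onto a counting semaphore. This is a minimal sketch using a buffered channel, with one `Limiter` assumed per Rig; the type and method names are hypothetical and the real Scheduler may work differently.

```go
package main

// Limiter caps the number of concurrently active workers, using a buffered
// channel as a counting semaphore.
type Limiter struct {
	slots chan struct{}
}

// NewLimiter creates a limiter that allows at most max active workers.
func NewLimiter(max int) *Limiter {
	return &Limiter{slots: make(chan struct{}, max)}
}

// TryAcquire claims a slot without blocking. A false result means all slots
// are taken and the caller should queue the work instead of starting it.
func (l *Limiter) TryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// Release frees a slot. It is a no-op if no slot is currently held, so an
// unmatched call cannot block the caller.
func (l *Limiter) Release() {
	select {
	case <-l.slots:
	default:
	}
}

// Active reports how many slots are currently in use.
func (l *Limiter) Active() int {
	return len(l.slots)
}
```

A coordinator could keep a `map[string]*Limiter` keyed by Rig name, start a worker only when `TryAcquire` succeeds, and push the task onto a queue otherwise.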
What makes this architecture resilient is that everything is backed by Git or Dolt. There's no central database to corrupt, no Redis instance to lose. If your entire Gas Town instance crashes, you can recreate the exact state from the Git repositories and Dolt ledgers. Agents are stateless; all state lives in version-controlled storage.
Gotcha
The dependency stack is brutal. Gas Town requires Go 1.25+, Dolt (a version-controlled SQL database that is MySQL-compatible, so it brings MySQL's conventions with it even though it ships as its own binary), Beads (another layer on Dolt), tmux for session management, Git 2.25+ for worktree support, and sqlite3 for local caching. On macOS, the installation process warns about unsigned binaries and suggests manually blessing executables in System Preferences, which is not exactly a one-click setup. You'll need at least an hour to get everything working, and troubleshooting dependency conflicts (especially Dolt's MySQL quirks) can burn half a day.
The terminology is overwhelming. Mayor, Rigs, Polecats, Hooks, Convoys, Beads, Molecules, Wisps, Wasteland, Seance—the documentation reads like fantasy fiction, and you'll spend your first week just mapping concepts to functionality. Worse, the system is tightly coupled to Claude Code CLI as the primary agent runtime. There's "optional" support for Copilot and Codex, but the architecture clearly assumes Claude's workflow. If you want to swap in a custom agent or use a local model, you're rewriting integration layers. And all of this assumes you're paying for commercial AI subscriptions—there's no free tier that makes sense at scale.
Verdict
Use if: You're managing 4+ concurrent AI agents on multi-week projects where persistence is non-negotiable. If your agents are refactoring legacy codebases, coordinating cross-repo changes, or working on tasks that span multiple days, Gas Town's Git-backed state management justifies the complexity. It's also the right choice if you need federated coordination: multiple teams running their own agent fleets but sharing work queues.
Skip if: You're doing single-agent tasks, prototyping, or want something lightweight. The installation complexity, steep learning curve, and commercial AI dependencies make Gas Town overkill for simple workflows. If you're just exploring agent orchestration or working solo, start with LangGraph or crewAI and graduate to Gas Town only when you're drowning in context loss and coordination failures. This is a power tool for production AI workflows, not a weekend experiment.