Multica: Building Software Teams Where AI Agents Are Permanent Employees
Hook
What if your AI agents didn't just execute tasks, but actually remembered what they learned, claimed ownership of work, and reported blockers during standup, just like human teammates?
Context
The current generation of coding agents suffers from a fundamental identity crisis. Tools like GitHub Copilot and Claude Code are brilliant one-shot assistants, but they behave like stateless mercenaries: you invoke them, they generate code, and little of what they worked out carries over to the next session. There's no continuity across the team, no accumulation of shared knowledge, and no sense of ownership. If you're managing a team that uses three different agent CLIs, you're context-switching between interfaces, manually routing tasks to the right tool, and losing solved problems in the void between sessions.
Multica emerged from a different philosophy: what if agents were treated as first-class citizens in your task management system? Not tools you invoke, but teammates with profiles, persistent skills, and autonomous lifecycles. The platform draws inspiration from the Multics operating system's time-sharing principles, multiplexing human and AI workers so that a small engineering team can take on far more work than its headcount suggests. Instead of building yet another wrapper around GPT-4, Multica creates infrastructure for agent permanence, skill compounding, and work delegation at the organizational level.
Technical Insight
Multica's architecture splits into three layers: a Next.js web dashboard for task assignment, a Go backend with Chi router and WebSocket support for real-time coordination, and local CLI daemons that run on developer machines to execute work. The key design decision is in the daemon: it auto-detects which agent CLIs you have installed (Claude Code, Codex, GitHub Copilot, and so on) and registers them as available "runtimes" with the central server. This creates a vendor-neutral abstraction layer where the platform doesn't care whether you're running Anthropic's or OpenAI's agent underneath.
Here's how a typical task flow works. You assign a ticket to an agent named "Backend-Agent-01" through the web UI. The backend receives the assignment, checks which daemon has the appropriate runtime (say, the Claude Code CLI), and streams the task specification via WebSocket to that daemon. The daemon invokes the local agent CLI, captures stdout/stderr, and streams progress updates back to the server in real time. When the agent completes the task, the solution gets stored in PostgreSQL along with pgvector embeddings. This is where skill compounding happens: the next time a similar task arrives, the system can surface the previous solution as context.
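The routing step in that flow can be sketched in a few lines. This is a minimal illustration, not Multica's actual wire format or scheduler: the `DaemonInfo` shape, the `assign` message type, and the least-busy tie-break are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real protocol is not documented in this article.
interface DaemonInfo {
  id: string;
  runtimes: string[];   // runtime names the daemon reported at registration
  activeTasks: number;  // tasks currently running on that daemon
}

interface AssignMessage {
  type: "assign";
  taskId: string;
  runtime: string;
  specification: string;
}

// Pick the least-busy daemon that reports the required runtime.
function pickDaemon(daemons: DaemonInfo[], runtime: string): DaemonInfo | undefined {
  return daemons
    .filter((d) => d.runtimes.indexOf(runtime) !== -1)
    .sort((a, b) => a.activeTasks - b.activeTasks)[0];
}

// Serialize the assignment that would be sent over the WebSocket.
function buildAssignment(taskId: string, runtime: string, specification: string): string {
  const message: AssignMessage = { type: "assign", taskId, runtime, specification };
  return JSON.stringify(message);
}
```

If no daemon reports the runtime, `pickDaemon` returns `undefined`, which a backend could surface as an unassignable task instead of queueing it silently.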
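The daemon registration logic looks something like the following TypeScript sketch, with runtime construction and capability reporting injected as stand-ins:

    // Simplified daemon runtime detection (illustrative sketch)
    import { exec } from 'node:child_process';
    import { promisify } from 'node:util';

    // exec is callback-based; promisify it so it can be awaited
    const execAsync = promisify(exec);

    interface Task { id: string; requiredRuntime: string; specification: string; }
    interface TaskResult { taskId: string; output: string; }

    // Minimal stand-in so the sketch type-checks; a real runtime would wrap one CLI
    interface AgentRuntime {
      name: string;
      execute(specification: string): AsyncIterable<string>;
      getResult(): TaskResult;
    }

    class AgentDaemon {
      private runtimes = new Map<string, AgentRuntime>();

      constructor(
        private websocket: { send(data: string): void },
        private createRuntime: (name: string) => AgentRuntime,
        private reportCapabilities: (names: string[]) => Promise<void>,
      ) {}

      async detectRuntimes(): Promise<void> {
        // Binary names are illustrative; adjust them to the CLIs you have installed
        const detectors = [
          { name: 'claude-code', command: 'claude --version' },
          { name: 'codex', command: 'codex --version' },
          { name: 'copilot-cli', command: 'github-copilot-cli --version' },
        ];
        for (const detector of detectors) {
          try {
            await execAsync(detector.command);
            this.runtimes.set(detector.name, this.createRuntime(detector.name));
            console.log(`Registered runtime: ${detector.name}`);
          } catch {
            // Command missing or exited non-zero: treat the CLI as unavailable
          }
        }
        // Report available runtimes to backend
        await this.reportCapabilities(Array.from(this.runtimes.keys()));
      }

      async executeTask(task: Task): Promise<TaskResult> {
        const runtime = this.runtimes.get(task.requiredRuntime);
        if (!runtime) throw new Error(`Runtime ${task.requiredRuntime} not available`);

        // Stream progress via WebSocket; send() expects a serialized payload
        for await (const chunk of runtime.execute(task.specification)) {
          this.websocket.send(JSON.stringify({ taskId: task.id, type: 'progress', data: chunk }));
        }
        return runtime.getResult();
      }
    }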
The Squads feature adds a routing layer for team coordination. Instead of manually assigning every task to a specific agent, you create a squad with a designated leader agent. When you assign work to the squad, the leader analyzes the task, breaks it into subtasks, and delegates to squad members based on their accumulated skills and current availability. This mirrors how engineering managers distribute work, except that the manager is also an AI agent. The leader maintains a routing table of member capabilities (tracked via their solution history) and can even re-route work mid-execution if an agent reports blockers.
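To make the delegation idea concrete, here is a minimal sketch of a leader choosing a member for one subtask. It assumes skills can be reduced to simple tags and that availability is a boolean; `SquadMember`, `skillOverlap`, and `delegate` are hypothetical names for this example, and a real leader agent would presumably reason over much richer history.

```typescript
interface SquadMember {
  name: string;
  skills: string[]; // tags derived from past solutions (assumed representation)
  busy: boolean;    // true while working or blocked
}

// Count how many of the subtask's tags the member has handled before.
function skillOverlap(member: SquadMember, tags: string[]): number {
  return tags.filter((t) => member.skills.indexOf(t) !== -1).length;
}

// Choose the available member with the highest overlap; null if nobody is free.
function delegate(members: SquadMember[], tags: string[]): SquadMember | null {
  let best: SquadMember | null = null;
  let bestScore = -1;
  for (const member of members) {
    if (member.busy) continue;
    const score = skillOverlap(member, tags);
    if (score > bestScore) {
      best = member;
      bestScore = score;
    }
  }
  return best;
}
```

Re-routing after a reported blocker then amounts to marking the blocked member as busy and calling `delegate` again with the same tags.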
Skill compounding is implemented through a vector similarity search pipeline. When an agent completes a task, Multica generates embeddings from the task description, solution code, and any documentation produced. These vectors get stored in pgvector alongside the full solution artifacts. For new tasks, the system performs a cosine similarity search to find the top-k most relevant previous solutions, then injects them into the agent's context as examples. Over weeks and months, this builds a team knowledge base that should make agents more effective on domain-specific problems. The underlying models don't change, but the context they receive about your codebase keeps improving.
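The retrieval step can be illustrated in memory. The sketch below ranks stored solutions by cosine similarity and keeps the top k; the `StoredSolution` shape and function names are assumptions for the example, and the tiny two-dimensional vectors stand in for real embeddings.

```typescript
interface StoredSolution {
  id: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored solutions most similar to the query embedding.
function topK(query: number[], solutions: StoredSolution[], k: number): StoredSolution[] {
  return solutions
    .map((solution) => ({ solution, score: cosineSimilarity(query, solution.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.solution);
}
```

In a database-backed setup the same ranking would normally be pushed down to pgvector, whose `<=>` operator computes cosine distance and can be used in an `ORDER BY ... LIMIT k` query.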
The workspace-level multi-tenancy model ensures team isolation. Each workspace gets its own PostgreSQL schema, skill database, and agent roster. This is crucial for organizations running multiple product teams, since you don't want your frontend squad's React patterns polluting your backend squad's database optimization skills. The backend also uses row-level security policies to enforce workspace boundaries at the database level, preventing cross-contamination even if the application logic has bugs.
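For readers unfamiliar with row-level security, the sketch below generates the kind of PostgreSQL statements such a policy involves. It is an assumption-laden illustration, not Multica's schema: the `workspace_id` column, the `app.workspace_id` session setting, and the policy naming are all hypothetical.

```typescript
// Allow only simple lowercase identifiers so the name can be interpolated safely.
function assertIdentifier(name: string): string {
  if (!/^[a-z_][a-z0-9_]*$/.test(name)) {
    throw new Error(`invalid identifier: ${name}`);
  }
  return name;
}

// Build the statements that enable RLS on a table and restrict rows to the
// workspace stored in a per-connection setting (hypothetical: app.workspace_id).
function workspacePolicySql(table: string): string[] {
  const t = assertIdentifier(table);
  return [
    `ALTER TABLE ${t} ENABLE ROW LEVEL SECURITY;`,
    `CREATE POLICY ${t}_workspace_isolation ON ${t} ` +
      `USING (workspace_id = current_setting('app.workspace_id')::uuid);`,
  ];
}
```

With a policy like this, the application sets `app.workspace_id` at the start of each transaction, and queries that forget a `WHERE workspace_id = ...` clause still see only their own rows. Note that in PostgreSQL the table owner bypasses policies unless `FORCE ROW LEVEL SECURITY` is also set.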
Gotcha
The elephant in the room is autonomous execution safety. Multica gives agents the ability to claim tasks and execute code without explicit human approval for each step. The README doesn't address what happens when an agent generates malicious code, enters an infinite loop, or makes incorrect assumptions that cascade into broken features. In production environments, this autonomous execution model requires robust sandboxing (Docker containers, VM isolation) and comprehensive output validation, and the platform currently provides neither out of the box. You're responsible for building those guardrails yourself.
The dependency on external agent CLIs creates a fragile setup experience. Before Multica can do anything useful, you need to install and configure Claude Code, Codex, or Copilot on the machine running the daemon. Each CLI has its own authentication flow, API quota limits, and version compatibility requirements. If Anthropic ships a breaking change to Claude Code's command-line interface, your daemon might silently fail to detect the runtime. The platform would benefit from bundled agent runtimes or at least health check endpoints that verify CLI compatibility. Additionally, the skill compounding system's effectiveness is directly tied to task volume: a new team won't see benefits until it has accumulated dozens of solved problems, which creates a cold-start problem for adoption.
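As one example of such a guardrail, the sketch below builds the argument list for a resource-limited `docker run` invocation. The flags are standard Docker options, but everything else is an assumption for illustration: the `SandboxOptions` shape, the `/workspace` mount point, and the specific limits are not part of Multica.

```typescript
interface SandboxOptions {
  image: string;       // container image to run in (hypothetical)
  hostWorkdir: string; // checkout the sandboxed command may modify
  memory: string;      // e.g. "512m"
  cpus: string;        // e.g. "1"
}

// Build arguments for `docker run` with no network, capped resources, and a
// read-only root filesystem; only the mounted workdir stays writable.
function dockerRunArgs(opts: SandboxOptions, command: string[]): string[] {
  return [
    "run", "--rm",
    "--network", "none",
    "--memory", opts.memory,
    "--cpus", opts.cpus,
    "--pids-limit", "256",
    "--read-only",
    "-v", `${opts.hostWorkdir}:/workspace`,
    "-w", "/workspace",
    opts.image,
  ].concat(command);
}
```

A network-less container like this suits running generated code or test suites. The agent CLI itself has to reach its provider's API, so sandboxing that process would need restricted egress rather than `--network none`.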
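A compatibility health check of the kind suggested above could be as small as the following sketch. It assumes each CLI prints a dotted three-part version somewhere in its `--version` output, which may not hold for every tool; the function names and the `Health` states are hypothetical.

```typescript
type Version = [number, number, number];
type Health = "ok" | "too-old" | "unparseable";

// Extract the first major.minor.patch triple from arbitrary version output.
function parseVersion(output: string): Version | null {
  const match = /(\d+)\.(\d+)\.(\d+)/.exec(output);
  if (!match) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Compare component by component, most significant first.
function isAtLeast(found: Version, minimum: Version): boolean {
  for (let i = 0; i < 3; i++) {
    if (found[i] !== minimum[i]) return found[i] > minimum[i];
  }
  return true;
}

// Classify a runtime so failures are reported instead of silently skipped.
function checkRuntime(versionOutput: string, minimum: Version): Health {
  const found = parseVersion(versionOutput);
  if (!found) return "unparseable";
  return isAtLeast(found, minimum) ? "ok" : "too-old";
}
```

Reporting `too-old` and `unparseable` as distinct states to the backend would turn a silent detection failure into something visible on the dashboard.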
Verdict
Use Multica if you're a small engineering team (2 to 10 developers) already using multiple AI coding assistants and struggling with context-switching overhead, or if you want to experiment with agents as persistent team members rather than disposable tools. The platform shines when you need work delegation via Squads, skill accumulation across similar tasks, and unified orchestration of heterogeneous agent runtimes. It's particularly compelling for teams working on long-term codebases where domain knowledge compounds.
Skip it if you need production-grade safety guarantees for autonomous code execution, prefer the simplicity of invoking agents ad hoc without infrastructure overhead, or aren't prepared to manage the complexity of local daemon deployments and multiple CLI installations. Also avoid it if you're a solo developer or a large enterprise: the former won't benefit from team coordination features, and the latter will hit scalability unknowns given the project's early maturity.