
Building an AI Agency from Prompt Templates: Inside Agency-Agents

Hook

A repository containing nothing but markdown files and shell scripts has accumulated over 31,000 stars—more than many production frameworks. The reason? It solved a problem most developers didn’t realize they had: generic AI assistants that lack domain expertise.

Context

The explosion of AI coding assistants in 2023-2024 created a paradox: developers suddenly had powerful language models at their fingertips, but found themselves spending significant time explaining context, defining deliverables, and re-specifying requirements across conversations. Each new chat session meant starting from scratch. Each tool switch meant losing carefully crafted instructions.

The agency-agents repository emerged as a response to this friction. Rather than treating AI assistants as blank slates, it packages specialized expertise into reusable prompt templates—what the creator calls ‘agents.’ Each agent is a markdown file defining a specific role: frontend developer, technical writer, Reddit community manager, or accessibility auditor. These aren’t actual AI models or fine-tuned systems; they’re meticulously crafted system prompts that transform a general-purpose LLM into a specialized consultant. The viral adoption suggests developers were craving exactly this: pre-configured expertise they could activate with a single command rather than prompt iteration.

Technical Insight

[Figure: System architecture (auto-generated). Agent markdown templates (role / workflow / deliverables) feed a template parser that handles the markdown processing. Depending on the selected target tool, the parser emits tool-specific configuration: Cursor rules (.cursorrules files), Aider system prompt files, Claude Projects custom instructions, or Windsurf assistant rules. The configured AI assistant then exhibits the specialized behavior.]

The architecture is deceptively simple but reveals sophisticated prompt engineering patterns. Each agent markdown file follows a consistent structure: personality and mission, specific workflows, concrete deliverables, and communication style. Here’s what a simplified agent definition looks like:

# Frontend Wizard Agent

## Mission
You are a senior frontend developer specializing in React, TypeScript, and modern CSS.
Your goal is to deliver production-ready UI components with accessibility baked in.

## Workflow
1. Clarify requirements and gather design specifications
2. Propose component architecture and state management approach
3. Implement with TypeScript, proper prop typing, and semantic HTML
4. Include accessibility attributes (ARIA labels, keyboard navigation)
5. Provide usage examples and edge case handling

## Deliverables
- Fully typed React components
- Storybook stories demonstrating variants
- Unit tests covering key interactions
- Performance considerations documented

## Communication Style
Direct and pragmatic. Ask clarifying questions upfront. Explain tradeoffs when proposing solutions.

The magic happens in the installation scripts. The repository includes shell scripts that convert these markdown templates into format-specific configuration files for different AI tools. For Cursor, it generates .cursorrules files. For Claude, it creates custom project instructions. For Aider, it produces system prompt files. Here’s the core conversion logic:

#!/bin/bash
# Simplified version of the installation script

AGENT_NAME=$1
TOOL=$2
AGENT_FILE="agents/${AGENT_NAME}.md"

# Fail fast if the requested agent template doesn't exist
if [ ! -f "$AGENT_FILE" ]; then
    echo "Unknown agent: ${AGENT_NAME}" >&2
    exit 1
fi

if [ "$TOOL" == "cursor" ]; then
    TARGET="$HOME/.cursor/rules/${AGENT_NAME}.cursorrules"
    mkdir -p "$(dirname "$TARGET")"
    cp "$AGENT_FILE" "$TARGET"
elif [ "$TOOL" == "aider" ]; then
    TARGET="$HOME/.aider/prompts/${AGENT_NAME}.txt"
    mkdir -p "$(dirname "$TARGET")"
    # Strip markdown heading markers, keep content
    sed 's/^# //g; s/^## //g' "$AGENT_FILE" > "$TARGET"
elif [ "$TOOL" == "claude" ]; then
    echo "Copy the following to Claude Projects > Custom Instructions:"
    cat "$AGENT_FILE"
else
    echo "Unsupported tool: ${TOOL}" >&2
    exit 1
fi

echo "Agent ${AGENT_NAME} installed for ${TOOL}"
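
The markdown-stripping step is easy to verify in isolation. This self-contained sketch (the agent content here is a made-up fragment, not a file from the repository) shows a variant of the Aider branch that simply strips heading markers so the prompt reads as plain text:

```shell
#!/bin/sh
# Write a tiny stand-in agent template (illustrative content only)
cat > /tmp/agent-demo.md <<'EOF'
# Frontend Wizard Agent
## Mission
You are a senior frontend developer.
EOF

# Same idea as the Aider branch: drop markdown heading markers
sed 's/^# //g; s/^## //g' /tmp/agent-demo.md
```

The result is the same content with the `#` and `##` prefixes removed, which is all a plain-text system prompt file needs.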

What makes these templates effective isn’t just the domain knowledge—it’s the structure. Each agent explicitly defines success criteria, which gives the LLM concrete targets. The ‘Frontend Wizard’ doesn’t just write React code; it knows it should deliver TypeScript types, consider accessibility, and document performance implications. This specificity reduces ambiguity and produces more consistent outputs.

The repository also demonstrates prompt engineering anti-patterns to avoid. Early versions of some agents were too verbose, stuffing the system prompt with examples that consumed valuable context window. The current versions are concise but comprehensive, focusing on frameworks and principles rather than exhaustive examples. They tell the LLM how to think about problems rather than what to output verbatim.

Another architectural decision worth highlighting: agents are designed to be composable. You might activate the ‘Technical Writer’ agent when documenting code produced by the ‘Backend API’ agent. This workflow mirrors how actual agencies operate—specialists collaborating on different phases of a project. The repository doesn’t enforce this composition (it’s just markdown files after all), but the consistent structure across agents enables it.
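
Because everything is markdown, that composition is literally concatenation. A sketch, with made-up agent file names and contents standing in for real templates:

```shell
#!/bin/sh
# Compose two agent templates into one system prompt by concatenation.
# The agents/ contents here are fabricated placeholders.
mkdir -p /tmp/compose-demo/agents
printf '# Backend API Agent\n## Mission\nDesign REST endpoints.\n' \
    > /tmp/compose-demo/agents/backend-api.md
printf '# Technical Writer Agent\n## Mission\nDocument the endpoints.\n' \
    > /tmp/compose-demo/agents/technical-writer.md

cat /tmp/compose-demo/agents/backend-api.md \
    /tmp/compose-demo/agents/technical-writer.md \
    > /tmp/compose-demo/combined.md

# Count top-level roles now present in the combined prompt
grep -c '^# ' /tmp/compose-demo/combined.md
```

The consistent Mission/Workflow/Deliverables structure is what makes the result readable to the LLM as two cooperating roles rather than a jumble.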

The shell-based tooling reveals both pragmatism and limitation. Shell scripts work everywhere developers work, require no dependencies, and are easily auditable. But they’re also brittle—different AI tools evolve their configuration formats, and the scripts require manual updates to stay current. Tools such as Windsurf, or newer Cursor releases, may use different directory structures or configuration formats that break the scripts’ assumptions.

Gotcha

The fundamental limitation is that these are sophisticated instructions, not actual capabilities. An agent template can tell Claude to ‘write production-ready Kubernetes manifests,’ but if the underlying LLM doesn’t deeply understand Kubernetes, you’ll get confident-sounding garbage. The agents are only as good as the model executing them—GPT-4 running the ‘Security Auditor’ agent will produce vastly different results than GPT-3.5 with the same prompt.

There’s also no runtime validation or feedback loop. If an agent produces incorrect output, there’s no mechanism to automatically correct course. You’re still relying on the LLM’s ability to follow instructions consistently, which varies based on model temperature, prompt length, and factors outside your control. A traditional software agent might validate its output against a schema or run tests; these prompt-based agents have no such guardrails.
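
For contrast, the kind of guardrail a traditional pipeline would add can be sketched in a few lines of shell. Here the agent output is a fabricated string, and python3 serves as a portable JSON validator (jq would do the same job where available):

```shell
#!/bin/sh
# Sketch: validate agent output against an expected format before using it.
# OUTPUT is fabricated; in practice it would come from the LLM.
OUTPUT='{"component": "Button", "props": ["label", "onClick"]}'

if printf '%s' "$OUTPUT" | python3 -c 'import json,sys; json.load(sys.stdin)' 2>/dev/null; then
    echo "valid: safe to write to disk"
else
    echo "invalid: reject and re-prompt the model"
fi
```

A failed check like this can trigger a retry; a raw prompt template can only hope the model complied.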

The installation experience is rough around the edges. The shell scripts assume standard directory structures that may not match your actual tool configurations. If you’ve customized where Cursor stores rules or use a different Aider setup, you’ll need to manually edit paths. Cross-platform issues (looking at you, Windows with WSL vs PowerShell) can create friction. This isn’t a polished CLI tool with error handling—it’s a collection of utility scripts that expect a relatively standard development environment.

Verdict

Use if: You regularly work with AI coding assistants and find yourself re-explaining the same context across sessions or tools. The agents provide battle-tested prompt structures for common roles (frontend, backend, DevOps, technical writing) that you can activate instantly. They are particularly valuable if your team wants consistent AI assistant behavior: everyone can use the same ‘API Designer’ agent rather than crafting individual prompts. The repository also works well as a prompt engineering learning resource; studying how the agents are structured teaches valuable patterns for crafting your own specialized prompts.

Skip if: Your work requires highly specialized domain knowledge not covered by the existing agents (the repository covers common software development roles but won’t have agents for, say, bioinformatics or financial modeling). Also skip if you need actual runtime guarantees or programmatic control: these are sophisticated prompts, not autonomous agents with tool use, memory, or validation capabilities. If you’re looking for something like LangChain agents with actual function calling and state management, this isn’t it. Finally, if you only use one AI tool and are happy crafting custom instructions manually, the cross-tool compatibility isn’t a selling point.
