Swark: Auto-Generating Architecture Diagrams by Feeding Your Codebase to GitHub Copilot

Hook

What if documenting your system architecture were as simple as highlighting files and pressing a hotkey? Swark makes the case that LLMs can reverse-engineer your codebase into diagrams faster than you can schedule the architecture sync meeting.

Context

Architecture diagrams rot the moment you commit them. Teams invest hours crafting Lucidchart or draw.io masterpieces, only to watch them become lies as the codebase evolves. Manual diagram maintenance competes with feature work and inevitably loses. Traditional static analysis tools like Doxygen or Sourcetrail generate dependency graphs, but these are mechanical visualizations of imports and function calls. They don't capture conceptual architecture, data flow, or the "why" behind your design decisions.

Swark takes a fundamentally different approach: it treats architecture diagramming as a code comprehension problem solvable by large language models. Rather than building language-specific parsers for TypeScript, Python, Go, and dozens of other languages, it delegates the hard work of understanding code to GitHub Copilot. Point it at your source files, and it generates Mermaid.js diagrams that represent component relationships, data flows, and system boundaries. Because it outputs to Mermaid's text-based format, these diagrams live alongside your code in version control, editable with any text editor.

Technical Insight

Swark's architecture reveals thoughtful decisions about LLM integration and developer experience. The extension operates through four distinct phases that manage the complexity of code-to-diagram transformation.

The file retrieval system uses configurable glob patterns to select source files. You define which files matter through VS Code's settings, filtering by extension or path patterns. This is critical because LLMs have token limits—you can't feed an entire monorepo into a context window. The extension intelligently adjusts file counts based on available token budget, prioritizing files you've explicitly selected or opened recently.
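
The file selection step described above can be sketched roughly as follows. This is a minimal illustration, not Swark's actual code: the `FileFilter` shape, the `globToRegExp` and `selectFiles` helpers, and the small glob subset they support (`**`, `*`, `?`) are all assumptions made for this example.

```typescript
// Hypothetical settings shape; the field names are illustrative,
// not Swark's actual configuration keys.
interface FileFilter {
  extensions: string[];      // e.g. ["ts", "py"]
  excludePatterns: string[]; // simple globs, e.g. ["**/node_modules/**"]
  maxFiles: number;          // hard cap on how many files are kept
}

// Convert a simple glob to a RegExp. Only "**", "*" and "?" are handled.
function globToRegExp(glob: string): RegExp {
  let pattern = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch === "*") {
      if (glob[i + 1] === "*") {
        i++; // consume the second "*"
        if (glob[i + 1] === "/") {
          i++; // "**/" matches zero or more leading directories
          pattern += "(?:.*/)?";
        } else {
          pattern += ".*"; // trailing "**" matches anything, including "/"
        }
      } else {
        pattern += "[^/]*"; // "*" stays within one path segment
      }
    } else if (ch === "?") {
      pattern += "[^/]";
    } else {
      // Escape characters that are special in regular expressions.
      pattern += ch.replace(/[.+^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${pattern}$`);
}

// Keep files with an allowed extension that match no exclude pattern,
// then truncate to the configured maximum.
function selectFiles(paths: string[], filter: FileFilter): string[] {
  const excluded = filter.excludePatterns.map(globToRegExp);
  const allowed = new Set(filter.extensions.map((e) => e.toLowerCase()));
  return paths
    .filter((p) => {
      const dot = p.lastIndexOf(".");
      const ext = dot >= 0 ? p.slice(dot + 1).toLowerCase() : "";
      return allowed.has(ext) && !excluded.some((re) => re.test(p));
    })
    .slice(0, filter.maxFiles);
}
```

Filtering before any LLM call keeps vendored or generated files out of the prompt, which matters more for the token budget than any later trimming.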

Prompt construction is the core step. Swark bundles your selected files with instructions that guide the LLM toward generating valid Mermaid syntax. The prompt must balance specificity ("generate a component diagram showing dependencies") with flexibility (allowing the LLM to infer relationships it discovers in code). Here's an example of the kind of input and output involved:

// Example: Generating a diagram for a simple REST API
// Files selected: server.ts, routes/users.ts, database/connection.ts

// server.ts
import express from 'express';
import { userRoutes } from './routes/users';
import { connectDB } from './database/connection';

const app = express();
connectDB();
app.use('/api/users', userRoutes);

// From files like these, Swark might generate a Mermaid diagram such as (illustrative output):
graph TB
    Server[Express Server]
    UserRoutes[User Routes]
    Database[PostgreSQL Database]
    
    Server --> UserRoutes
    Server --> Database
    UserRoutes --> Database
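
The prompt construction step itself can be sketched along these lines. This is a minimal illustration under stated assumptions, not Swark's actual prompt: the `SourceFile` type, the `buildPrompt` function, and the instruction wording are all invented for this example.

```typescript
// Hypothetical representation of a file that has already been read from disk.
interface SourceFile {
  path: string;
  content: string;
}

// Combine fixed instructions with the selected files into one prompt string.
// The instruction text is an assumption, not the wording Swark uses.
function buildPrompt(files: SourceFile[]): string {
  const instructions = [
    "You are given source files from one project.",
    "Produce a single Mermaid flowchart (graph TB) of its main components and their dependencies.",
    "Use only relationships you can infer from the code. Reply with Mermaid syntax only.",
  ].join("\n");

  // Label each file with its path so the model can refer to it.
  const fileSections = files
    .map((f) => `--- ${f.path} ---\n${f.content.trim()}`)
    .join("\n\n");

  return `${instructions}\n\n${fileSections}`;
}
```

Keeping the instructions fixed and the file sections clearly delimited makes it easier to see which part of the prompt to adjust when the output is not valid Mermaid.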

The LLM invocation phase uses VS Code's Language Model API, the design decision that most distinguishes Swark. Instead of requiring API keys for OpenAI, Anthropic, or other providers, it relies on your existing GitHub Copilot subscription. This helps in two ways: cost (Copilot users already pay for LLM access) and privacy (code never leaves the VS Code → GitHub Copilot trust boundary).

The rendering stage converts LLM responses into visual diagrams using Mermaid.js in a markdown preview pane. Mermaid is a text-based diagramming syntax that renders in browsers, GitHub READMEs, and most documentation tools. This choice means outputs are inherently version-controllable and editable. If the LLM generates a diagram that's 80% correct, you can manually tweak the Mermaid syntax rather than regenerating from scratch.
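
One way to picture the rendering step is as extracting the Mermaid text from the model's reply and wrapping it in a markdown document for preview. The sketch below assumes the reply either contains a fenced `mermaid` block or is a bare diagram starting with `graph` or `flowchart`. The function names `extractMermaid` and `toMarkdownPreview` are hypothetical and not taken from Swark.

```typescript
// Build the three-backtick fence programmatically so this example
// can itself be embedded in a fenced code block.
const FENCE = "`".repeat(3);

// Return the Mermaid source found in an LLM reply, or null if none is found.
function extractMermaid(response: string): string | null {
  const fenced = new RegExp(FENCE + "mermaid\\s*\\n([\\s\\S]*?)" + FENCE);
  const match = fenced.exec(response);
  if (match) {
    return match[1].trim();
  }
  // Fallback: the model replied with a bare diagram and no fence.
  const trimmed = response.trim();
  return /^(graph|flowchart)\s/.test(trimmed) ? trimmed : null;
}

// Wrap Mermaid source in a markdown document that a Mermaid-aware
// markdown preview can render.
function toMarkdownPreview(title: string, mermaid: string): string {
  return `# ${title}\n\n${FENCE}mermaid\n${mermaid}\n${FENCE}\n`;
}
```

Because the result is plain markdown, the file can be committed and hand-edited like any other document in the repository.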

Swark also implements cycle detection for Mermaid graph validation. LLMs occasionally hallucinate invalid syntax or create circular references that break rendering. The extension catches these errors and can retry generation with modified prompts. This error handling transforms what could be a frustrating black-box experience into something usable in real workflows.
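
To make the idea concrete, here is a minimal sketch of cycle detection over a Mermaid flowchart, using a depth-first search on the `A --> B` edges. It assumes "cycle" means a directed cycle between nodes and only parses plain `-->` arrows. Swark's actual checks may differ, and `parseEdges` and `findCycle` are hypothetical names.

```typescript
// Extract [from, to] pairs from lines like "A --> B" or "A[Label] -->|text| B".
function parseEdges(mermaid: string): Array<[string, string]> {
  const edges: Array<[string, string]> = [];
  const edgePattern =
    /^\s*([A-Za-z0-9_]+)(?:\[[^\]]*\])?\s*-->\s*(?:\|[^|]*\|\s*)?([A-Za-z0-9_]+)/;
  for (const line of mermaid.split("\n")) {
    const m = edgePattern.exec(line);
    if (m) {
      edges.push([m[1], m[2]]);
    }
  }
  return edges;
}

// Return one directed cycle as a node path (first node repeated at the end),
// or null if the graph is acyclic.
function findCycle(edges: Array<[string, string]>): string[] | null {
  const adjacency = new Map<string, string[]>();
  for (const [from, to] of edges) {
    if (!adjacency.has(from)) {
      adjacency.set(from, []);
    }
    adjacency.get(from)!.push(to);
  }

  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = [];

  function visit(node: string): string[] | null {
    state.set(node, "visiting");
    stack.push(node);
    for (const next of adjacency.get(node) ?? []) {
      if (state.get(next) === "visiting") {
        // Back edge: the cycle is the stack from `next` onward.
        return [...stack.slice(stack.indexOf(next)), next];
      }
      if (!state.has(next)) {
        const cycle = visit(next);
        if (cycle) {
          return cycle;
        }
      }
    }
    stack.pop();
    state.set(node, "done");
    return null;
  }

  for (const node of adjacency.keys()) {
    if (!state.has(node)) {
      const cycle = visit(node);
      if (cycle) {
        return cycle;
      }
    }
  }
  return null;
}
```

A detected cycle could be reported back to the model in a retry prompt, or resolved locally by dropping one edge, depending on how much manual control you want.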

The token budget management deserves special attention. LLMs like GPT-4 have context limits around 8K-128K tokens depending on the model. A single large TypeScript file might consume 5,000 tokens. Swark automatically calculates available budget and adjusts which files get included in the prompt. If you select 50 files but only have budget for 10, it intelligently samples based on heuristics like file size, import frequency, and recency. This automatic scaling prevents the most common LLM integration failure mode: context overflow errors.
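
The budgeting idea can be sketched as a greedy selection under a token limit. This is an illustration under stated assumptions, not Swark's algorithm: the four-characters-per-token estimate is a rough heuristic, smallest-first ordering is just one possible policy, and `estimateTokens` and `fitToBudget` are hypothetical names. In a real extension, the limit and a more accurate count would likely come from the model object's `maxInputTokens` property and `countTokens` method in VS Code's Language Model API.

```typescript
// Hypothetical representation of a file that has already been read from disk.
interface SourceFile {
  path: string;
  content: string;
}

// Rough estimate: about 4 characters per token. This is an assumption;
// real tokenizers vary by model and by content.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Pick as many files as fit into the budget, smallest first.
// `reservedTokens` leaves room for the instructions and the model's reply.
function fitToBudget(
  files: SourceFile[],
  maxInputTokens: number,
  reservedTokens: number
): SourceFile[] {
  let remaining = maxInputTokens - reservedTokens;
  const selected: SourceFile[] = [];

  // Copy before sorting so the caller's array is not mutated.
  const bySize = [...files].sort((a, b) => a.content.length - b.content.length);

  for (const file of bySize) {
    const cost = estimateTokens(file.content);
    if (cost > remaining) {
      break; // every later file is at least as large, so none will fit
    }
    selected.push(file);
    remaining -= cost;
  }
  return selected;
}
```

Smallest-first maximizes the number of files included, at the cost of possibly dropping one large but central file. A policy weighted by import frequency or recency would trade file count for relevance.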

Gotcha

Swark's biggest limitation is its complete dependence on GitHub Copilot's black-box LLM. You have no control over model selection, temperature settings, or fallback strategies. If Copilot generates a nonsensical diagram, your only recourse is regenerating and hoping for better luck. Unlike deterministic static analysis tools, you can't debug why a relationship is missing or incorrectly represented. This non-determinism makes Swark unsuitable for scenarios requiring consistent, auditable outputs.

The quality ceiling is also lower than manual diagramming for complex systems. LLMs excel at understanding common patterns—REST APIs, MVC architectures, microservice boundaries—but struggle with domain-specific abstractions or unconventional designs. If your codebase uses a custom event-driven architecture with bespoke message passing, Swark might generate a generic component diagram that misses the nuance. You'll get something useful for onboarding or documentation drafts, but probably not the polished artifact you'd present to stakeholders. Additionally, telemetry collection is enabled by default (opt-out rather than opt-in), which may concern privacy-conscious developers even though Swark claims to exclude source code from telemetry payloads.

Verdict

Use if: You need quick architecture visualizations for understanding unfamiliar codebases, want to bootstrap documentation for polyglot projects without writing custom parsers, or value iteration speed over diagram perfection. Swark shines for solo developers and small teams already paying for GitHub Copilot who treat diagrams as living drafts rather than canonical artifacts.

Skip if: You require deterministic outputs for compliance or audit purposes, work offline or in air-gapped environments, need pixel-perfect control over diagram layout and styling, or your architecture uses domain-specific patterns that general-purpose LLMs won't understand. For production-grade, maintained architecture documentation, invest in tools like Structurizr or manually authored Mermaid diagrams instead.
