bolt.diy: Building a Browser-Based IDE That Lets LLMs Write Full-Stack Apps
Hook
What if you could prompt Claude to build a React app with a Node backend, watch it execute in your browser without Docker or a server, then switch to a local Llama model mid-conversation to save API costs?
Context
The AI-assisted coding landscape has been dominated by plugins and copilots that autocomplete your code as you type. Tools like GitHub Copilot and Tabnine excel at suggesting the next line or function, but they still require you to architect the application, set up the environment, wire dependencies, and stitch everything together. StackBlitz's commercial product Bolt.new demonstrated a radically different approach: give an LLM a single prompt like "build me a todo app with dark mode" and watch it generate, execute, and iterate on a complete application in real time within your browser.
The problem? Bolt.new locked users into StackBlitz's infrastructure and specific LLM choices. Developers working with local models via Ollama, experimenting with cheaper alternatives like Groq, or concerned about sending proprietary code to third-party APIs had no options. bolt.diy emerged as the open-source answer—a fork that rips out the vendor lock-in while preserving the core magic: a browser-based development environment where AI agents can write code, execute it instantly, see the results, and iterate without ever touching a traditional backend server. It's not just an IDE with AI chat bolted on; it's a complete reimagining of how humans and LLMs collaborate to build software.
Technical Insight
The architectural brilliance of bolt.diy lies in three interconnected systems: provider-agnostic LLM orchestration, browser-native code execution, and intelligent conflict management.
At its core, bolt.diy wraps the Vercel AI SDK to abstract away provider differences. Instead of hardcoding OpenAI's API, it presents a unified interface that routes requests to any of 19+ providers, from Anthropic's Claude to locally hosted Ollama models. Here's roughly how models are registered and a provider is chosen:
// In app/lib/.server/llm/constants.ts
export const MODEL_LIST: ModelInfo[] = [
  {
    name: 'claude-3-5-sonnet-20241022',
    label: 'Claude 3.5 Sonnet',
    provider: 'Anthropic',
    maxTokens: 8000,
  },
  {
    name: 'llama3.1:70b',
    label: 'Llama 3.1 70B (Local)',
    provider: 'Ollama',
    maxTokens: 4096,
  },
];

// The SDK handles the rest
import { createAnthropic } from '@ai-sdk/anthropic';
import { createOllama } from 'ollama-ai-provider';

// `model` is the selected MODEL_LIST entry; `env` holds server-side secrets
const provider = model.provider === 'Anthropic'
  ? createAnthropic({ apiKey: env.ANTHROPIC_API_KEY })
  // ollama-ai-provider expects the base URL to include the /api prefix
  : createOllama({ baseURL: 'http://localhost:11434/api' });
This abstraction means you can start a conversation with GPT-4 for complex architecture decisions, then switch to a local Llama model for basic CRUD operations, all within the same session. The system maintains conversation context across provider switches, though each model has its own token limits, so a long history may not fit in full after switching to a smaller model.
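To make the token-limit caveat concrete, here is a minimal sketch of trimming history before handing it to a smaller model. It is not bolt.diy's code: `TargetModel`, `contextBudget`, `estimateTokens`, and `fitHistory` are hypothetical names, and the four-characters-per-token estimate is a rough assumption rather than a real tokenizer.

```typescript
// Hypothetical shapes for illustration; bolt.diy's real types may differ.
interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
}

interface TargetModel {
  name: string;
  contextBudget: number; // assumed budget for prior messages, in tokens
}

// Crude estimate (about 4 characters per token); an assumption, not a tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the newest messages that fit the model being switched to.
function fitHistory(history: ChatMessage[], model: TargetModel): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk from newest to oldest and stop at the first message that overflows.
  for (const message of history.slice().reverse()) {
    const cost = estimateTokens(message.content);
    if (used + cost > model.contextBudget) break;
    kept.unshift(message); // restore chronological order
    used += cost;
  }
  return kept;
}
```

Switching from a large-context model to a small local one would then send `fitHistory(history, localModel)` rather than the full history. Dropping the oldest messages is only one possible policy; summarizing them is another.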
The second pillar is WebContainer technology, which runs an actual Node.js environment inside the browser using WebAssembly. This isn't a simulation with limited capabilities: it provides a real filesystem, package manager, and runtime. When an LLM emits a command like npm install express && node server.js, that command executes in-browser with no backend proxy. The WebContainer exposes a virtual TCP stack, so your Express server genuinely listens on a (virtualized) port, and the preview pane makes real HTTP requests to it.
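The flow described above (write files, install, start the server, wait for a URL) can be sketched as follows. This is an illustration, not bolt.diy's implementation. `ContainerLike` is a hypothetical interface that mirrors the small subset of the `@webcontainer/api` surface the sketch relies on (`mount`, `spawn`, and the `server-ready` event); in a browser you would pass the instance returned by `WebContainer.boot()`. The `startServer` helper is an invented name.

```typescript
// Hypothetical structural types mirroring a subset of @webcontainer/api.
interface FileTree {
  [name: string]: { file: { contents: string } };
}

interface ProcessLike {
  exit: Promise<number>; // resolves with the process exit code
}

interface ContainerLike {
  mount(tree: FileTree): Promise<void>;
  spawn(command: string, args: string[]): Promise<ProcessLike>;
  on(event: 'server-ready', listener: (port: number, url: string) => void): void;
}

// Mount generated files, install dependencies, start the server,
// and resolve with the URL the preview pane should load.
async function startServer(container: ContainerLike, tree: FileTree): Promise<string> {
  await container.mount(tree);

  const install = await container.spawn('npm', ['install']);
  if ((await install.exit) !== 0) {
    throw new Error('npm install failed');
  }

  // Subscribe before starting the server so the event cannot be missed.
  const ready = new Promise<string>((resolve) => {
    container.on('server-ready', (_port, url) => resolve(url));
  });

  await container.spawn('node', ['server.js']);
  return ready;
}
```

Taking the container as a parameter keeps the sketch independent of the browser, which also makes it easy to exercise with a fake container.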
The conflict management system is where bolt.diy shines compared to naive AI code generators. When an LLM decides to modify src/components/Header.tsx, bolt.diy implements a file locking mechanism:
// Simplified from app/lib/.server/llm/stream-text.ts
const fileLocks = new Map<string, boolean>();

async function applyFileAction(action: FileAction) {
  if (fileLocks.get(action.filePath)) {
    // Queue the change or reject
    throw new Error(`File ${action.filePath} is locked`);
  }
  fileLocks.set(action.filePath, true);
  try {
    // Read the existing content (empty string if the file is new)
    const currentContent = await webcontainer.fs
      .readFile(action.filePath, 'utf-8')
      .catch(() => '');
    const diff = generateDiff(currentContent, action.content);
    await webcontainer.fs.writeFile(action.filePath, action.content);
    // Show diff in UI for user review
    emitDiffEvent(action.filePath, diff);
  } finally {
    // Always release the lock, even if the write fails
    fileLocks.delete(action.filePath);
  }
}
This prevents race conditions when the LLM generates multiple file changes in rapid succession. The diff view lets you see exactly what changed between iterations—critical when debugging why the AI's "fix" broke your authentication flow. Unlike tools that silently overwrite files, bolt.diy surfaces every modification as a reviewable change.
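The lock above rejects a second change to a file that is still being written; its comment mentions queueing as the alternative. A minimal sketch of that alternative chains writes per path so they run one at a time in arrival order. `FileWriteQueue` and `enqueue` are hypothetical names, and nothing here is taken from bolt.diy's source.

```typescript
// Hypothetical helper: serializes async writes per file path instead of rejecting.
class FileWriteQueue {
  // Last pending operation per path. Entries are never removed in this sketch,
  // so the map grows to at most one entry per distinct file.
  private tails = new Map<string, Promise<void>>();

  enqueue(filePath: string, write: () => Promise<void>): Promise<void> {
    const previous = this.tails.get(filePath) ?? Promise.resolve();
    // Start this write only after the previous one for the same path settles.
    const result = previous.then(write);
    // Store a tail that never rejects, so one failed write does not block later ones.
    this.tails.set(filePath, result.catch(() => {}));
    // The caller still sees this write's own success or failure.
    return result;
  }
}
```

Writes to different paths remain independent, so a slow write to one file does not hold up changes to another.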
The prompt engineering happens through a structured artifact system. Instead of raw text responses, LLMs return XML-like structures that bolt.diy parses into actionable commands:
<boltArtifact id="project-setup" type="application">
<boltAction type="file" filePath="package.json">
{"dependencies": {"express": "^4.18.0"}}
</boltAction>
<boltAction type="shell">
npm install
</boltAction>
<boltAction type="file" filePath="server.js">
const express = require('express');
// ... generated code
</boltAction>
</boltArtifact>
This structured output makes the LLM's intent unambiguous: file paths and commands are explicit attributes and tag types, so nothing has to be inferred by regex-parsing markdown code blocks. The system prompt teaches the LLM this format and includes context about the current project structure so it knows which files already exist.
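To show how such a response turns into commands, here is a simplified sketch that extracts actions from a complete artifact string. It makes assumptions: `BoltAction` and `parseActions` are hypothetical names, the input is assumed to be fully received (a production parser would more likely work incrementally on the streamed response), and generated content is assumed not to contain a literal closing boltAction tag.

```typescript
// Hypothetical action shape derived from the artifact example above.
type BoltAction =
  | { type: 'file'; filePath: string; content: string }
  | { type: 'shell'; content: string };

// Extract every <boltAction> element from a complete artifact string.
function parseActions(artifact: string): BoltAction[] {
  const actions: BoltAction[] = [];
  const pattern = /<boltAction\s+([^>]*)>([\s\S]*?)<\/boltAction>/g;
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(artifact)) !== null) {
    const attributes = match[1];
    const content = match[2].trim();
    const typeMatch = /type="([^"]*)"/.exec(attributes);
    const pathMatch = /filePath="([^"]*)"/.exec(attributes);
    const type = typeMatch ? typeMatch[1] : undefined;

    if (type === 'file' && pathMatch) {
      actions.push({ type: 'file', filePath: pathMatch[1], content });
    } else if (type === 'shell') {
      actions.push({ type: 'shell', content });
    }
    // Unknown action types are skipped rather than guessed at.
  }
  return actions;
}
```

Each returned action maps directly onto an operation: a `file` action becomes a write to `filePath`, and a `shell` action becomes a command to run, in document order.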
Gotcha
The "full-stack" marketing hits a hard wall at language support. bolt.diy only runs Node.js—period. If your "full-stack" definition includes a Python FastAPI backend, a Go microservice, or Ruby on Rails, you're out of luck. This isn't a minor limitation; it fundamentally constrains the architecture patterns you can explore. WebContainer's browser-based execution means no native binaries, no system-level dependencies beyond what Node can provide, and no escape hatch to run Docker containers.
The LLM quality variance is jarring. With flagship models like GPT-4 or Claude 3.5 Sonnet, bolt.diy impressively scaffolds working applications. Switch to a smaller model like Mistral 7B via Ollama, and you'll watch it generate malformed package.json files, import non-existent dependencies, or create circular reference bugs. The project's documentation admits prompt optimization is "in progress," which really means: budget models produce budget results. You're paying the AI tax either through expensive API calls or wasted time debugging hallucinated code. There's also no rollback mechanism beyond manually reverting files—if the LLM's fifth iteration breaks everything, you're clicking through diffs to find where it went wrong.
Verdict
Use if: You need to prototype Node.js applications fast, want to experiment with different LLM providers without rebuilding infrastructure, or you're learning full-stack development and value seeing immediate results over understanding every abstraction layer. It's particularly powerful for non-developers who need functional MVPs: founders validating SaaS ideas, designers building interactive prototypes, or educators demonstrating concepts without environment setup friction. The Ollama integration makes it viable for privacy-conscious scenarios where cloud APIs are forbidden, provided a local model is good enough for the task.
Skip if: You're building production applications requiring multi-language backends, need robust version control integration beyond basic Git exports, or work in regulated industries where sending code to third-party LLMs violates compliance and a local model isn't an option. AI-assisted editors like Cursor, or GitHub Copilot inside a traditional IDE, offer better debugging, testing frameworks, and CI/CD integration once you move past the prototyping phase. Also skip if your team's workflow depends on Docker, complex build pipelines, or deploying to non-JavaScript runtimes, none of which bolt.diy's browser sandbox can accommodate.