Smol Developer: Building Apps Through Iterative Prompt Engineering
Hook
What if scaffolding a new app didn’t mean choosing between create-react-app, create-next-app, or a dozen other rigid starter templates—but instead meant describing what you want in plain English and iterating until it works?
Context
The traditional approach to starting new projects involves either copying boilerplate from rigid starter templates (create-react-app, create-next-app) or manually scaffolding everything from scratch. Both approaches have problems: templates are opinionated and constrain your architecture, while manual scaffolding is time-consuming and error-prone.
Smol developer takes a radically different approach. Instead of maintaining specific one-shot starters, it uses GPT-4 to generate entire codebases from natural language prompts—a concept described in the README as ‘human-centric & coherent whole program synthesis.’ The key insight isn’t just that AI can write code (we’ve known that since Copilot), but that with the right workflow, you can treat an LLM as a junior developer you iterate with, rather than a magic box that either works perfectly or fails completely. As the README puts it: ‘Build the thing that builds the thing’—smol developer is a meta-tool for creating your own custom scaffolding agents.
Technical Insight
Smol developer’s architecture revolves around a three-stage pipeline that breaks whole-program synthesis into manageable chunks. First, it generates a shared dependencies plan—a high-level specification of libraries, frameworks, and architectural decisions. Second, it uses OpenAI’s Function Calling API to specify file paths needed for the project, guaranteeing structured JSON output. Finally, it generates code for each file, using the shared dependencies as context to maintain coherence across the codebase.
The library mode shows this pipeline clearly. Here’s how you’d use it in your own Python application:
from smol_dev.prompts import plan, specify_file_paths, generate_code_sync

prompt = "a HTML/JS/CSS Tic Tac Toe Game"

# Stage 1: Generate architectural plan
shared_deps = plan(prompt)
# Returns a string describing dependencies, frameworks, architecture

# Stage 2: Determine file structure
file_paths = specify_file_paths(prompt, shared_deps)
# Returns ['index.html', 'game.js', 'styles.css', ...]

# Stage 3: Generate code for each file
for file_path in file_paths:
    code = generate_code_sync(prompt, shared_deps, file_path)
    # Write each file to disk, display it in a UI, etc.
This staged approach is crucial. Rather than asking GPT-4 to generate everything at once (which could lead to inconsistencies), smol developer first establishes a ‘contract’ through shared_deps, then uses that contract as context for each file generation. The Function Calling API ensures file_paths is always valid JSON, eliminating a common failure mode in LLM-based tools.
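To see why Function Calling guarantees parseable output, here is a minimal sketch of the kind of function definition involved. The schema below is illustrative (the field names are assumptions, not smol developer's exact internal schema): the model is forced to "call" the function, so its reply arrives as structured arguments rather than free text.

```python
import json

# Illustrative Function Calling definition: constrain the model to return
# file paths as a JSON array of strings. Names here are assumptions.
file_paths_function = {
    "name": "specify_file_paths",
    "description": "List every file needed to build the program.",
    "parameters": {
        "type": "object",
        "properties": {
            "file_paths": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Relative paths of files to generate",
            }
        },
        "required": ["file_paths"],
    },
}

# Because the model must call this function, its reply arrives as arguments
# that always parse as JSON -- no regex scraping of free-form text.
mock_arguments = '{"file_paths": ["index.html", "game.js", "styles.css"]}'
file_paths = json.loads(mock_arguments)["file_paths"]
print(file_paths)  # ['index.html', 'game.js', 'styles.css']
```

The contract is enforced at the API level, which is what eliminates the "model wrapped its answer in prose" failure mode.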
The CLI mode demonstrates the intended human-in-the-loop workflow:
python main.py "a HTML/JS/CSS Tic Tac Toe Game"
# Generates initial codebase
# After running and finding issues:
python main.py --prompt updated_prompt.md
# where updated_prompt.md contains:
# "a HTML/JS/CSS Tic Tac Toe Game
# ERROR: game doesn't detect diagonal wins
# FIX: add diagonal win condition check"
This ‘engineering with prompts, rather than prompt engineering’ philosophy means you don’t need perfect prompts upfront. You run the code, paste errors back into the prompt, and regenerate—exactly like you’d file issues for a junior developer. The README explicitly positions this as an iterative loop: ‘Loop until happiness is attained. Notice that AI is only used as long as it is adding value.’
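If you drive the library mode rather than the CLI, the same loop is easy to script. The helper below is a hypothetical convenience (not part of smol developer's API) that mirrors the ERROR/FIX convention from the CLI example:

```python
def amend_prompt(base_prompt: str, error: str, fix: str) -> str:
    """Append an observed error and requested fix to the prompt,
    mirroring the ERROR/FIX convention used with the CLI.
    (Helper name is illustrative, not part of smol developer's API.)"""
    return f"{base_prompt}\nERROR: {error}\nFIX: {fix}"

prompt = "a HTML/JS/CSS Tic Tac Toe Game"
# After running the generated game and finding a bug:
prompt = amend_prompt(
    prompt,
    "game doesn't detect diagonal wins",
    "add diagonal win condition check",
)
# Feed the amended prompt back into plan()/specify_file_paths()/
# generate_code_sync() to regenerate.
print(prompt)
```

Each iteration accumulates context, so the prompt gradually becomes a changelog of everything the "junior developer" got wrong.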
Smol developer also supports three deployment modes. The CLI mode clones the repo and runs locally. The library mode (pip install smol_dev) embeds the agent in your own applications. The API mode implements the Agent Protocol standard, allowing integration with tools that support that specification. This flexibility is rare among AI coding tools.
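As a rough sketch of what the API mode implies: the Agent Protocol spec creates a task via POST /ap/v1/agent/tasks with an input payload, then executes steps against that task. The host and port below are assumptions about where you happen to serve it:

```python
import json

# Assumed local address for smol developer running in API mode.
base_url = "http://localhost:8000"

# Per the Agent Protocol spec, a task is created by POSTing an input payload.
task_payload = {"input": "a HTML/JS/CSS Tic Tac Toe Game"}

# With any HTTP client you would send:
#   POST {base_url}/ap/v1/agent/tasks        body: task_payload
# then drive generation step by step via:
#   POST {base_url}/ap/v1/agent/tasks/{task_id}/steps
request_line = f"POST {base_url}/ap/v1/agent/tasks"
body = json.dumps(task_payload)
print(request_line)
print(body)
```

Because the protocol is tool-agnostic, any orchestrator that speaks it can swap smol developer in as its code-generation backend.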
Gotcha
Smol developer has significant limitations worth understanding. First, cost and model dependency: the tool defaults to gpt-4-0613, and while you can use gpt-3.5-turbo-0613 via the --model flag, the README positions GPT-4 as the primary model. Generating even a simple tic-tac-toe game involves multiple API calls (plan, specify_file_paths, then generate_code for each file), which can add up at GPT-4 pricing.
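The call volume is easy to reason about from the pipeline structure: one call for the plan, one for the file list, one per file. A back-of-envelope sketch (the file list below is illustrative):

```python
def total_api_calls(num_files: int) -> int:
    """One call for the plan, one for specify_file_paths,
    one generate_code call per file."""
    return 2 + num_files

# A small tic-tac-toe project might produce three files:
files = ["index.html", "game.js", "styles.css"]
print(total_api_calls(len(files)))  # 5
```

Each of those calls also carries the full shared_deps context, so token usage grows with both file count and plan size, not just the number of requests.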
Second, the tool appears designed primarily for greenfield projects. The README describes smol developer as creating ‘create-anything-app’ scaffolds and explicitly states that ‘AI is only used as long as it is adding value - once it gets in your way, just take over the codebase from your smol junior developer.’ The debugger.py mentioned in the README ‘reads the whole codebase to make specific code change suggestions,’ but is framed as a helper for after the developer ‘manually runs the code and identifies errors.’ The workflow suggests you can iterate by adding to your prompt, but the architecture appears optimized for initial scaffolding rather than incremental modifications.
Third, error recovery requires human involvement. While you can paste errors into prompts and regenerate, the human-in-the-loop philosophy means you’re the one running the code, identifying issues, and formulating fixes. The README emphasizes this collaborative approach: humans write prompts, AI generates code, humans run and read it, then add to the prompt based on what they discover.
Verdict
Use smol developer if you’re prototyping new applications from scratch and want to skip the boilerplate phase entirely. It excels at generating MVPs for well-defined domains: browser extensions, simple web apps, CLI tools, or any project where you can articulate requirements clearly. It’s particularly valuable if you’re comfortable with prompt engineering and willing to iterate—think of it as a force multiplier for the ‘write a rough draft, then refine’ workflow. The library mode is especially powerful for teams building internal tools or platforms that need AI-powered code generation as a feature.
Skip it if you’re working with existing codebases that need incremental changes (the tool appears optimized for greenfield scaffolding), require production-ready code without human review (smol developer generates drafts that need human oversight), or are cost-sensitive (multiple GPT-4 API calls per project generation). Also be cautious if your project requires domain knowledge outside GPT-4’s training data. Finally, if you want fully autonomous agents that self-correct without human input, this isn’t that tool—smol developer explicitly embraces human-in-the-loop collaboration rather than full autonomy.