Building AI Agents Without Frameworks: A First-Principles Approach to Agentic Patterns

Hook

Many developers building AI agents chain LangChain components together without a clear picture of what happens underneath. The neural-maze/agentic-patterns-course repository strips away the framework layer to show the raw mechanics.

Context

The explosion of AI agent frameworks in 2023-2024 created a peculiar problem: developers could build functioning agents without understanding the underlying patterns. LangChain, LangGraph, AutoGen, and CrewAI made it easy to spin up impressive demos, but the abstraction layers obscured fundamental concepts. When agents failed, developers had no mental model for debugging. When requirements diverged from framework opinions, customization became archaeology.

Andrew Ng's influential DeepLearning.AI talk on agentic patterns outlined four foundational approaches: Reflection (iterative self-improvement), Tool Use (function calling for external data), Planning (multi-step reasoning), and Multi-Agent (collaborative task execution). The neural-maze/agentic-patterns-course repository implements these patterns as educational demonstrations using direct Groq API calls: no frameworks, no abstractions, just the essential orchestration logic that makes agents work. It's the equivalent of learning web development by building an HTTP server from scratch before using Express.

Technical Insight

The repository's architecture decision to avoid frameworks exposes what's actually happening when an agent "thinks." Take the Reflection pattern, which implements iterative self-critique to improve LLM outputs. The entire pattern boils down to a loop with two LLM calls: one for generation, one for critique.

Here's the core of the reflection loop, condensed into a single function:

from groq import Groq

# Assumes the groq package is installed and GROQ_API_KEY is set in the environment.
client = Groq()
MODEL = "llama-3.1-70b-versatile"  # model named in this article; check Groq's current model list


def generate_with_reflection(prompt, max_iterations=3):
    content = prompt
    generated_content = ""

    for iteration in range(max_iterations):
        # Generate a response to the current prompt
        generation_response = client.chat.completions.create(
            messages=[{"role": "user", "content": content}],
            model=MODEL,
        )
        generated_content = generation_response.choices[0].message.content

        # The last draft is returned as-is, so skip a critique that would go unused
        if iteration == max_iterations - 1:
            break

        # Ask the model to critique its own draft
        reflection_prompt = f"""Review this response and provide specific critique:
        {generated_content}

        Identify weaknesses, missing elements, or improvements needed."""

        reflection_response = client.chat.completions.create(
            messages=[{"role": "user", "content": reflection_prompt}],
            model=MODEL,
        )
        critique = reflection_response.choices[0].message.content

        # Fold the draft and its critique into the next generation prompt
        content = f"""Original request: {prompt}
        Previous attempt: {generated_content}
        Critique: {critique}

        Generate an improved response addressing the critique."""

    return generated_content

That's it. No special reflection framework, no complex state management—just prompt engineering and orchestration. The pattern works because each iteration feeds the previous output and its critique back into context, allowing the model to iteratively refine its response. Framework implementations hide this simplicity behind classes, callbacks, and configuration objects.

The Tool Use pattern demonstrates function calling with similar transparency. Instead of framework-specific tool decorators, it uses Groq's native function calling API with explicit JSON schema definitions:

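import json

from groq import Groq

client = Groq()  # assumes GROQ_API_KEY is set in the environment


def get_weather(location):
    # Placeholder so the example runs; swap in a real weather lookup.
    return {"location": location, "conditions": "unknown"}


tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
response = client.chat.completions.create(
    model="llama-3.1-70b-versatile",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

message = response.choices[0].message
if message.tool_calls:
    # Only the first tool call is handled here, to keep the example short
    tool_call = message.tool_calls[0]
    function_args = json.loads(tool_call.function.arguments)
    result = get_weather(**function_args)  # Execute actual function

    # Feed result back to LLM for final response
    messages.append(message)
    messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": json.dumps(result)})
    final = client.chat.completions.create(model="llama-3.1-70b-versatile", messages=messages)
    print(final.choices[0].message.content)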
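
The snippet above handles only the first tool call. A minimal sketch of dispatching every requested call through a name-to-function registry might look like the following. `TOOL_REGISTRY`, `dispatch_tool_calls`, and the stubbed `get_weather` are hypothetical names used for illustration, not code from the repository, and the tool-call objects are assumed to expose `id`, `function.name`, and `function.arguments` as in the response above.

```python
import json
from types import SimpleNamespace


def get_weather(location):
    # Hypothetical stub; a real tool would call a weather service.
    return {"location": location, "forecast": "sunny"}


# Hypothetical registry mapping tool names (as declared in the schema) to callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def dispatch_tool_calls(tool_calls, registry):
    """Run each requested tool and return 'tool' role messages for the follow-up call."""
    messages = []
    for call in tool_calls:
        func = registry.get(call.function.name)
        if func is None:
            content = json.dumps({"error": f"unknown tool: {call.function.name}"})
        else:
            try:
                args = json.loads(call.function.arguments)
                content = json.dumps(func(**args))
            except (json.JSONDecodeError, TypeError) as exc:
                # Malformed or mismatched arguments are reported back to the model.
                content = json.dumps({"error": str(exc)})
        messages.append({"role": "tool", "tool_call_id": call.id, "content": content})
    return messages


# Simulated tool call, shaped like the entries in response.choices[0].message.tool_calls.
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"location": "Paris"}'),
)
tool_messages = dispatch_tool_calls([fake_call], TOOL_REGISTRY)
```

Returning errors as tool messages, rather than raising, lets the model see what went wrong and try again. Whether that is the right choice depends on the application.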

The Planning pattern implements ReAct (Reasoning and Acting), the foundation of agents like AutoGPT. The implementation reveals that ReAct is just a loop that alternates between "thought" prompts (reasoning about what to do next) and "action" prompts (executing tools based on that reasoning). The repository structures this as a simple state machine that tracks the thought-action-observation cycle until the agent decides it's done.
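
The thought-action-observation cycle can be sketched in a few lines. This is an illustrative reduction, not the repository's implementation: the `Action: tool[input]` and `Final Answer:` text conventions, the `react_loop` function, and the scripted stand-in model are all assumptions chosen so the example runs without an API key.

```python
import re


def react_loop(llm, tools, question, max_steps=5):
    """Alternate model turns and tool calls until the model emits 'Final Answer:'."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # The model writes a Thought followed by either an Action or a Final Answer.
        reply = llm(transcript)
        transcript += reply + "\n"

        final = re.search(r"Final Answer:\s*(.*)", reply, re.DOTALL)
        if final:
            return final.group(1).strip()

        action = re.search(r"Action:\s*(\w+)\[(.*?)\]", reply)
        if action is None:
            transcript += "Observation: no valid action found; use Action: tool[input]\n"
            continue
        name, arg = action.groups()
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool '{name}'"
        transcript += f"Observation: {observation}\n"
    return None  # step budget exhausted without a final answer


def scripted_llm(transcript):
    # Stand-in for a real model call: acts once, then answers.
    if "Observation:" not in transcript:
        return "Thought: I need the product.\nAction: multiply[6, 7]"
    return "Thought: I have the result.\nFinal Answer: 42"


def multiply(arg):
    a, b = (int(x) for x in arg.split(","))
    return a * b


answer = react_loop(scripted_llm, {"multiply": multiply}, "What is 6 times 7?")
print(answer)  # prints 42
```

The `max_steps` cap matters in practice: without it, a model that never emits a final answer would loop indefinitely.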

What makes this approach particularly valuable is seeing how Groq's high-speed inference (hundreds of output tokens per second) enables more complex agentic workflows. With OpenAI API calls that generate at roughly 30-50 tokens per second, multi-turn agent interactions feel sluggish. At Groq's speeds, a reflection loop with 3-5 iterations can finish in seconds rather than a minute or more, which makes these patterns practical for real-time applications.
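
A back-of-the-envelope estimate makes the speed argument concrete. The figures below are illustrative assumptions (about 500 output tokens per call, 40 versus 300 tokens per second), not measurements, and the hypothetical `loop_latency_seconds` helper counts generation time only.

```python
def loop_latency_seconds(iterations, tokens_per_call, tokens_per_second, calls_per_iteration=2):
    """Rough generation time for a reflection loop; ignores network and prompt-processing time."""
    total_tokens = iterations * calls_per_iteration * tokens_per_call
    return total_tokens / tokens_per_second


# Illustrative assumptions: 3 iterations, 2 calls each, ~500 output tokens per call.
slow = loop_latency_seconds(3, 500, 40)    # 3000 tokens at 40 tokens/second
fast = loop_latency_seconds(3, 500, 300)   # 3000 tokens at 300 tokens/second
print(f"{slow:.0f}s vs {fast:.0f}s")  # prints 75s vs 10s
```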

The Multi-Agent pattern coordinates specialized agents using explicit message passing rather than framework orchestration. Each agent is just a function with a specific system prompt and tool access. The coordinator agent receives the user request, delegates to specialist agents (like a researcher and writer), and synthesizes their outputs. No complex actor systems or pub/sub infrastructure—just structured prompts and sequential API calls that clearly show how agent collaboration actually works under the hood.
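
As a rough sketch of that idea, each agent below is a closure over a system prompt, and the coordinator is three sequential calls. The names `make_agent` and `coordinate`, the prompts, and the `echo_llm` stand-in are hypothetical and chosen for illustration. In real use, `llm` would wrap a chat completion call.

```python
def make_agent(llm, system_prompt):
    """An agent here is just a function that closes over a system prompt."""
    def agent(task):
        return llm([
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": task},
        ])
    return agent


def coordinate(llm, request):
    researcher = make_agent(llm, "You are a researcher. List key facts.")
    writer = make_agent(llm, "You are a writer. Draft prose from the notes given.")
    editor = make_agent(llm, "You are a coordinator. Merge the inputs into a final answer.")

    # Explicit message passing: each step receives the previous outputs as plain text.
    notes = researcher(request)
    draft = writer(f"Request: {request}\nNotes: {notes}")
    return editor(f"Request: {request}\nNotes: {notes}\nDraft: {draft}")


def echo_llm(messages):
    # Stand-in for a chat completion call; reports which role it was asked to play.
    role = messages[0]["content"].split(".")[0]
    return f"[{role}] {len(messages[1]['content'])} chars in"


result = coordinate(echo_llm, "Explain ReAct")
```

Because the hand-offs are ordinary strings, logging or inspecting what each specialist received takes a single print statement.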

Gotcha

The deliberate simplicity that makes this repository educational also creates production gaps. Error handling is minimal—if Groq's API returns a rate limit error mid-reflection loop, the code doesn't gracefully retry or cache partial results. The implementations assume happy paths, which is fine for learning but dangerous for production systems where LLM APIs fail, return malformed JSON, or hit context limits.
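
One way to close the retry gap is a small wrapper with exponential backoff. The sketch below is an assumption about how you might add it, not something the repository provides: `TransientAPIError` is a hypothetical stand-in for whatever rate-limit or timeout exception your provider's SDK raises, and `call_with_retry` is an illustrative name.

```python
import random
import time


class TransientAPIError(Exception):
    """Hypothetical stand-in for a provider's rate-limit or timeout exception."""


def call_with_retry(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on TransientAPIError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientAPIError:
            if attempt == max_attempts:
                raise  # out of attempts; surface the error to the caller
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))


# Demonstration with a function that fails twice before succeeding.
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientAPIError("rate limited")
    return "ok"


waits = []
outcome = call_with_retry(flaky, sleep=waits.append)  # record delays instead of sleeping
```

Wrapping each LLM call inside the reflection loop this way means a single rate-limit response costs a short wait instead of discarding the drafts produced so far.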

The tight coupling to Groq is both a feature and a constraint. While Groq's speed makes agentic patterns more viable, the code hardcodes the provider. Switching to OpenAI, Anthropic, or local models requires manual refactoring of API calls, parameter mappings, and function calling schemas. The repository includes no abstraction layer for provider swapping, which is intentional for educational clarity but limiting for real-world use. You also inherit Groq's limitations: their function calling support is newer and less mature than OpenAI's, and model selection is narrower. If you need GPT-4 or Claude specifically, you're rewriting significant chunks.
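
If you do need to swap providers, a thin seam is often enough. The sketch below assumes the pattern code only ever needs "messages in, text out". `ChatFn`, `groq_chat`, `summarize`, and `fake_chat` are hypothetical names, and the adapter relies only on the `client.chat.completions.create` call shape used in the snippets earlier in this article.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
# A provider is any callable that maps a message list to the reply text.
ChatFn = Callable[[List[Message]], str]


def groq_chat(client, model) -> ChatFn:
    """Adapter for a client exposing chat.completions.create, as in the earlier snippets."""
    def chat(messages):
        response = client.chat.completions.create(messages=messages, model=model)
        return response.choices[0].message.content
    return chat


def summarize(chat: ChatFn, text: str) -> str:
    # Pattern code depends only on ChatFn, never on a specific SDK.
    return chat([{"role": "user", "content": f"Summarize in one line: {text}"}])


def fake_chat(messages):
    # Stand-in provider for offline use and tests.
    return messages[-1]["content"].upper()


summary = summarize(fake_chat, "agents are loops")
```

This covers plain chat calls only. Function calling schemas differ between providers, so tool use would still need a per-provider adapter.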

The four patterns covered are foundational but incomplete for complex applications. There's no implementation of memory systems (short-term conversation history vs. long-term knowledge storage), no hierarchical planning for multi-step projects, no human-in-the-loop approval workflows, and no agent monitoring or observability hooks. These aren't oversights—they're outside the educational scope—but they're gaps you'll hit immediately when moving beyond toy examples.

Verdict

Use if: You're learning how AI agents work and want to understand the actual mechanics beneath framework abstractions, you're prototyping a simple agentic workflow and want minimal dependencies, you need reference implementations to understand what LangChain/LangGraph are actually doing, or you're building educational content about AI agents. The code quality and structure make it ideal for reading and adapting.

Skip if: You're building production systems that need robust error handling and observability, you require provider flexibility to switch between OpenAI/Anthropic/local models, you need advanced agentic features like persistent memory or complex orchestration, or you want batteries-included frameworks that handle edge cases.

This is a teaching tool and reference implementation. Treat it as such, learn from it, then graduate to production frameworks when requirements demand them.