RobAI: Building Async AI Agents with Function Calling in 50 Lines of Code

Hook

Most AI agent frameworks force you to learn their abstractions before you write a single line of bot logic. RobAI inverts this: decorators turn Python functions into AI tools, and three lifecycle hooks define your entire agent.

Context

The explosion of LLM capabilities created a new problem: bridging conversational AI with actual code execution. When OpenAI introduced function calling in mid-2023, GPT models gained the ability to respond mid-conversation with a structured request to call a named function with JSON arguments, which the application then executes. That turned chatbots into agents that could query databases, call APIs, or manipulate data. But the existing ecosystem split into two camps. Heavyweight frameworks like LangChain abstract everything into chains and agents, at the cost of a steep learning curve and opaque internals. Raw OpenAI API usage forces developers to manually juggle conversation history, token counting, streaming responses, and function schema generation.

RobAI emerged as a third path: a minimal framework that handles the tedious parts (memory management, streaming, schema generation) while staying close to the metal. Created by Phil Mader, it's built around a simple premise—AI agents need three things: input preparation, response processing, and a stopping condition. By formalizing these as async hooks and adding Pydantic-powered decorators for function exposure, RobAI gives you structure without ceremony. It's not trying to be LangChain. It's trying to be the Express.js of AI agents: opinionated enough to prevent common mistakes, minimal enough to understand in an afternoon.

Technical Insight

RobAI's architecture centers on a prepare-process-stop lifecycle implemented as async methods. Every agent inherits from BaseRobot or ChatRobot and overrides these hooks. Here's a minimal example that creates a calculator agent:

from robai import ChatRobot, robot_function

class CalculatorBot(ChatRobot):
    # Each @robot_function method is exposed to the model as a callable tool.
    @robot_function
    def add(self, a: int, b: int) -> int:
        """Add two numbers together."""
        return a + b

    @robot_function
    def multiply(self, a: int, b: int) -> int:
        """Multiply two numbers."""
        return a * b

    async def prepare(self):
        # input() blocks the event loop, which is acceptable for a single-user CLI demo.
        user_input = input("You: ")
        self.prompt_manager.add_message("user", user_input)

    async def process(self):
        async for chunk in self.stream_chat():
            if chunk.get("function_call"):
                # Run the requested tool and feed its result back into the conversation.
                result = await self.execute_function(chunk["function_call"])
                self.prompt_manager.add_message("function", str(result))
            elif chunk.get("content"):
                # Flush so partial lines appear as they stream in.
                print(chunk["content"], end="", flush=True)
        print()

    def stop_condition(self) -> bool:
        return False  # Run forever

bot = CalculatorBot(model="gpt-4")
bot.run()
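
To make the prepare-process-stop lifecycle concrete, here is a minimal sketch of the kind of loop a run() method could drive. It is not RobAI's actual implementation: LifecycleRobot, run_async, and CountdownRobot are hypothetical names invented for this illustration, and no model is called.

```python
import asyncio
from abc import ABC, abstractmethod


class LifecycleRobot(ABC):
    """Hypothetical base class; RobAI's real BaseRobot may differ."""

    @abstractmethod
    async def prepare(self) -> None: ...

    @abstractmethod
    async def process(self) -> None: ...

    @abstractmethod
    def stop_condition(self) -> bool: ...

    async def run_async(self) -> None:
        # One turn: prepare input, process the response, then check whether to stop.
        while True:
            await self.prepare()
            await self.process()
            if self.stop_condition():
                break

    def run(self) -> None:
        # Synchronous entry point that owns the event loop.
        asyncio.run(self.run_async())


class CountdownRobot(LifecycleRobot):
    """Toy robot that runs three turns without calling any model."""

    def __init__(self) -> None:
        self.turns = 0
        self.log: list[str] = []

    async def prepare(self) -> None:
        self.turns += 1

    async def process(self) -> None:
        self.log.append(f"turn {self.turns}")

    def stop_condition(self) -> bool:
        return self.turns >= 3
```

Calling CountdownRobot().run() executes three turns and then exits, which is the whole contract: everything else in an agent hangs off those three hooks.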

The @robot_function decorator does the heavy lifting. Under the hood, it inspects the function signature, generates OpenAI-compatible JSON schemas from type hints, and registers the function for later execution. When the LLM decides to call add(5, 3), RobAI intercepts the function call object from the streaming response, matches it to the registered function, validates arguments with Pydantic, executes the Python code, and injects the result back into the conversation as a function message—all without you writing schema definitions or validation logic.
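
The schema-generation step can be sketched with the standard library alone. This is an illustration of the technique, not RobAI's code: it swaps Pydantic for a small hand-written type mapping, and the names tool, REGISTRY, and the schema attribute are all hypothetical.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}

REGISTRY: dict[str, Callable[..., Any]] = {}  # hypothetical tool registry


def tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Register func and attach an OpenAI-style function schema to it."""
    hints = get_type_hints(func)
    properties: dict[str, dict[str, str]] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        if name == "self":
            continue
        # Unknown or missing annotations fall back to "string".
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    func.schema = {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }
    REGISTRY[func.__name__] = func
    return func


@tool
def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b
```

After decoration, add.schema holds the dictionary you would otherwise write by hand, and REGISTRY["add"] is what a dispatcher would look up when the model asks for that function. A Pydantic-based version would also validate and coerce the arguments, which this sketch does not attempt.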

The PromptManager handles conversation memory for you. It tracks total tokens, enforces configurable limits, and automatically trims old messages as the conversation approaches the model's context window. Set max_tokens=4000 and max_messages=20, and it keeps the conversation within both bounds without manual intervention. This is critical for long-running agents, which would otherwise exceed the context limit and have their requests rejected.
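
A rough sketch of that trimming behavior, under stated assumptions: TrimmingHistory is a hypothetical stand-in rather than RobAI's PromptManager, token counts are estimated at about four characters per token instead of using a real tokenizer, and the policy of dropping whole messages oldest-first while preserving a leading system message is a choice made for this example.

```python
from dataclasses import dataclass, field


def estimate_tokens(text: str) -> int:
    """Crude estimate (about 4 characters per token); a real tokenizer is more accurate."""
    return max(1, len(text) // 4)


@dataclass
class TrimmingHistory:
    """Hypothetical stand-in for a prompt manager with two budgets."""

    max_tokens: int = 4000
    max_messages: int = 20
    messages: list[dict[str, str]] = field(default_factory=list)

    def total_tokens(self) -> int:
        return sum(estimate_tokens(m["content"]) for m in self.messages)

    def add_message(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        self._trim()

    def _trim(self) -> None:
        # Drop the oldest non-system message until both budgets are met.
        while len(self.messages) > 1 and (
            len(self.messages) > self.max_messages
            or self.total_tokens() > self.max_tokens
        ):
            drop_at = 1 if self.messages[0]["role"] == "system" else 0
            del self.messages[drop_at]
```

Keeping the system message pinned matters in practice: if it were trimmed along with old user turns, the agent would silently lose its instructions partway through a long session.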

Streaming is first-class. The stream_chat() method yields chunks as they arrive from OpenAI's API, letting you build responsive UIs. The subtle part is function calls: OpenAI streams a call's name and JSON arguments as fragments spread across several chunks, and the arguments are not parseable until the last fragment arrives. RobAI buffers and parses these fragments, yielding a structured function call dictionary once the call is complete while still streaming text content as it arrives. This means you can show real-time typing effects while still executing tools on complete arguments.
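
The buffering idea looks roughly like this. The sketch runs against simulated deltas rather than a live API; the delta shapes mimic OpenAI's legacy function-calling stream, and fake_deltas, stream_events, and collect are hypothetical names, not RobAI functions.

```python
import asyncio
import json
from typing import Any, AsyncIterator


async def fake_deltas() -> AsyncIterator[dict[str, Any]]:
    """Simulated stream deltas: text first, then a function call in fragments."""
    for delta in [
        {"content": "Let me "},
        {"content": "compute that."},
        {"function_call": {"name": "add", "arguments": ""}},
        {"function_call": {"arguments": '{"a": 5,'}},
        {"function_call": {"arguments": ' "b": 3}'}},
    ]:
        yield delta


async def stream_events(
    deltas: AsyncIterator[dict[str, Any]],
) -> AsyncIterator[dict[str, Any]]:
    """Yield text as it arrives, but hold function-call fragments until complete."""
    name = None
    arg_parts: list[str] = []
    async for delta in deltas:
        if delta.get("content"):
            yield {"content": delta["content"]}
        call = delta.get("function_call")
        if call:
            name = call.get("name", name)
            arg_parts.append(call.get("arguments", ""))
    if name is not None:
        # The arguments are only valid JSON once every fragment has been joined.
        arguments = json.loads("".join(arg_parts))
        yield {"function_call": {"name": name, "arguments": arguments}}


async def collect() -> list[dict[str, Any]]:
    return [event async for event in stream_events(fake_deltas())]
```

Running asyncio.run(collect()) produces two content events followed by a single function_call event whose arguments are already a parsed dictionary, so the consuming loop never sees a half-built JSON string.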

Message handlers abstract I/O. The default ConsoleMessageHandler uses input() and print(), but swapping to WebSocketMessageHandler makes the same bot work over websockets with zero logic changes:

from robai.handlers import WebSocketMessageHandler

# websocket_instance is assumed to be an already-open connection from your web framework.
bot = CalculatorBot(
    model="gpt-4",
    message_handler=WebSocketMessageHandler(websocket_instance)
)

This separation means your bot logic is deployment-agnostic. The same prepare() and process() methods work in CLI scripts, web servers, Discord bots, or Slack integrations—you just swap handlers.
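
The handler pattern itself is easy to sketch in isolation. The interface below is an assumption made for illustration: MessageHandler, receive, send, and both handler classes are hypothetical, and RobAI's real handler methods may be named and shaped differently.

```python
import asyncio
from typing import Protocol


class MessageHandler(Protocol):
    """Hypothetical transport interface that bot logic depends on."""

    async def receive(self) -> str: ...

    async def send(self, text: str) -> None: ...


class ConsoleHandler:
    """Terminal transport."""

    async def receive(self) -> str:
        # Run the blocking input() in a worker thread so the event loop stays free.
        return await asyncio.to_thread(input, "You: ")

    async def send(self, text: str) -> None:
        print(text)


class ScriptedMessageHandler:
    """In-memory transport that replays canned inputs, useful in tests."""

    def __init__(self, inputs: list[str]) -> None:
        self._inputs = list(inputs)
        self.sent: list[str] = []

    async def receive(self) -> str:
        return self._inputs.pop(0)

    async def send(self, text: str) -> None:
        self.sent.append(text)


async def echo_turn(handler: MessageHandler) -> None:
    """Bot logic that knows only the interface, not the transport."""
    text = await handler.receive()
    await handler.send(text.upper())
```

Because echo_turn only touches receive() and send(), the same function runs against a terminal, a test script, or a socket wrapper. A side benefit is testability: the scripted handler lets you exercise bot logic without any network or console.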

The async architecture is non-negotiable. Every lifecycle method is async, making it natural to call external APIs, query databases, or perform I/O-bound work inside your robot functions without blocking. This is where RobAI shows its modernity compared to synchronous agent frameworks that force thread pools or multiprocessing for concurrent operations.
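
As a small illustration of why async helps here, the sketch below runs several I/O-bound calls concurrently on one event loop. fetch_price is a hypothetical tool, and the sleep stands in for a network round trip.

```python
import asyncio


async def fetch_price(symbol: str) -> float:
    """Hypothetical I/O-bound tool; the sleep stands in for a network call."""
    await asyncio.sleep(0.1)
    return float(len(symbol))


async def fetch_all(symbols: list[str]) -> list[float]:
    # gather() runs the coroutines concurrently on one event loop, with no threads.
    return await asyncio.gather(*(fetch_price(s) for s in symbols))


if __name__ == "__main__":
    print(asyncio.run(fetch_all(["A", "BB", "CCC"])))
```

The three calls overlap, so the total wait is close to that of a single call rather than the sum of all three. Results come back in the same order as the inputs.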

Gotcha

RobAI is unapologetically tied to OpenAI. The entire ChatRobot class assumes you're using openai.ChatCompletion.create(), the interface from pre-1.0 releases of the openai Python package, with its specific response structures. Want to use Anthropic's Claude? You'll rewrite the streaming logic. Local LLaMA models? You're forking the repo. This isn't an abstraction layer. It's a thin wrapper around OpenAI's API with nice ergonomics. For some projects, that's perfectly fine. For others, it's a dealbreaker.

The project is still alpha-stage. Version 0.1.0 with 15 GitHub stars means you're essentially an early adopter of someone's side project. There's minimal documentation beyond docstrings, no test suite visible in the repo, and no community to troubleshoot edge cases. The framework lacks production essentials: no built-in retry logic for API failures, no cost tracking, no structured logging of function calls for debugging, no rate limiting, no telemetry. You'll need to build these yourself or accept that this is a prototype tool.
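
Retry logic is the easiest of those gaps to fill yourself. Here is a minimal sketch of exponential backoff with jitter around any async call. with_retries is a hypothetical helper, not part of RobAI, and the default exception types are placeholders you would replace with whatever your API client raises on transient failures.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(
    call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Await call(), retrying on the listed exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Double the wait each time and add jitter to avoid synchronized retries.
            delay = base_delay * 2**attempt + random.uniform(0, base_delay)
            await asyncio.sleep(delay)
    raise AssertionError("unreachable")
```

You would wrap the model request in a zero-argument coroutine function, for example a lambda that returns the client call, and pass that in. Exceptions outside retry_on propagate immediately, which keeps programming errors from being retried.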

Error handling is barebones. If a robot function raises an exception, it bubbles up and likely crashes your bot loop. There's no graceful degradation, no automatic error reporting to the LLM ("I tried to execute that but encountered an error"), and no circuit breakers. You'll spend time wrapping your functions in try-except blocks and manually managing failure states. For a hackathon demo, this is fine. For a customer-facing agent, it's a liability.
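
One way to get that graceful degradation is to catch exceptions at the dispatch boundary and hand the model a structured error instead of crashing. The sketch below assumes nothing about RobAI's internals: safe_execute and the ok/result/error payload shape are hypothetical, and divide is just a demo tool.

```python
import asyncio
import inspect
import json
from typing import Any, Callable


async def safe_execute(func: Callable[..., Any], arguments: dict[str, Any]) -> str:
    """Run a tool and always return a JSON string, even when the tool fails."""
    try:
        result = func(**arguments)
        if inspect.isawaitable(result):
            # Support both sync and async tools.
            result = await result
        # default=str keeps non-JSON values (dates, Decimals) from raising here.
        return json.dumps({"ok": True, "result": result}, default=str)
    except Exception as exc:
        # Report the failure as data so the model can explain or try again.
        return json.dumps({"ok": False, "error": f"{type(exc).__name__}: {exc}"})


def divide(a: float, b: float) -> float:
    """Demo tool that fails on division by zero."""
    return a / b
```

The returned string is what you would add to the conversation as the function message. With the error in context, the model can tell the user what went wrong or retry with different arguments, and the bot loop keeps running.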

Verdict

Use if: You're prototyping an OpenAI-powered agent for an internal tool, demo, or MVP where clean code matters more than framework maturity. You like reading framework source code and customizing it for your needs. You want to learn how function-calling agents work under the hood without LangChain's complexity. You're building something where vendor lock-in to OpenAI is acceptable or even preferred.

Skip if: You need production-grade reliability, observability, and community support; LangChain or Semantic Kernel are safer bets despite their complexity. You require multi-provider support or plan to use local models. You're building critical infrastructure where framework abandonment risk is unacceptable. You want batteries-included features like cost tracking, retries, and monitoring without DIY work.
