
Building a Sandboxed GPT Agent with Function Calling: Lessons from Core Agent


Hook

Most GPT agent tutorials skip the hard part: how do you let an AI modify files without turning your filesystem into a ransomware playground?

Context

When OpenAI introduced function calling in 2023, developers gained the ability to connect GPT models to external tools and APIs. But theory diverged sharply from practice—building an agent that safely executes commands, reads files, and maintains conversation context requires solving token budget limits, implementing security boundaries, and orchestrating iterative function chains. Early attempts either locked down permissions so tightly the agent became useless, or opened security holes wide enough to drive a truck through.

Core Agent, a Python CLI tool from the avogabos/agents repository, represents one developer’s attempt to thread this needle. It’s not a production framework or a polished library—it’s a working prototype that demonstrates practical patterns for building conversational agents with controlled system access. The codebase shows how to sandwich filesystem operations between safety checks, manage token budgets with dual GPT instances, and structure function calling loops that let the model chain operations autonomously. For developers building their first agent or trying to understand what actually happens between “user types message” and “GPT executes function,” this repository offers something more valuable than abstraction: implementation details.

Technical Insight

System architecture (auto-generated flowchart, reconstructed as text):

User CLI Input → Message History List → GPT-4o Main Client → Tool Calls?
  No  → Assistant Response
  Yes → Function Executor → Path Validator → Sandboxed File Operations
        → Output Size? → Large: GPT Summarizer Client → Tool Result Message
                       → Small: Tool Result Message

The path validator and sandboxed file operations form the Security Layer; the summarizer client forms the Token Management layer. Tool result messages append back into the message history list.

The architecture revolves around a message history list that serves as the conversation’s single source of truth. Every interaction—system prompts, user messages, assistant responses, and critically, function call results—appends to this list, which gets sent to GPT on each request. Here’s the core loop structure:

while True:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools
    )
    
    message = response.choices[0].message
    messages.append(message)
    
    if message.tool_calls:
        for tool_call in message.tool_calls:
            result = execute_function(tool_call)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result
            })
        continue  # Let GPT process results
    else:
        break  # Return control to user

This pattern allows GPT to chain multiple operations—reading a file, analyzing its content, then writing results—without user intervention between steps. The continue statement is crucial: instead of immediately showing output, the loop feeds function results back to GPT, enabling autonomous multi-step reasoning.

The security model implements directory sandboxing with path resolution validation. Every file operation receives a target directory at startup, and all paths get normalized through os.path.abspath() and os.path.commonpath() checks to prevent directory traversal attacks. The shell command execution function goes further with a blacklist approach:

import subprocess

DANGEROUS_COMMANDS = ['rm', 'rmdir', 'del', 'format',
                      'dd', 'mkfs', 'sudo', 'chmod']

def execute_command(command: str) -> str:
    parts = command.split()
    if not parts:  # guard against empty input, which would crash split()[0]
        return "No command given"
    if parts[0] in DANGEROUS_COMMANDS:
        return f"Command '{parts[0]}' is not allowed"
    
    try:
        result = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            timeout=30
        )
    except subprocess.TimeoutExpired:
        return "Command timed out after 30 seconds"
    return result.stdout + result.stderr

This isn’t bulletproof—sophisticated prompt injection could potentially bypass these checks—but it establishes a permission boundary that blocks accidental catastrophes while keeping the agent functional.
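The path validation described above can be sketched as follows. This is a minimal reconstruction using os.path.abspath and os.path.commonpath as the text describes; the function and variable names are mine, not the repository's, and as noted later it does not resolve symlinks:

```python
import os

def validate_path(user_path: str, sandbox_dir: str) -> str:
    """Resolve user_path relative to sandbox_dir, rejecting escapes."""
    sandbox = os.path.abspath(sandbox_dir)
    target = os.path.abspath(os.path.join(sandbox, user_path))
    # abspath collapses '..' segments, so '../etc/passwd' resolves
    # outside the sandbox and fails the commonpath check below
    if os.path.commonpath([sandbox, target]) != sandbox:
        raise PermissionError(f"Path escapes sandbox: {user_path}")
    return target
```

Every file-operation function routes its path argument through a check like this before touching the disk, which is what turns a directory argument at startup into an enforced boundary.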

The dual-client pattern addresses a less obvious problem: GPT’s context window fills up fast when function outputs are verbose. If you read a 10,000-line log file and append it directly to message history, you’ve burned through your token budget in one operation. Core Agent solves this with a second GPT instance dedicated to summarization:

def summarize_large_output(content: str, 
                           max_length: int = 1000) -> str:
    if len(content) <= max_length:
        return content
    
    summary_response = summary_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Summarize this output concisely:\n\n{content}"
        }]
    )
    return summary_response.choices[0].message.content

This trades compute cost for context preservation. Large outputs get compressed by a cheaper model before entering the main conversation history, letting the primary agent maintain more conversational context. It’s a pragmatic optimization that production agent systems eventually implement once they hit token limit errors in real usage.
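To make the hand-off concrete, here is a sketch of how a tool result might be compressed before entering history. The function names are mine, and the summarizer is stubbed with simple truncation rather than a real API call:

```python
def summarize_stub(content: str) -> str:
    # Stand-in for the GPT summarizer client shown above.
    return content[:100] + " ...[summarized]"

def tool_result_message(tool_call_id: str, raw_output: str,
                        max_chars: int = 1000,
                        summarize=summarize_stub) -> dict:
    """Build a tool-result message, compressing oversized output first."""
    if len(raw_output) > max_chars:
        raw_output = summarize(raw_output)  # cheaper model shrinks it
    return {"role": "tool", "tool_call_id": tool_call_id,
            "content": raw_output}
```

The key design point is that compression happens before the append, so the main client never sees the full 10,000-line output; only the summary occupies context from then on.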

The function definitions themselves follow OpenAI’s structured schema, but the repository demonstrates useful patterns for common operations. The read_file function returns both content and metadata. The analyze_image function shows how to pass base64-encoded images to GPT’s vision capabilities. The list_directory function includes file sizes and modification times, giving the agent contextual information for decision-making. These aren’t groundbreaking implementations, but they’re production-adjacent—handling edge cases like missing files, permission errors, and encoding issues that tutorial code typically skips.
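For reference, a tool definition in OpenAI's function-calling schema looks roughly like this. The description strings and parameter details below are illustrative, not copied from the repository:

```python
# Minimal tool definition following OpenAI's function-calling schema;
# this dict goes in the `tools` list passed to chat.completions.create.
read_file_tool = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a file inside the sandbox and return "
                       "its content with metadata.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {
                    "type": "string",
                    "description": "Path relative to the sandbox directory"
                }
            },
            "required": ["path"]
        }
    }
}
```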

Gotcha

The repository’s most immediate practical limitation is right in the code: the ‘gpt-4o’ and ‘gpt-4o-mini’ model names are hardcoded with no configuration file, environment variable, or fallback, so when OpenAI renames or deprecates a model you’ll be editing source before anything runs. That’s a small thing, but it reveals the repository’s nature as personal prototype code that hasn’t been battle-tested by community users who would have immediately filed issues.

More fundamentally, the error handling is nearly nonexistent. If the OpenAI API returns a rate limit error, network timeout, or malformed response, the agent crashes rather than gracefully recovering. There’s no retry logic, no exponential backoff, no mechanism to save conversation state and resume.

In a real conversational agent, users expect to continue interrupted sessions, but Core Agent treats each run as ephemeral. The session logging to JSON files provides some persistence, but there’s no built-in way to load and continue a previous conversation.

The security model also shows its prototype nature—the command blacklist is easily circumvented by anyone who knows that rm and /bin/rm are functionally identical, and the path validation doesn’t account for symlinks.

These aren’t criticisms of the author’s skills, but acknowledgments that building production-ready agent security requires threat modeling and testing that a solo project hasn’t undergone. The repository serves its purpose as a learning artifact and reference implementation, not as a dependency you’d import into production code.
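The missing retry logic is the easiest of these gaps to close. A minimal sketch of exponential backoff with jitter; the helper name and the generic Exception catch are mine, not the repository's:

```python
import random
import time

def call_with_retry(fn, max_attempts: int = 5, base_delay: float = 1.0):
    """Retry a flaky API call with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:  # in practice, catch specific errors like
                           # openai.RateLimitError or APITimeoutError
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            # delays grow 1x, 2x, 4x... with jitter so concurrent
            # clients don't all retry at the same instant
            time.sleep(base_delay * (2 ** attempt + random.random()))
```

Wrapping each `client.chat.completions.create` call in something like this, plus dumping `messages` to disk before each attempt, would cover the most common failure modes without restructuring the loop.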

Verdict

Use if: You’re building your first GPT agent and want to see working code that demonstrates function calling patterns, conversation state management, and basic security sandboxing without framework abstraction hiding the implementation details. This repository is excellent reference material for understanding how the pieces fit together—how message history flows through the loop, how function results feed back to the model, how to structure tool definitions. It’s also useful if you’re prototyping a custom agent with specific requirements that frameworks don’t address, and you need a starting template to modify rather than building from scratch.

Skip if: You need production-ready agent infrastructure with error recovery, session management, and community validation. The hardcoded model names, minimal error handling, and single-contributor codebase make this unsuitable for anything beyond learning and prototyping. For production work, invest time in LangChain’s agent implementations or OpenAI’s Assistants API—they’ve solved the hard problems around state persistence, error recovery, and security that this prototype deliberately leaves unaddressed.

Also skip if you’re looking for extensive documentation or examples; the repository assumes you’ll read the code directly.
