PentAGI: Multi-Agent AI Architecture for Autonomous Penetration Testing

Hook

While most security teams spend weeks manually re-testing the same attack vectors, PentAGI's AI agents remember every previous penetration test through a semantic knowledge graph, learning which techniques work against specific infrastructure patterns and autonomously adapting their approach.

Context

Penetration testing has remained stubbornly manual despite decades of security automation attempts. Tools like Metasploit automate exploit delivery, and scanners like Nessus enumerate vulnerabilities, but the cognitive work—understanding context, chaining exploits, adapting to defensive responses—still requires skilled humans spending dozens of hours per assessment. This creates a capacity bottleneck: security teams can't test fast enough to match development velocity, leading to the familiar pattern of annual pentests that snapshot security posture rather than continuously validating it.

The recent explosion in LLM capabilities suggested a solution: AI agents that could reason about security contexts and make tactical decisions. But early experiments with ChatGPT-wrapper security tools hit a wall. Single-agent systems lacked reliability for complex multi-step attacks, hallucinated commands, and couldn't learn from experience. PentAGI tackles these fundamental limitations through a multi-agent architecture with persistent memory, role specialization, and a knowledge graph that builds semantic understanding across engagements. Built in Go with comprehensive observability, it's designed as production infrastructure rather than a research demo—acknowledging that autonomous offensive security tools need bulletproof monitoring and containment.

Technical Insight

PentAGI's architecture approaches autonomous pentesting through three key design choices: role-specialized agent delegation, semantic memory via a knowledge graph, and sandboxed execution with comprehensive observability.

The multi-agent system decomposes complex penetration tests into specialized roles. A coordinator agent analyzes the target and spawns specialized sub-agents—research agents for reconnaissance, development agents for exploit crafting, infrastructure agents for payload delivery. This delegation pattern allows smaller, cheaper LLMs to handle focused tasks reliably rather than expecting a single model to reason about an entire attack chain. The system supports optional execution monitoring where a separate agent reviews planned commands before execution, providing a safety layer that catches hallucinated or dangerous operations.
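
The delegation-plus-monitoring pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not PentAGI's implementation: `Subtask`, `scope_reviewer`, and `coordinate` are hypothetical names, the role handlers are stubs standing in for LLM-backed agents, and the reviewer is a deliberately naive substring check rather than a second model.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical types for illustration only; these are not PentAGI's own.
@dataclass
class Subtask:
    role: str       # e.g. "research", "development", "infrastructure"
    command: str    # the command a sub-agent proposes to run

# A reviewer receives a proposed command and returns True if it may run.
Reviewer = Callable[[str], bool]


def scope_reviewer(allowed_hosts: set[str]) -> Reviewer:
    """Build a naive monitor: block destructive tokens, require an in-scope host."""
    blocked = ("rm -rf", "mkfs", "shutdown")

    def review(command: str) -> bool:
        if any(token in command for token in blocked):
            return False
        # Substring matching is intentionally simplistic for this sketch.
        return any(host in command for host in allowed_hosts)

    return review


def coordinate(
    subtasks: list[Subtask],
    handlers: dict[str, Callable[[str], str]],
    review: Reviewer,
) -> list[tuple[str, str]]:
    """Route each subtask to its role handler, skipping anything the monitor rejects."""
    results = []
    for task in subtasks:
        if task.role not in handlers:
            results.append((task.role, "skipped: no agent for role"))
        elif not review(task.command):
            results.append((task.role, "rejected by monitor"))
        else:
            results.append((task.role, handlers[task.role](task.command)))
    return results


if __name__ == "__main__":
    # Stub handlers stand in for role-specific, LLM-backed agents.
    handlers = {"research": lambda cmd: f"ran: {cmd}"}
    plan = [
        Subtask("research", "nmap -sV example-app.com"),
        Subtask("research", "nmap -sV out-of-scope.test"),
    ]
    for role, outcome in coordinate(plan, handlers, scope_reviewer({"example-app.com"})):
        print(role, "->", outcome)
```

The point of the split is that the reviewer sits between planning and execution, so a hallucinated or out-of-scope command is stopped before any tool runs.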

Here's how you'd configure the agent delegation in PentAGI's GraphQL API:

mutation CreatePentestTask {
  createTask(input: {
    target: "https://example-app.com"
    scope: ["*.example-app.com", "api.example-app.com"]
    agentConfig: {
      model: "gpt-4"
      delegationEnabled: true
      executionMonitoring: true
      specializedAgents: [
        { role: "research", model: "gpt-3.5-turbo" }
        { role: "development", model: "claude-sonnet-3.5" }
        { role: "infrastructure", model: "gpt-4" }
      ]
    }
    toolsEnabled: ["nmap", "burpsuite", "sqlmap", "metasploit"]
    maxDuration: 7200
  }) {
    taskId
    status
    initialPlan {
      phases
      estimatedDuration
    }
  }
}

The knowledge graph integration distinguishes PentAGI from stateless AI security tools. Every command executed, output observed, and relationship discovered gets stored in Neo4j using the Graphiti framework. This isn't simple logging—it builds semantic relationships between entities. When the system discovers that a specific web framework version correlates with SQL injection patterns, or that certain authentication endpoints consistently leak information, these relationships persist across engagements. Future pentests against similar infrastructure benefit from accumulated knowledge, making the system progressively more effective.
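
To make the difference between logging and relationship-building concrete, here is a toy in-memory triple store. It is only a sketch of the idea: `KnowledgeGraph`, `observe`, and `related` are hypothetical names and do not reflect the Neo4j or Graphiti APIs. Each edge remembers which engagements observed it, so repeated findings rank higher in later lookups.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy triple store; a stand-in for a graph database, not its API."""

    def __init__(self) -> None:
        # (subject, relation) -> {object: set of engagement ids that observed it}
        self._edges = defaultdict(lambda: defaultdict(set))

    def observe(self, subject: str, relation: str, obj: str, engagement: str) -> None:
        """Record that an engagement saw `subject --relation--> obj`."""
        self._edges[(subject, relation)][obj].add(engagement)

    def related(self, subject: str, relation: str) -> list[str]:
        """Objects linked to `subject`, most frequently observed first."""
        seen = self._edges.get((subject, relation), {})
        # Sort by descending observation count, then alphabetically for stability.
        return sorted(seen, key=lambda o: (-len(seen[o]), o))


if __name__ == "__main__":
    kg = KnowledgeGraph()
    kg.observe("framework-x 1.2", "vulnerable_to", "sql-injection", "engagement-1")
    kg.observe("framework-x 1.2", "vulnerable_to", "sql-injection", "engagement-2")
    kg.observe("framework-x 1.2", "vulnerable_to", "info-leak", "engagement-2")
    print(kg.related("framework-x 1.2", "vulnerable_to"))
```

A flat log would hold the same three observations, but answering "what has worked against this framework before?" would mean re-reading and re-interpreting the log on every run.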

Alongside the graph, pgvector (running in PostgreSQL) provides similarity search over command outputs, enabling the system to retrieve contextually relevant past experiences. When an agent encounters an unfamiliar service, it queries this memory for similar past encounters and their successful exploitation paths. This turns penetration testing from stateless execution into cumulative learning, arguably the most significant architectural advantage over traditional tools.
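
Similarity search of this kind boils down to ranking stored embedding vectors by their closeness to a query vector. The sketch below does that in pure Python with cosine similarity; pgvector exposes the same notion through its cosine distance operator (`<=>`), but the function names here (`cosine_similarity`, `top_k`) and the three-dimensional toy vectors are assumptions for illustration, not PentAGI code or real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], memories: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the text of the k stored memories most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


if __name__ == "__main__":
    # Toy vectors; a real system would store model-generated embeddings.
    memories = [
        ("ssh banner from an earlier scan", [1.0, 0.0, 0.0]),
        ("http response headers", [0.0, 1.0, 0.0]),
        ("https response headers", [0.0, 0.9, 0.1]),
    ]
    print(top_k([0.0, 1.0, 0.0], memories, k=2))
```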

Execution happens in isolated Docker containers, each providing a complete security toolkit environment. The system includes 20+ pre-configured professional tools: Nmap, Burp Suite, SQLMap, Metasploit, Nikto, and specialized utilities for web scraping and OSINT. Agents don't just prompt-engineer security commands; they execute real tools and parse actual output. This grounds the AI in observed results and reduces the risk that a model invents plausible-sounding but entirely fictional vulnerability details.
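
Parsing actual tool output is what keeps findings tied to evidence. As a rough illustration, the following sketch extracts port rows from Nmap's normal text output into structured records. `PortFinding` and `parse_nmap_ports` are hypothetical names, the sample text is hand-written rather than captured from a scan, and a production parser would more likely consume Nmap's XML output.

```python
import re
from dataclasses import dataclass

# Matches rows such as "22/tcp   open  ssh     OpenSSH 8.9p1".
PORT_LINE = re.compile(r"^(\d+)/(tcp|udp)\s+(\S+)\s+(\S+)\s*(.*)$")


@dataclass(frozen=True)
class PortFinding:
    port: int
    protocol: str
    state: str
    service: str
    version: str  # empty string when Nmap printed no version column


def parse_nmap_ports(output: str) -> list[PortFinding]:
    """Collect port rows from Nmap's normal output, ignoring every other line."""
    findings = []
    for line in output.splitlines():
        match = PORT_LINE.match(line.strip())
        if match:
            port, proto, state, service, version = match.groups()
            findings.append(PortFinding(int(port), proto, state, service, version.strip()))
    return findings


if __name__ == "__main__":
    # Hand-written sample in the shape of Nmap's port table.
    sample = """\
PORT     STATE  SERVICE VERSION
22/tcp   open   ssh     OpenSSH 8.9p1
443/tcp  closed https
"""
    for finding in parse_nmap_ports(sample):
        print(finding)
```

An agent that reasons over `PortFinding` records can only cite ports that a tool actually reported, which is the grounding effect the paragraph above describes.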

The observability stack is production-grade: Langfuse tracks every LLM interaction with token counts and latency, OpenTelemetry exports distributed traces to Jaeger, Prometheus collects metrics, and Loki aggregates logs. For a system performing autonomous offensive security operations, this monitoring density isn't optional—it's essential for understanding what agents decided, why they made specific choices, and catching potentially dangerous operations before execution. The Go implementation provides excellent performance characteristics for orchestrating multiple concurrent agents while maintaining low resource overhead for the control plane (though LLM inference and security tools obviously consume significant compute).
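
At its core, per-interaction tracing means wrapping every model call and recording who made it, how large it was, and how long it took. The sketch below shows that shape in plain Python. `Tracer` and `LLMCallRecord` are hypothetical names, whitespace splitting is a crude stand-in for a real tokenizer, and nothing here uses the Langfuse or OpenTelemetry SDKs.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LLMCallRecord:
    agent: str
    prompt_tokens: int
    completion_tokens: int
    latency_s: float


@dataclass
class Tracer:
    records: list[LLMCallRecord] = field(default_factory=list)

    def traced(self, agent: str, call: Callable[[str], str], prompt: str) -> str:
        """Invoke `call(prompt)` and store token counts and wall-clock latency."""
        start = time.perf_counter()
        reply = call(prompt)
        elapsed = time.perf_counter() - start
        # Whitespace-separated words approximate tokens for this sketch only.
        self.records.append(
            LLMCallRecord(agent, len(prompt.split()), len(reply.split()), elapsed)
        )
        return reply

    def total_tokens(self) -> int:
        """Sum of prompt and completion tokens across all recorded calls."""
        return sum(r.prompt_tokens + r.completion_tokens for r in self.records)


if __name__ == "__main__":
    tracer = Tracer()
    # A stub model so that the example runs without network access.
    tracer.traced("research", lambda p: "open ports: 22 80", "scan the target host")
    print(tracer.records[0].agent, tracer.total_tokens())
```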

PentAGI abstracts LLM providers behind a unified interface that works with OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, Ollama, and custom vLLM endpoints. The repository includes detailed guides for self-hosting inference using vLLM with Qwen 3.5-27B-FP8, enabling completely air-gapped deployments. This flexibility allows cost optimization strategies such as using GPT-3.5-Turbo for routine reconnaissance while reserving GPT-4 for complex reasoning tasks, or running everything locally on quantized models when data sovereignty requirements prohibit external API calls.
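
A provider abstraction with cost-aware routing might look roughly like this. It is a minimal sketch under stated assumptions: `Provider`, `EchoProvider`, and `Router` are hypothetical names, the task kinds are invented for the example, and the stub provider simply echoes its prompt instead of calling any real API.

```python
from typing import Protocol


class Provider(Protocol):
    """Anything that can complete a prompt, regardless of vendor."""
    name: str

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stub provider so that the sketch runs offline; a real one would call an API."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class Router:
    """Send each task kind to its configured provider, else fall back to the default."""

    def __init__(self, routes: dict[str, Provider], default: Provider) -> None:
        self._routes = routes
        self._default = default

    def complete(self, task_kind: str, prompt: str) -> str:
        provider = self._routes.get(task_kind, self._default)
        return provider.complete(prompt)


if __name__ == "__main__":
    router = Router(
        routes={"recon": EchoProvider("cheap-model")},
        default=EchoProvider("frontier-model"),
    )
    print(router.complete("recon", "list subdomains"))
    print(router.complete("exploit-chain", "plan the next step"))
```

Because callers only depend on `complete`, swapping a hosted model for a local vLLM endpoint becomes a configuration change rather than a code change.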

Gotcha

The infrastructure requirements are substantial and non-negotiable. You need PostgreSQL with the pgvector extension, Neo4j for the knowledge graph, a container platform (Docker or Kubernetes), and the complete observability stack. The documentation guides you through deployment, but expect to provision meaningful compute resources, especially if self-hosting LLM inference via vLLM. This isn't a tool you pip install and run locally; it's a distributed system that requires operational expertise to maintain. For teams without existing container infrastructure or database administration capabilities, the operational burden may outweigh the autonomous testing benefits.

The effectiveness ceiling depends largely on LLM quality, and there's an uncomfortable cost-capability tradeoff. Smaller models like Llama or Mistral struggle with the complex reasoning required for novel attack chains, even with the multi-agent architecture's task decomposition. They work acceptably for routine reconnaissance following known patterns the knowledge graph has seen before, but creative exploitation of zero-day vulnerabilities or novel configurations still demands frontier models like GPT-4 or Claude 3.5 Sonnet. At scale, those API costs accumulate quickly: a comprehensive pentest might consume thousands of model calls.

Meanwhile, truly autonomous operation remains aspirational: you still need experienced security engineers reviewing findings, validating vulnerabilities, and keeping false positives from triggering incident response. The system augments human expertise rather than replacing it, despite the 'autonomous' framing. Compliance and legal considerations also loom large; many organizations prohibit autonomous AI tools in offensive security contexts, and explaining to auditors that an AI decided to probe production systems rarely goes smoothly.

Verdict

Use if: You're running continuous security testing programs at scale where AI augmentation can offload repetitive reconnaissance work while security engineers focus on novel attack research and validation. You have infrastructure teams capable of operating distributed systems with databases, container orchestration, and observability stacks. You want knowledge retention across penetration tests to build institutional memory about your infrastructure's vulnerability patterns. You're comfortable with the regulatory and operational implications of autonomous security tooling and have processes for human oversight of AI-generated findings. Or you're researching multi-agent AI architectures with complex reasoning requirements and need a production-grade reference implementation beyond typical demos.

Skip if: You need simple one-off penetration tests where traditional tools like Metasploit provide faster results without operational overhead. You lack the infrastructure resources or expertise to run PostgreSQL, Neo4j, container platforms, and monitoring stacks reliably. Your compliance requirements prohibit autonomous AI in offensive security contexts, or you can't accept the legal risk of AI agents making autonomous attack decisions. You're expecting fully autonomous operation without security engineering oversight: the system augments rather than replaces human expertise, and validation remains essential.