Shift: Building AI Agents That Speak HTTP

Hook

What if you could tell your proxy tool to 'bypass this WAF using polyglot XSS payloads' and watch it execute a coordinated sequence of request mutations, scope changes, and replay operations—all from a single conversational command?

Context

Web security testing has always been a discipline of repetition and pattern recognition. You craft the perfect SQL injection payload, test it against an endpoint, then spend the next hour manually adapting it across fifty similar parameters with slight variations in encoding, capitalization, or delimiters. Tools like Burp Suite and Caido provide excellent interfaces for HTTP manipulation, but they still require you to click, type, and configure each transformation manually.

Shift emerges from a simple observation: modern LLMs have gotten remarkably good at function calling—understanding intent and mapping natural language to structured API calls. While AI-powered coding assistants have transformed software development, web security testing has remained stubbornly manual. Shift bridges this gap by creating a conversational interface to Caido's capabilities, letting you describe what you want to happen rather than how to make it happen. More importantly, its micro-agent framework allows you to package testing methodologies into reusable AI agents that remember context and chain operations together.

Technical Insight

At its core, Shift implements a tool-augmented LLM architecture where the language model isn't just generating text—it's invoking specific Caido operations through a predefined function interface. The plugin exposes Caido's capabilities as callable functions that the LLM can select and execute based on natural language instructions.

The architecture follows a three-layer design: the frontend plugin interface within Caido's UI, a backend service that manages LLM communication, and a function registry that maps AI decisions to Caido operations. When you issue a command like 'add SQL injection payloads to all POST parameters,' the LLM breaks this down into discrete function calls—identifying POST requests in scope, extracting parameters, generating payloads from wordlists, and applying match-and-replace rules.

What makes Shift particularly interesting is its micro-agent framework. Rather than relying on a single general-purpose AI, you can create specialized agents with focused instructions and memory. Here's a conceptual example of how you might define a custom agent:

// Conceptual agent definition. Field and function names are illustrative,
// not Shift's actual schema.
const xssAgent = {
  name: 'XSS Hunter',
  // Instructions that focus the model on a single testing methodology.
  systemPrompt: `You are a specialized XSS exploitation agent. Your goal is to:
  1. Identify reflection points in HTTP responses
  2. Generate context-appropriate XSS payloads
  3. Apply encoding transformations based on the reflection context
  4. Test payloads systematically using Caido's Replay feature

  Always consider: HTML context, JavaScript context, attribute context, and URL context.
  Use polyglot payloads when context is ambiguous.`,

  // Allow-list of registry functions this agent may call.
  availableFunctions: [
    'analyzeResponse',
    'generatePayloads',
    'applyEncoding',
    'createMatchReplace',
    'sendToReplay'
  ],

  // Per-agent state carried across turns.
  memory: {
    testedPayloads: [] as string[],
    successfulVectors: [] as string[],
    reflectionPoints: new Map()
  }
};

This agent-based approach lets you encode domain expertise into reusable components. A WAF bypass agent might prioritize evasion techniques, while a parameter discovery agent focuses on fuzzing and analyzing response differentials.
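To make the idea of a focused agent concrete, here is a minimal sketch of how an agent's allow-list and memory could be enforced. The `MicroAgent` shape, `toolsFor`, `rememberPayload`, and the `wafAgent` definition are all hypothetical names chosen for illustration; Shift's real agent format may differ.

```typescript
// Hypothetical shapes: none of these names come from Shift's source.
interface MicroAgent {
  name: string;
  systemPrompt: string;
  availableFunctions: string[];
  memory: { testedPayloads: string[] };
}

// Return only the registry tools this agent is allowed to call.
function toolsFor(agent: MicroAgent, registry: string[]): string[] {
  const allowed = new Set(agent.availableFunctions);
  return registry.filter((tool) => allowed.has(tool));
}

// Record a payload once so the agent does not retest it.
// Returns false when the payload was already recorded.
function rememberPayload(agent: MicroAgent, payload: string): boolean {
  if (agent.memory.testedPayloads.includes(payload)) {
    return false;
  }
  agent.memory.testedPayloads.push(payload);
  return true;
}

// An illustrative WAF bypass agent limited to two tools.
const wafAgent: MicroAgent = {
  name: 'WAF Bypass',
  systemPrompt: 'Prioritize evasion techniques over payload variety.',
  availableFunctions: ['applyEncoding', 'sendToReplay'],
  memory: { testedPayloads: [] },
};
```

Filtering the tool list before the model sees it means a narrowly scoped agent cannot call a function outside its methodology, whatever the prompt says.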

The function-calling mechanism is crucial to understanding Shift's reliability. Unlike pure generative AI that might hallucinate commands or syntax, Shift constrains the LLM to a fixed set of operations defined in its action registry. When the model decides to 'add a header,' it's not generating arbitrary code; it's calling a validated function with typed parameters. The model's choice of action can still vary from run to run, but the execution layer is deterministic: any call that passes validation maps to a predictable Caido operation.
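The following sketch shows one way a constrained action registry can work. It is an assumption-laden illustration rather than Shift's code: `defineAction`, `validateAddHeader`, `registry`, and `dispatch` are hypothetical names, and the `addHeader` action simply returns a header line instead of touching a proxy.

```typescript
// Hypothetical registry sketch: action names and parameter shapes are
// illustrative and are not taken from Shift's source.
type Handler = (args: unknown) => string;

// Pair a validator with an implementation so the implementation only ever
// sees arguments that passed validation.
function defineAction<T>(
  validate: (args: unknown) => T,
  run: (args: T) => string
): Handler {
  return (args) => run(validate(args));
}

interface AddHeaderArgs {
  name: string;
  value: string;
}

// Reject anything that is not { name: non-empty string, value: string }.
function validateAddHeader(args: unknown): AddHeaderArgs {
  if (typeof args !== 'object' || args === null) {
    throw new Error('arguments must be an object');
  }
  const { name, value } = args as Record<string, unknown>;
  if (typeof name !== 'string' || name.length === 0) {
    throw new Error('name must be a non-empty string');
  }
  if (typeof value !== 'string') {
    throw new Error('value must be a string');
  }
  return { name, value };
}

const registry = new Map<string, Handler>([
  ['addHeader', defineAction(validateAddHeader, ({ name, value }) => `${name}: ${value}`)],
]);

// Dispatch a model-proposed call. Unknown names and malformed arguments throw
// instead of reaching the proxy.
function dispatch(call: { name: string; args: unknown }): string {
  const handler = registry.get(call.name);
  if (!handler) {
    throw new Error(`unknown action: ${call.name}`);
  }
  return handler(call.args);
}
```

With this structure, a call such as `dispatch({ name: 'runShell', args: {} })` fails at the lookup step, so a hallucinated action name never executes.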

The contextual awareness system deserves attention too. Shift doesn't operate in a vacuum—it maintains awareness of your current Caido session state. If you're viewing a specific HTTP request and ask 'check if this parameter is vulnerable to SQLi,' the agent automatically knows which request you're referencing, which parameters exist, and what the current scope configuration looks like. This context injection happens transparently:

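// Placeholder types for illustration; Caido's SDK defines its own shapes.
interface HttpRequest { url: string; }
type ScopeRule = unknown;
type Finding = unknown;
type Wordlist = unknown;
type ProjectInfo = unknown;

interface SessionContext {
  activeRequest?: HttpRequest;
  currentScope: ScopeRule[];
  recentFindings: Finding[];
  availableWordlists: Wordlist[];
  projectMetadata: ProjectInfo;
}

// Hypothetical helper: lists query-string parameter names, or 'None' when
// there is no active request or no query string.
function extractParams(request?: HttpRequest): string {
  if (!request) return 'None';
  const start = request.url.indexOf('?');
  const query = start === -1 ? '' : request.url.slice(start + 1);
  const names = query.split('&').filter(Boolean).map((pair) => pair.split('=')[0]);
  return names.length > 0 ? names.join(', ') : 'None';
}

// Prepend a summary of the current session state to the user's query.
function buildPromptWithContext(
  userQuery: string,
  context: SessionContext
): string {
  return `
Current Context:
- Active Request: ${context.activeRequest?.url || 'None'}
- Parameters: ${extractParams(context.activeRequest)}
- Scope: ${context.currentScope.length} rules defined
- Recent Findings: ${context.recentFindings.length} items

User Query: ${userQuery}

Execute appropriate actions using available functions.
  `;
}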

This context-aware design reduces the cognitive overhead of working with AI assistance. You don't need to reference request IDs or parameter names explicitly; the system infers your intent from the current workspace state.

The plugin's integration with Caido's Replay feature showcases practical AI-driven workflow automation. You can instruct Shift to 'test this request with all common authentication bypass techniques,' and it will generate appropriate payloads, create match-and-replace rules, send requests to Replay, execute them, and analyze responses for indicators of successful bypass—all without manual intervention between steps.
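The chained workflow described above can be sketched as a small pipeline. This is a simplified, synchronous illustration under stated assumptions: `Sender`, `ReplayResult`, `findCandidateBypasses`, and the technique labels are hypothetical, and the stub sender stands in for a real replay call, which would normally be asynchronous.

```typescript
// Hypothetical workflow sketch. `Sender` stands in for whatever actually
// replays a request; nothing here is Caido's or Shift's real API.
interface ReplayResult {
  technique: string;
  status: number;
}

type Sender = (technique: string) => ReplayResult;

// Replay each technique in order and keep the ones whose status code differs
// from the baseline (a crude indicator that still needs manual review).
function findCandidateBypasses(
  techniques: string[],
  send: Sender,
  baselineStatus: number
): ReplayResult[] {
  return techniques
    .map((technique) => send(technique))
    .filter((result) => result.status !== baselineStatus);
}

// Stub sender for demonstration: pretends one technique gets past a 403.
const stubSender: Sender = (technique) => ({
  technique,
  status: technique === 'path-case-variation' ? 200 : 403,
});

const candidates = findCandidateBypasses(
  ['trailing-slash', 'path-case-variation', 'method-override'],
  stubSender,
  403
);
```

A status-code difference is only a starting signal. The flagged responses still need a human to confirm whether access control was actually bypassed.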

Gotcha

The external backend dependency is Shift's most significant limitation. Unlike traditional Caido plugins that run entirely within your local instance, Shift requires network connectivity to a backend service that interfaces with LLM providers. This architecture decision introduces latency, creates potential privacy concerns for sensitive assessments, and means you're dependent on third-party infrastructure availability. If you're conducting security testing in air-gapped environments, high-security networks, or situations where you cannot send request metadata externally, Shift simply won't work. The opt-in telemetry system attempts to address privacy concerns, but the fundamental requirement to transmit contextual information to an external LLM service remains.

The reliability question looms large for security-critical work. While the function-calling architecture constrains the AI to valid operations, the quality of those decisions still depends on the underlying LLM's reasoning capabilities and the quality of your prompts. Complex security testing scenarios—especially those requiring deep understanding of application logic, custom authentication schemes, or nuanced business logic vulnerabilities—may produce inconsistent or incorrect results. The AI might miss edge cases, apply transformations in the wrong order, or misinterpret subtle response indicators. For exploratory testing and repetitive tasks, this is acceptable. For compliance-driven security assessments that require audit trails and deterministic behavior, you'll want traditional manual workflows or purpose-built scanning tools. Think of Shift as an intelligent assistant that accelerates your work, not a replacement for security expertise and judgment.

Verdict

Use Shift if you're an active Caido user who spends significant time on repetitive HTTP manipulation tasks—parameter fuzzing, payload encoding variations, systematic testing across similar endpoints. The micro-agent framework is particularly valuable if you're part of a security team that wants to codify testing methodologies and share techniques. It excels at accelerating the tedious parts of manual security testing while keeping you in control. Skip it if you require fully offline testing environments, need deterministic and auditable behavior for compliance-driven assessments, or aren't already invested in Caido's ecosystem. The external service dependency and AI unpredictability make it unsuitable for high-stakes security work where every action needs to be explainable and reproducible. Also skip it if you're expecting autonomous vulnerability discovery—Shift augments manual testing workflows rather than replacing them with automated scanning.
