
AgentSonar: Detecting Shadow AI with Traffic Heuristics Instead of Domain Blocklists


Hook

Your employees are feeding proprietary code to ChatGPT right now, and your firewall has no idea because it’s just HTTPS traffic to another cloud API. Traditional network security sees what you permit; AgentSonar finds what you didn’t know to block.

Context

The shadow IT problem has evolved. A decade ago, security teams worried about unauthorized Dropbox accounts and personal Gmail usage. Today, the risk vector is AI agents—developers pasting code into Claude for refactoring, product managers feeding customer data to ChatGPT for analysis, finance teams uploading spreadsheets to AI tools for summarization. These interactions create immediate compliance violations, intellectual property leakage, and regulatory exposure.

Traditional approaches fail because they’re reactive. Maintaining domain blocklists is a losing game when new AI tools launch weekly. DNS filtering breaks once clients resolve over DNS-over-HTTPS (DoH) or DNS-over-TLS (DoT). TLS inspection creates its own security risks and performance overhead. DLP solutions focus on file transfers, not the conversational, streaming nature of LLM APIs. You need a system that understands what AI traffic looks like rather than where it goes: a behavioral approach instead of a signature-based one. That’s the gap AgentSonar fills.

Technical Insight

[System architecture — auto-generated diagram. Raw network packets enter via packet capture (libpcap); a connection tracker records socket metadata, and a process correlator (/proc or lsof) builds process-domain pairs. Known patterns hit the explicit agent definitions (blocklist check), noise is dropped by the ignore-list filter, and unknown traffic goes to the heuristic classifier, which evaluates byte asymmetry, packet patterns, and a streaming score. The scoring engine writes matched agents and scored events to the event database, which feeds query/export and interactive triage in the UI/reporting layer; filtered traffic is discarded.]

AgentSonar’s architecture combines three detection layers: explicit agent definitions for known tools, noise filtering for false positives, and a heuristic classifier that scores unknown traffic based on behavioral patterns characteristic of LLM interactions.

The core insight is that LLM API traffic has a distinctive signature. When you send a prompt to GPT-4 or Claude, you transmit a small request (typically <2KB) and receive a large, streaming response chunked over multiple packets. This creates measurable asymmetry in byte counts and packet distribution. AgentSonar captures this with libpcap, correlates connections to processes via /proc filesystem lookups (Linux) or lsof calls (macOS), and scores each process-domain pair.
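On Linux, the /proc correlation step works by matching a socket’s inode from /proc/net/tcp against the socket symlinks under /proc/&lt;pid&gt;/fd. A minimal sketch of the parsing half of that lookup, assuming the standard kernel format (illustrative only, not AgentSonar’s actual code):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// parseProcTCPLine decodes one entry of /proc/net/tcp: the remote IP is
// little-endian hex, the port is hex, and the socket inode is the tenth
// whitespace-separated field. Matching that inode against the symlinks
// under /proc/<pid>/fd is what ties a socket back to a process.
func parseProcTCPLine(line string) (remote string, inode uint64, err error) {
	fields := strings.Fields(line)
	if len(fields) < 10 {
		return "", 0, fmt.Errorf("unexpected /proc/net/tcp line: %q", line)
	}
	addrPort := strings.Split(fields[2], ":") // rem_address, e.g. 0100007F:01BB
	if len(addrPort) != 2 {
		return "", 0, fmt.Errorf("bad address field: %q", fields[2])
	}
	rawIP, err := strconv.ParseUint(addrPort[0], 16, 32)
	if err != nil {
		return "", 0, err
	}
	ip := make(net.IP, 4)
	binary.LittleEndian.PutUint32(ip, uint32(rawIP))
	port, err := strconv.ParseUint(addrPort[1], 16, 16)
	if err != nil {
		return "", 0, err
	}
	inode, err = strconv.ParseUint(fields[9], 10, 64)
	if err != nil {
		return "", 0, err
	}
	return fmt.Sprintf("%s:%d", ip, port), inode, nil
}

func main() {
	// A fabricated entry: remote 0100007F:01BB is 127.0.0.1:443, inode 12345.
	line := "   1: 0100007F:A6C4 0100007F:01BB 01 00000000:00000000 00:00000000 00000000  1000        0 12345 1 0000000000000000 20 4 30 10 -1"
	remote, inode, err := parseProcTCPLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(remote, inode) // 127.0.0.1:443 12345
}
```

This is also where the fragility discussed later comes from: the inode-to-fd walk only sees processes in the same PID namespace.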

Here’s how the heuristic classifier evaluates a connection:

// Simplified from AgentSonar's classifier logic. Connection aggregates
// per-flow counters; avgPacketSize and isAsymmetricStream are helpers
// elided here.
type Connection struct {
    BytesIn, BytesOut     int64
    PacketsIn, PacketsOut int64
    TLS                   bool
    ALPN                  string
}

type ConnectionScore struct {
    ByteAsymmetry   float64 // Ratio of inbound to outbound bytes
    PacketAsymmetry float64 // Ratio of inbound to outbound packets
    StreamingScore  float64 // Prevalence of small packet bursts
    TLSHandshake    bool    // SNI/ALPN indicators
    Duration        int     // Connection lifetime
}

func ClassifyConnection(conn *Connection) float64 {
    score := 0.0

    // LLM responses are typically 10-100x larger than prompts
    if conn.BytesIn > conn.BytesOut*10 {
        score += 0.3
    }

    // Streaming responses arrive in many small packets
    if conn.PacketsIn > 50 && avgPacketSize(conn) < 1500 {
        score += 0.25
    }

    // Small outbound followed by sustained inbound = likely AI
    if isAsymmetricStream(conn) {
        score += 0.25
    }

    // TLS with ALPN "h2" is common for modern AI APIs
    if conn.TLS && conn.ALPN == "h2" {
        score += 0.2
    }

    return score
}

The tool runs in daemon mode, continuously capturing packets and updating a local SQLite database. Each event records the timestamp, process name, PID, destination domain, IP, port, and the classifier’s confidence score. This creates an audit trail that security teams can query:

# Find all high-confidence AI activity from the last week
agentsonar query --since 7d --min-score 0.7

# Interactive triage mode for reviewing unknowns
agentsonar triage --status unknown
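The event fields above map naturally onto a single table. A sketch of what the SQLite schema might look like (table and column names are my assumptions, not AgentSonar’s actual schema):

```sql
-- Hypothetical schema; actual names and types may differ.
CREATE TABLE IF NOT EXISTS events (
    id      INTEGER PRIMARY KEY,
    ts      TEXT NOT NULL,  -- event timestamp
    process TEXT,           -- process name
    pid     INTEGER,
    domain  TEXT,           -- destination domain
    ip      TEXT,
    port    INTEGER,
    score   REAL,           -- classifier confidence, 0.0-1.0
    status  TEXT            -- e.g. known / unknown / ignored
);
```

Because it is plain SQLite, anything the CLI can query you can also reach with ad-hoc SQL or export into an existing reporting pipeline.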

The triage workflow is particularly clever. When AgentSonar sees traffic it can’t definitively classify (score between 0.4 and 0.7), it prompts you to review. You can mark it as a known agent, add it to the noise filter, or flag it for investigation. These decisions feed back into the local classification database, creating an organization-specific knowledge base.
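The thresholds imply a simple bucketing rule: at or above 0.7 is high-confidence, the 0.4–0.7 band is queued for human review, and anything below is treated as noise. A sketch in Go (the status names here are illustrative, not AgentSonar’s):

```go
package main

import "fmt"

// triageStatus buckets a classifier score using the thresholds described
// in the article: >= 0.7 is high-confidence AI activity, 0.4-0.7 goes to
// interactive triage, and anything lower is discarded as noise.
func triageStatus(score float64) string {
	switch {
	case score >= 0.7:
		return "flagged" // high-confidence AI activity
	case score >= 0.4:
		return "unknown" // prompts interactive review
	default:
		return "discard" // below the noise floor
	}
}

func main() {
	for _, s := range []float64{0.85, 0.55, 0.2} {
		fmt.Printf("%.2f -> %s\n", s, triageStatus(s))
	}
}
```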

For environments where process correlation fails—containerized apps, network taps, proxy scenarios—AgentSonar supports --enable-pid0 mode. This sacrifices process attribution but allows deployment at network choke points. You lose the “which application” context but gain coverage across all traffic:

# Gateway/proxy mode without process tracking
agentsonar monitor --interface eth0 --enable-pid0 --promiscuous

The architecture also supports offline analysis. You can feed packet captures or JSON-formatted connection logs via stdin, making AgentSonar composable with existing network monitoring infrastructure:

# Process existing pcap files
tcpdump -r capture.pcap -w - | agentsonar classify --format pcap

# Integrate with network flow exports
cat netflow.json | agentsonar classify --format json
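The article doesn’t document the JSON connection-log schema; a plausible per-connection record, with field names guessed from the event fields listed earlier (timestamp, process, PID, domain, IP, port) plus the counters the classifier needs, might look like:

```json
{
  "timestamp": "2025-01-15T14:32:07Z",
  "process": "python3",
  "pid": 48213,
  "domain": "api.example-llm.com",
  "ip": "203.0.113.10",
  "port": 443,
  "bytes_in": 184320,
  "bytes_out": 1764,
  "packets_in": 212,
  "packets_out": 19
}
```

Check the project’s documentation for the actual field names before wiring up a flow exporter.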

This flexibility means you can start small—run it on a few developer laptops to understand baseline AI usage—then scale to network-wide deployment as you refine your classification rules.

Gotcha

Process-to-socket correlation is fragile and breaks in common scenarios. If your developers work primarily in Docker containers, Kubernetes pods, or behind corporate proxies, AgentSonar can’t attribute traffic to specific applications. The /proc/net/tcp mappings it relies on don’t traverse container boundaries without privileged access, and proxy traffic all appears to originate from the proxy process. The --enable-pid0 fallback works but reduces value significantly—you know someone is using an AI tool, but not which team or application.

The heuristic classifier will produce false positives on legitimate streaming APIs that share traffic patterns with LLMs. Video streaming, real-time analytics dashboards, financial data feeds—anything with small requests and large, chunked responses can trigger high scores. A developer working with the YouTube API or Stripe webhooks might get flagged. You’ll need to invest time in building your noise filter, and in diverse environments, this can mean weeks of triage work before the signal-to-noise ratio becomes acceptable. There’s no free lunch: avoiding domain blocklists means accepting behavior-based false positives.

The tool also requires elevated privileges that may conflict with endpoint security policies. BPF access on macOS and CAP_NET_RAW/CAP_NET_ADMIN on Linux are non-negotiable for packet capture, and many hardened environments explicitly deny these capabilities to non-root processes. If your organization uses Endpoint Detection and Response (EDR) tools with strict process monitoring, AgentSonar’s network access might trigger alerts or be blocked entirely.

Verdict

Use AgentSonar if you need visibility into shadow AI usage in a small-to-medium organization with a mix of BYOD and flexible development environments where traditional network controls are impractical. It’s particularly valuable if you’re building a compliance program around AI usage and need evidence of what tools are actually being used, not just policy documents about what’s permitted. The heuristic approach makes it resilient to the constant churn of new AI services.

Skip it if you already have comprehensive endpoint DLP with AI detection, work primarily in containerized environments where process tracking doesn’t function, or if your organization is small enough to enforce AI usage through policy and monthly spend reviews rather than technical controls. Also skip if you have strict endpoint security requirements that prevent packet capture—fighting your EDR to run AgentSonar isn’t worth it.

For enterprises with established SIEM and network monitoring infrastructure, AgentSonar’s offline classification mode can augment existing tools, but it’s not a replacement for comprehensive network visibility.
