AIMap: Scanning the Internet for Exposed AI Infrastructure Before Attackers Do

Hook

Over 10,000 publicly exposed Ollama servers are running on the internet right now, many allowing unauthenticated model inference and extraction. AIMap is the first tool built specifically to find them at scale.

Context

The AI deployment landscape has grown faster than security practices could keep up. Organizations are spinning up LLM inference servers, agent frameworks, and AI-powered APIs without the decades of hardening that traditional web infrastructure received. Unlike web servers, where security best practices are well established, AI services often run with no authentication by default: Ollama has no built-in auth and is routinely exposed by binding it to 0.0.0.0:11434 (its out-of-the-box bind address is 127.0.0.1), vLLM serves /v1/chat/completions to anyone unless an API key is configured, and Model Context Protocol (MCP) servers frequently run without transport security.

Traditional reconnaissance tools like Shodan and Censys can find these services, but they lack the domain knowledge to differentiate between a hardened production deployment and a wide-open development instance. You'd need to craft custom queries for each AI framework, manually verify which endpoints are actually exposed, then assess risk based on protocol-specific threat models. Bishop Fox's AIMap solves this by building a specialized scanning platform that understands AI infrastructure natively—it knows that an Ollama server exposing /api/tags is riskier than one that simply responds to health checks, and that an MCP server advertising 'execute_command' tools represents a critical threat.

Technical Insight

AIMap's architecture is built around an asynchronous, distributed scanning pipeline. The backend uses FastAPI with async patterns throughout, storing results in MongoDB via Motor (the async driver) while streaming real-time updates through Redis Streams to a React frontend over WebSockets. The stack itself is conventional; what makes it interesting is how these components orchestrate a complex discovery pipeline.

The reconnaissance starts with 32+ curated Shodan queries targeting AI-specific signatures. Rather than generic port scans, these queries look for protocol-specific indicators: http.html:"gradio" for Gradio UIs, ssl.cert.subject.cn:"ollama" for Ollama instances, or http.title:"vLLM" for inference servers. Once Shodan returns candidates, AIMap validates them with httpx for liveness checks, then fingerprints each service using custom Nuclei templates designed specifically for AI protocols.
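The hand-off from Shodan results to liveness checking can be sketched roughly as follows. This is a minimal illustration, not AIMap's actual code: it assumes match records carry the `ip_str` and `port` fields that Shodan search results include, and the function name `matches_to_urls` and the `TLS_PORTS` set are hypothetical choices made for the example.

```python
from typing import Iterable

# Assumption: ports treated as TLS when choosing a URL scheme.
TLS_PORTS = {443, 8443}

def matches_to_urls(matches: Iterable[dict]) -> list[str]:
    """Turn Shodan-style match records into a deduplicated, ordered URL list."""
    seen: set[str] = set()
    urls: list[str] = []
    for match in matches:
        ip, port = match.get("ip_str"), match.get("port")
        if not ip or not port:
            continue  # skip incomplete records rather than guess
        scheme = "https" if port in TLS_PORTS else "http"
        url = f"{scheme}://{ip}:{port}"
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls

# Hypothetical records using documentation-reserved IP ranges.
sample = [
    {"ip_str": "203.0.113.10", "port": 11434},
    {"ip_str": "203.0.113.10", "port": 11434},  # duplicate from a second query
    {"ip_str": "198.51.100.7", "port": 443},
    {"port": 8000},  # incomplete record
]
print(matches_to_urls(sample))
```

Deduplicating before validation matters because several curated queries can match the same host. The resulting list could then be written to a file and passed to httpx as a target list before Nuclei fingerprinting.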

Here's where it gets sophisticated—the risk scoring engine. AIMap doesn't just detect services; it evaluates risk through protocol-aware heuristics:

# Simplified risk scoring logic for MCP servers
DANGEROUS_TOOLS = ('execute_command', 'run_shell', 'file_write')

def calculate_mcp_risk(endpoint_data):
    """Return a heuristic 0-10 risk score for one MCP endpoint."""
    risk_score = 0.0

    # Base risk for unauthenticated access
    if not endpoint_data.get('requires_auth'):
        risk_score += 4.0

    # Normalize tool entries to lowercase names (plain strings or {'name': ...} dicts)
    tools = endpoint_data.get('tools', [])
    names = [str(t.get('name', '') if isinstance(t, dict) else t).lower() for t in tools]

    # Each tool matching a dangerous capability adds to the score
    for name in names:
        if any(dangerous in name for dangerous in DANGEROUS_TOOLS):
            risk_score += 2.0

    # Check for tool chaining potential
    if len(names) > 5 and any('read_file' in name for name in names):
        risk_score += 1.5  # Data exfiltration potential

    # CORS misconfigurations
    if endpoint_data.get('cors') == '*':
        risk_score += 1.0

    return min(risk_score, 10.0)

This risk model understands that an MCP server exposing execute_command without authentication is qualitatively different from a read-only Ollama model endpoint. The scoring combines multiple factors: authentication status, exposed capabilities, transport security, CORS policies, and protocol-specific threat patterns.

The attack testing framework is equally sophisticated. AIMap includes protocol-native attack modules that go beyond simple HTTP requests. For Ollama, it attempts model enumeration via /api/tags, tests for model extraction, and checks if you can push malicious models. For MCP, it interrogates the tool schema and attempts to invoke tools with crafted payloads. The system streams results in real-time through Redis Streams, allowing the frontend to display discoveries as they happen:

// React hook subscribing to scan results
import { useEffect, useState } from 'react';

const useScanStream = (scanId) => {
  const [results, setResults] = useState([]);
  
  useEffect(() => {
    const ws = new WebSocket(`wss://api.aimap.io/stream/${scanId}`);
    
    ws.onmessage = (event) => {
      const finding = JSON.parse(event.data);
      setResults(prev => [...prev, {
        ...finding,
        risk: finding.risk_score,
        protocol: finding.detected_protocol,
        timestamp: new Date()
      }]);
    };
    
    return () => ws.close();
  }, [scanId]);
  
  return results;
};
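The Ollama model-enumeration probe mentioned earlier can be sketched in Python. This is an illustrative sketch rather than AIMap's attack module: it relies on Ollama's /api/tags endpoint returning a JSON object with a `models` list whose entries have a `name`, and the helper names `parse_tags` and `list_models` are hypothetical.

```python
import json
import urllib.request

def parse_tags(body: str) -> list[str]:
    """Extract model names from an Ollama /api/tags JSON response body."""
    payload = json.loads(body)
    return [m["name"] for m in payload.get("models", []) if "name" in m]

def list_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Fetch /api/tags from a host you are authorized to test."""
    url = f"{base_url.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_tags(resp.read().decode("utf-8"))

# Hypothetical response body, parsed offline so the example needs no network.
sample_body = '{"models": [{"name": "llama3:8b", "size": 4661224676}, {"name": "mistral:latest"}]}'
print(parse_tags(sample_body))
```

Keeping parsing separate from the network call makes the probe easy to test against recorded responses. A non-empty result from an unauthenticated request is the kind of signal the risk scoring treats as an exposure.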

For scaled scanning, AIMap optionally integrates with Modal for serverless execution. This is clever architecture—local development runs scans in-process, but production deployments can fan out across Modal's infrastructure to scan thousands of targets in parallel without managing infrastructure. The code transparently switches between local and distributed execution based on configuration.
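One way to structure that kind of configuration-driven switch is a small backend interface, sketched below. This is an assumption about the general pattern, not AIMap's implementation: the `execution` and `workers` config keys and all class names are hypothetical, and a thread pool stands in for the remote fan-out so the example runs without any serverless account.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Protocol

class ScanBackend(Protocol):
    def map(self, scan: Callable[[str], dict], targets: Iterable[str]) -> list[dict]: ...

class InProcessBackend:
    """Runs scans one after another in the current process (local development)."""
    def map(self, scan, targets):
        return [scan(t) for t in targets]

class FanOutBackend:
    """Stand-in for a distributed backend; a real one would dispatch to remote workers."""
    def __init__(self, workers: int = 8):
        self.workers = workers

    def map(self, scan, targets):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            return list(pool.map(scan, targets))  # results keep input order

def get_backend(config: dict) -> ScanBackend:
    """Pick a backend from configuration; callers never branch on the mode themselves."""
    if config.get("execution") == "distributed":
        return FanOutBackend(workers=config.get("workers", 8))
    return InProcessBackend()

def fake_scan(target: str) -> dict:
    """Placeholder scan function used only for this example."""
    return {"target": target, "alive": True}

targets = ["http://203.0.113.10:11434", "http://198.51.100.7:8000"]
local = get_backend({"execution": "local"}).map(fake_scan, targets)
fanned = get_backend({"execution": "distributed", "workers": 4}).map(fake_scan, targets)
print(local == fanned)
```

Because both backends expose the same `map` method, the scanning code stays identical in either mode and only the configuration changes.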

The Nuclei integration deserves attention. AIMap ships with custom templates specifically for AI services that standard Nuclei template repositories don't cover. These templates understand protocol handshakes, API versioning, and framework-specific endpoints. For instance, detecting a LangServe deployment requires checking for the /docs OpenAPI endpoint, then validating that it exposes LangChain-specific schemas, something generic web application templates wouldn't catch.
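The second half of that LangServe check, validating the schema once an OpenAPI document has been fetched, might look like the sketch below. It is a heuristic written for illustration, not one of AIMap's templates: it assumes a LangServe route exposes invoke, batch, and stream endpoints under a common path prefix, and `looks_like_langserve` is a hypothetical name.

```python
# Assumption: endpoint suffixes a LangServe route is expected to expose together.
LANGSERVE_SUFFIXES = ("/invoke", "/batch", "/stream")

def looks_like_langserve(openapi: dict) -> bool:
    """Return True if some path prefix exposes all of the expected suffixes."""
    found_by_prefix: dict[str, set[str]] = {}
    for path in openapi.get("paths", {}):
        for suffix in LANGSERVE_SUFFIXES:
            if path.endswith(suffix):
                prefix = path[: -len(suffix)]
                found_by_prefix.setdefault(prefix, set()).add(suffix)
    return any(len(found) == len(LANGSERVE_SUFFIXES) for found in found_by_prefix.values())

# Hypothetical OpenAPI fragments: one chain-style API and one generic web API.
chain_api = {"paths": {"/chain/invoke": {}, "/chain/batch": {}, "/chain/stream": {}, "/docs": {}}}
generic_api = {"paths": {"/users": {}, "/stream": {}}}
print(looks_like_langserve(chain_api), looks_like_langserve(generic_api))
```

Requiring all three suffixes under one prefix keeps an unrelated API that happens to have a single /stream route from matching.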

Gotcha

AIMap's biggest limitation is its dependency on Shodan API access, and not just any access—the free tier's 100 query credits won't get you far. Meaningful scans require a paid subscription, and even then you're constrained by Shodan's rate limits and query credit consumption. Each of those 32+ curated queries costs credits, and if you're running comprehensive scans regularly, costs add up quickly. For individual researchers or small teams without corporate Shodan accounts, this creates a significant barrier to entry.
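A rough budget check makes the cost concrete. The sketch below assumes one query credit per page of results fetched and up to 100 results per page, which is an assumption about Shodan's billing model rather than something stated by AIMap; the per-query result counts are invented for illustration.

```python
import math

RESULTS_PER_PAGE = 100  # assumption: results returned per page

def estimate_credits(result_counts: list[int], max_pages: int = 5) -> int:
    """Estimate credits for one full run, assuming one credit per page fetched."""
    total = 0
    for count in result_counts:
        pages = max(1, math.ceil(count / RESULTS_PER_PAGE))
        total += min(pages, max_pages)  # cap how deep each query is paged
    return total

# Hypothetical result counts for 32 curated queries.
counts = [10_000, 250, 40] + [500] * 29
print(estimate_credits(counts))               # 154 with the default cap of 5 pages
print(estimate_credits(counts, max_pages=1))  # 32 when only the first page is fetched
```

Under these assumptions, even a shallow single-page pass over 32 queries uses 32 credits per run, so a 100-credit allowance covers only about three runs.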

Detection accuracy inherits Nuclei's limitations around custom deployments. If an organization runs Ollama on a non-standard port behind a reverse proxy with custom paths, AIMap's fingerprinting may miss it entirely. The tool assumes relatively standard deployments: services listening on default ports, exposing conventional API endpoints, and responding to typical health checks. Organizations that proxy creatively or run AI services in non-standard configurations may evade detection.

Additionally, the risk scoring is heuristic-based rather than exploitation-validated. A service might score 9.5/10 but sit behind robust network-level controls that make exploitation impractical, while a 3.0/10 service might have subtle misconfigurations that lead to complete compromise. The scores are useful triage guidance but are no substitute for manual security assessment of high-value targets.

Verdict

Use AIMap if you're conducting authorized penetration tests of organizations with significant AI infrastructure, performing security research that requires internet-wide visibility into AI service exposure patterns, or running red team operations where discovering unknown AI attack surface is valuable. It excels at the reconnaissance and triage phases, taking you from zero knowledge to a prioritized list of potentially vulnerable AI services faster than a manual approach would.

Skip it if you're testing known individual targets where you already have URLs and protocols (just use Nuclei and protocol-specific tools directly), conducting unauthorized research (this is explicitly a pentesting tool that requires permission), working without a Shodan API budget, or looking for deep post-exploitation capabilities beyond initial access testing. AIMap finds the doors; you still need other tools to walk through them.
