
Building Security-Aware AI Assistants with VirusTotal and the Model Context Protocol

Hook

What if your AI assistant could check whether a suspicious link is malicious, look up file hashes against 70+ antivirus engines, and pivot through domain relationships, all through natural language conversation?

Context

The rise of AI coding assistants like Claude, ChatGPT, and Cursor has transformed how developers work, but these tools remain isolated from critical security infrastructure. When analyzing potentially malicious code, investigating suspicious domains, or performing threat intelligence research, developers still context-switch between their AI assistant and external security tools like VirusTotal. This friction is particularly painful during incident response or security reviews, where time matters and maintaining investigation context is crucial.

The Model Context Protocol (MCP), introduced by Anthropic, aims to solve this by providing a standardized way for AI assistants to interact with external data sources and tools. Think of it as a universal adapter: instead of every AI assistant needing custom integrations with every security tool, MCP servers act as translators that expose tool capabilities in a format any MCP-compatible client can understand. The virustotal-mcp project implements this bridge specifically for VirusTotal, enabling AI assistants to perform security analysis queries without leaving the conversation interface.

Technical Insight

At its core, virustotal-mcp is an async Python server that maps MCP tool calls to VirusTotal API endpoints. The architecture uses aiohttp for non-blocking HTTP requests and implements the MCP server protocol to expose VirusTotal's capabilities as callable tools. The interesting design decision here is the two-tier approach: high-level "report" tools that automatically fetch related entities for quick overviews, and low-level "relationship" tools that offer granular control with pagination.

Here's how the tool registration works for the URL analysis endpoint:

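import base64

import aiohttp

# `mcp` (the MCP server instance) and VIRUSTOTAL_API_KEY are assumed to be
# defined elsewhere in the module.
@mcp.tool()
async def get_url_report(url: str) -> dict: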
    """Get comprehensive VirusTotal report for a URL
    
    Args:
        url: The URL to analyze
        
    Returns:
        Full analysis including scan results and relationships
    """
    # Encode URL as VirusTotal identifier
    url_id = base64.urlsafe_b64encode(url.encode()).decode().strip("=")
    
    # Fetch base report
    async with aiohttp.ClientSession() as session:
        headers = {"x-apikey": VIRUSTOTAL_API_KEY}
        async with session.get(
            f"https://www.virustotal.com/api/v3/urls/{url_id}",
            headers=headers
        ) as response:
            report = await response.json()
    
    # Auto-fetch related entities
    relationships = await fetch_url_relationships(url_id)
    report["relationships_summary"] = relationships
    
    return report

The @mcp.tool() decorator registers this function as an MCP tool, making it discoverable and callable by any MCP client. The AI assistant sees this as a function it can invoke when a user asks about URL safety. What's clever is the automatic relationship fetching—when you ask about a URL, you immediately get context about communicating files, downloaded samples, and associated domains without making separate queries.
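The URL identifier step in that function is easy to get wrong, so here is a minimal standalone sketch of it using only the standard library. The helper names `vt_url_id` and `url_from_vt_id` are hypothetical and not taken from the project; the encoding itself (URL-safe base64 with the `=` padding removed) mirrors the line shown above.

```python
import base64


def vt_url_id(url: str) -> str:
    """Return the unpadded URL-safe base64 form of a URL.

    Hypothetical helper: same encoding as the snippet above, using
    rstrip because base64 padding only ever appears at the end.
    """
    return base64.urlsafe_b64encode(url.encode()).decode().rstrip("=")


def url_from_vt_id(url_id: str) -> str:
    """Invert vt_url_id by restoring the stripped '=' padding."""
    padding = "=" * (-len(url_id) % 4)
    return base64.urlsafe_b64decode(url_id + padding).decode()
```

The round trip is a handy sanity check when debugging: if decoding an identifier does not give back the exact URL the user typed, the lookup is probably hitting a different record than intended.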

The relationship tools take a different approach for power users who need more control:

from typing import Optional

@mcp.tool()
async def get_url_relationships(
    url: str,
    relationship: str,
    limit: int = 10,
    cursor: Optional[str] = None
) -> dict:
    """Get specific relationship data with pagination
    
    Args:
        url: The URL to query
        relationship: Type (e.g., 'communicating_files', 'redirects_to')
        limit: Number of results (max 40)
        cursor: Pagination cursor for next page
    """
    url_id = base64.urlsafe_b64encode(url.encode()).decode().strip("=")
    
    params = {"limit": max(1, min(limit, 40))}  # clamp to the 1-40 range
    if cursor:
        params["cursor"] = cursor
    
    async with aiohttp.ClientSession() as session:
        headers = {"x-apikey": VIRUSTOTAL_API_KEY}
        async with session.get(
            f"https://www.virustotal.com/api/v3/urls/{url_id}/{relationship}",
            headers=headers,
            params=params
        ) as response:
            return await response.json()

This pagination support is critical when dealing with popular domains that might have thousands of associated samples. The cursor-based pagination lets you incrementally explore large result sets without overwhelming the API or the AI's context window.
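To make the cursor flow concrete, here is a rough sketch of a loop that walks pages until the cursor runs out or a caller-defined cap is reached. It is not taken from the project: `iter_relationship_items` and the injected `fetch_page` callable are hypothetical, and the sketch assumes each response carries its items under `data` and the next cursor under `meta.cursor`.

```python
from typing import Any, AsyncIterator, Awaitable, Callable, Optional

# Hypothetical signature: takes the current cursor (None for the first page)
# and returns one decoded JSON page.
FetchPage = Callable[[Optional[str]], Awaitable[dict]]


async def iter_relationship_items(
    fetch_page: FetchPage, max_items: int = 100
) -> AsyncIterator[Any]:
    """Yield items page by page, stopping at max_items or the last page."""
    if max_items <= 0:
        return
    cursor: Optional[str] = None
    yielded = 0
    while True:
        page = await fetch_page(cursor)
        for item in page.get("data", []):
            yield item
            yielded += 1
            if yielded >= max_items:
                return  # stop before requesting another page
        # Assumption: the next cursor lives under meta.cursor and is
        # absent or empty on the final page.
        cursor = page.get("meta", {}).get("cursor")
        if not cursor:
            return
```

Capping by item count rather than page count keeps both the API quota and the amount of text handed back to the model predictable.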

The project supports all major VirusTotal entity types—files (by hash), URLs, IP addresses, and domains—with relationship mapping across 50+ relationship types. For files, you can query execution parents, bundled files, contacted domains, and more. For domains, you can explore subdomains, resolutions, communicating files, and referrer files. This comprehensive coverage means an AI assistant can perform multi-hop threat intelligence queries: "Check this file hash, then analyze all domains it contacted, then show me other files communicating with those domains."
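As an illustration of what such a multi-hop query amounts to, the following sketch pivots from a file hash to its contacted domains and then to the files communicating with each domain. Everything here is an assumption for illustration: `pivot_from_file` and the injected `lookup` callable are hypothetical stand-ins for the relationship tools, not the project's API.

```python
from typing import Awaitable, Callable, Dict, List

# Hypothetical lookup: (entity_id, relationship_name) -> list of related ids.
Lookup = Callable[[str, str], Awaitable[List[str]]]


async def pivot_from_file(
    file_hash: str, lookup: Lookup, max_domains: int = 5
) -> Dict[str, List[str]]:
    """Map each domain contacted by a file to the files that talk to it."""
    domains = await lookup(file_hash, "contacted_domains")
    related: Dict[str, List[str]] = {}
    # Requests are made one at a time on purpose: fanning out in parallel
    # would exhaust a small per-minute quota almost immediately.
    for domain in domains[:max_domains]:
        related[domain] = await lookup(domain, "communicating_files")
    return related
```

The `max_domains` cap matters in practice, since each hop multiplies the number of API calls.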

Perhaps the most powerful feature is VT Intelligence search integration, which exposes VirusTotal's advanced query language through the MCP interface. Security researchers can use complex queries like type:peexe size:90kb+ positives:5+ to find specific malware samples, and the AI assistant handles the API interaction, result parsing, and follow-up queries based on what it finds.
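One practical detail with queries like the one above is that the `+` and `:` characters need percent-encoding when the request URL is assembled by hand. The sketch below shows this with the standard library only. The function name `build_search_url` is hypothetical, and the endpoint path reflects my understanding of the public v3 API rather than anything confirmed from the project's source.

```python
from urllib.parse import urlencode

# Assumption: VT Intelligence search endpoint in the public v3 API.
VT_SEARCH_ENDPOINT = "https://www.virustotal.com/api/v3/intelligence/search"


def build_search_url(query: str, limit: int = 10) -> str:
    """Return a search URL with the query string safely percent-encoded."""
    # urlencode escapes '+' and ':' so that a modifier such as 'size:90kb+'
    # is not decoded by the server as 'size:90kb ' (plus read as a space).
    return f"{VT_SEARCH_ENDPOINT}?{urlencode({'query': query, 'limit': limit})}"
```

HTTP clients that accept a separate `params` mapping normally handle this encoding themselves, so the helper mostly matters for logging or for building links to share.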

Gotcha

The documentation gaps are significant and will frustrate anyone trying to deploy this beyond local experimentation. The Quick Start section is literally marked "TBD," manual installation instructions are incomplete, and the requirements section cuts off mid-sentence. For a security tool that requires API key configuration and will be querying threat intelligence data, this lack of operational guidance is concerning. You'll need to read the source code to understand configuration options, error handling behavior, and rate limiting strategies.

More critically, the project doesn't document how it handles VirusTotal API rate limits, which are strict on free-tier keys (4 requests per minute, 500 per day). The code shown above has no visible retry logic, exponential backoff, or quota management. In practice, your AI assistant might start failing mid-investigation once a limit is hit, with no graceful degradation. The server also doesn't appear to cache responses, so repeated queries for the same entity will burn through your API quota quickly. Production use would require forking the project and adding these reliability layers yourself, which is feasible for experienced developers but not turnkey.
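For a sense of what those reliability layers might look like, here is a minimal sketch of a sliding-window limiter sized for the per-minute quota quoted above, plus a small TTL cache. Neither class exists in the project; `SlidingWindowLimiter`, `TTLCache`, and the injectable `clock` parameter are hypothetical names chosen for illustration, and the daily quota is not handled.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, Optional, Tuple


class SlidingWindowLimiter:
    """Allow at most max_calls within any window of `period` seconds."""

    def __init__(self, max_calls: int = 4, period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self._calls: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self) -> None:
        """Note that a request is being sent now."""
        self._calls.append(self.clock())


class TTLCache:
    """Keep responses for `ttl` seconds so repeat lookups cost no quota."""

    def __init__(self, ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock
        self._store: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def put(self, key: str, value: dict) -> None:
        self._store[key] = (self.clock() + self.ttl, value)
```

A tool function could check the cache first, then `await asyncio.sleep(limiter.wait_time())` and call `limiter.record()` right before issuing the HTTP request. Telling the assistant how long the wait will be is usually friendlier than failing silently mid-investigation.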

Verdict

Use if: You're building MCP-based applications that need VirusTotal integration and you're comfortable reading source code to fill documentation gaps. This is particularly compelling for Claude Desktop users who want security analysis in their AI workflow and have VirusTotal API access with reasonable quota limits. The two-tier tool design is genuinely thoughtful for balancing quick queries and deep investigations.

Skip if: You need production-grade reliability, comprehensive error handling, or high-volume query support. The project's early-stage maturity (3 stars, incomplete docs) means you'll be doing significant wrapper development yourself. For direct API integration without MCP, use the official vt-py client instead. If you're building production security tooling, you'll want to fork this as a starting point and add caching, rate limiting, and proper error handling before deployment.
