ToolHive: Running Model Context Protocol Servers in Production Without Losing Your Mind
Hook
The Model Context Protocol promised to standardize how AI assistants access your tools and data, but nobody told you how to actually run those servers in production with security controls, audit logs, and SSO integration.
Context
The Model Context Protocol standardized how AI coding assistants integrate tools and context, replacing proprietary approaches with a simple protocol where ‘servers’ expose tools, prompts, and resources that any MCP ‘client’ (like Claude Desktop or Cursor) can consume.
But the initial ecosystem focused entirely on the developer getting-started experience—npm packages you run directly, Python scripts executed from your shell, configuration files with plaintext secrets. This works fine for solo experimentation, but enterprises immediately hit walls: How do you audit which tools each team is using? How do you enforce least-privilege access when MCP servers can request arbitrary filesystem or network permissions? How do you deploy these at scale without manually SSH-ing into machines? ToolHive, built by Stacklok, positions itself as a production-grade answer to these questions—an opinionated platform that treats MCP servers as container workloads with full lifecycle management.
Technical Insight
ToolHive’s architecture revolves around four components that work together to bridge the gap between local development and enterprise deployment. At the foundation is the Runtime, which ensures every MCP server runs in an isolated container (Docker/Podman locally, Kubernetes in production) with explicitly granted permissions. This isn’t just security theater—MCP servers written by third parties might request filesystem access, network calls, or environment variables containing secrets. By containerizing each server and requiring declarative permission grants, ToolHive gives you the same isolation guarantees you’d expect from modern cloud-native applications.
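To make the permission model concrete, here is a sketch of what a declarative permission grant might look like. The field names and structure are illustrative assumptions, not ToolHive's exact schema—the point is that access is denied by default and each capability is granted explicitly.

```yaml
# Hypothetical permission profile for an MCP server.
# Field names are illustrative, not ToolHive's actual schema.
read:
  - /workspace/docs          # mounted read-only into the container
network:
  outbound:
    allow_host:
      - api.example.com      # all other hosts denied by default
    allow_port:
      - 443
```

A server that only summarizes local documents would omit the network section entirely; a server that only calls an external API would omit the filesystem grants.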
The Gateway acts as a reverse proxy and policy enforcement point for all MCP traffic. Instead of AI clients connecting directly to individual MCP servers, they connect to the Gateway, which handles authentication (including OIDC/OAuth integration with enterprise identity providers), authorization, and routing. The Gateway can also orchestrate multiple MCP servers into a single virtual endpoint—imagine combining a database query server, a Jira integration server, and a web search server into one unified toolset that appears as a single MCP server to your AI client. Crucially, the Gateway can filter and customize tool descriptions before sending them to the AI client, reducing token usage by omitting irrelevant tools based on context.
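A hypothetical sketch of how such an aggregated endpoint might be declared—the schema below is invented for illustration and is not ToolHive's actual Gateway configuration:

```yaml
# Hypothetical Gateway config sketch; names and structure are illustrative.
# One virtual endpoint fans out to three backend MCP servers.
virtualServer:
  name: eng-toolset
  backends:
    - db-query          # registered MCP servers behind the Gateway
    - jira
    - web-search
  toolFilter:
    include:            # expose only these tools to the client,
      - run_query       # trimming token usage in the tool listing
      - search_issues
auth:
  oidc:
    issuer: https://idp.example.com   # enterprise identity provider
    clientId: mcp-gateway
```

The AI client sees a single MCP server named `eng-toolset` with a curated tool list, while the Gateway handles authentication and fan-out behind the scenes.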
Here’s roughly what a basic ToolHive deployment looks like, using the CLI to run an MCP server locally:
# Install the ToolHive CLI (download from toolhive.dev/download)
# Run an MCP server from the curated registry
toolhive run <server-name>
# Or run a custom server from a container image
toolhive run --image ghcr.io/your-org/custom-mcp:latest \
  --secret API_KEY=env:MY_API_KEY \
  --allow-network api.example.com
# List running servers
toolhive list
# ToolHive auto-configures popular AI clients like Cursor
The same conceptual model scales to Kubernetes with the ToolHive Operator, which introduces custom resource definitions for declarative management:
apiVersion: toolhive.stacklok.com/v1alpha1
kind: MCPServer
metadata:
  name: github-integration
  namespace: engineering-tools
spec:
  image: ghcr.io/stacklok/mcp-github:latest
  permissions:
    network:
      - api.github.com
  secrets:
    - name: github-token
      key: token
  resources:
    limits:
      memory: 256Mi
      cpu: 200m
This enables GitOps workflows where your infrastructure team defines which MCP servers are available, what permissions they have, and how they’re deployed—all in version-controlled YAML.
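For instance, a GitOps controller such as Argo CD can reconcile a Git directory of MCPServer manifests into the cluster, so adding or removing a server is just a pull request. The repository URL, paths, and namespaces below are placeholders:

```yaml
# Example Argo CD Application syncing a directory of MCPServer manifests.
# Repo URL, path, and namespaces are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: mcp-servers
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/mcp-infrastructure
    targetRevision: main
    path: servers/           # directory of MCPServer YAML, reviewed via PRs
  destination:
    server: https://kubernetes.default.svc
    namespace: engineering-tools
  syncPolicy:
    automated:
      prune: true            # deleting a manifest removes the server
```

With `prune` enabled, the Git repository is the single source of truth: a server that isn't in version control doesn't run.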
The Registry Server (maintained in a separate repository) curates a catalog of trusted MCP servers. It integrates with the official MCP registry but adds organizational customization: you can group servers by team or use case, preset configurations like API endpoints or permission scopes, and verify provenance through signature checking. This solves the discovery problem—instead of developers hunting GitHub for MCP servers and copy-pasting npm commands from README files, they browse a curated catalog with pre-configured security guardrails.
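A registry entry might look something like the following sketch—the fields are hypothetical and the image is a placeholder, but they illustrate the kind of organizational metadata a curated catalog layers on top of the upstream registry:

```yaml
# Hypothetical registry entry; fields are illustrative, not the actual
# registry schema. Pre-wires image, config presets, and provenance checks.
name: jira-integration
description: Query and update Jira issues via MCP
image: ghcr.io/your-org/mcp-jira:1.4.2   # pinned version, placeholder image
teams:
  - platform
  - support
presets:
  env:
    JIRA_BASE_URL: https://jira.internal.example.com
  permissions:
    network:
      - jira.internal.example.com
provenance:
  signatureVerification: true   # verify the image signature before install
```

A developer installing this entry gets the vetted image, the correct internal endpoint, and the minimal network grant without ever reading a README.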
Finally, the Portal provides both a desktop app and web UI for non-technical users. Knowledge workers who want to use AI tools don’t need to understand Docker commands or YAML—they click “Install” on a server from the catalog, and ToolHive handles the container orchestration, secret management, and client configuration. The Portal auto-configures popular clients like Claude Desktop, Cursor, and VS Code Server.
What makes this architecture powerful is the hybrid deployment model: the same components work locally for individual developers and in Kubernetes for enterprise deployments. A developer tests an MCP server integration locally using the CLI, then the infrastructure team deploys that exact configuration to production using the Kubernetes Operator. The Gateway’s policy engine ensures consistent security controls whether you’re running on a laptop or a hundred-node cluster.
Gotcha
The biggest limitation is that ToolHive is building production tooling for a protocol that is still young. The MCP ecosystem is maturing—the official registry lists plenty of MCP servers, but their quality and maintenance vary widely. ToolHive itself is under active development, with some components (like the Registry Server, desktop Studio app, and cloud UI) living in separate repositories with their own development cycles.
The infrastructure requirements are non-trivial for what might seem like simple tooling. Running ToolHive locally requires Docker or Podman (another daemon to manage and secure), and the enterprise deployment assumes you already have Kubernetes infrastructure. If you’re a solo developer just wanting to try MCP with Claude Desktop, the official client configuration (a simple JSON file pointing to an npm package) is far simpler. ToolHive’s value proposition is enterprise-scale management and security—it’s overkill for hobbyist use cases. The observability stack (OpenTelemetry and Prometheus integration) also assumes familiarity with these tools; if your organization isn’t already running a metrics pipeline, you’re adding operational complexity alongside ToolHive itself.
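Even the minimal observability wiring is its own artifact to maintain. As an illustration, a Prometheus scrape job for Gateway metrics might look like the sketch below—the job name, pod label, and metrics path are assumptions, since ToolHive's actual metrics endpoint may differ:

```yaml
# Example Prometheus scrape job for Gateway metrics.
# Job name, pod label, and metrics path are assumptions.
scrape_configs:
  - job_name: mcp-gateway
    kubernetes_sd_configs:
      - role: pod              # discover pods via the Kubernetes API
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: mcp-gateway     # keep only Gateway pods (assumed label)
        action: keep
    metrics_path: /metrics
```

If your organization has never run Prometheus, this config implies a server, storage, alerting rules, and dashboards—none of which is ToolHive's fault, but all of which lands on your plate.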
Verdict
Use ToolHive if you’re deploying MCP servers for multiple teams in an organization that needs audit trails, SSO integration, policy enforcement, and centralized management. It’s particularly valuable if you’re already running Kubernetes and want to treat AI tool integrations with the same GitOps discipline as other infrastructure. The platform shines when you need to answer questions like “Which teams are using which external APIs through MCP?” or “How do we rotate credentials for MCP servers without touching individual developer machines?” Skip it if you’re a solo developer experimenting with MCP, working in a small team without compliance requirements, or just want to connect Claude Desktop to a couple of npm-based servers. The overhead of running containers, configuring the Gateway, and managing the Registry isn’t justified when a simple JSON configuration file suffices. Also skip it if you’re committed to non-MCP AI tool frameworks like LangChain or if your organization can’t support the operational complexity of another distributed system.