Teaching AI to Read API Docs: Inside OpenAPI MCP Server's Progressive Disclosure Pattern
Hook
Large language models can write code for APIs they've never seen before, but hand them a 50,000-line OpenAPI spec and they'll hallucinate endpoints that don't exist. The problem isn't intelligence—it's information overload.
Context
Anyone who's asked Claude or GPT-4 to help integrate a third-party API knows the dance: you paste documentation, the model confidently suggests endpoints, and half of them are wrong or deprecated. The issue compounds with modern APIs: Stripe's OpenAPI spec weighs in at over 400KB, and Twilio's approaches 1MB. Even with expanding context windows, dumping entire specifications into an LLM chat leads to confusion, hallucinations, and wasted tokens.
The traditional solution has been to manually curate the relevant parts, copying specific endpoint documentation into your prompts. This works but defeats the purpose of having an AI assistant—you're doing the discovery work yourself. What developers needed was a way for AI to explore APIs the way humans do: start broad, narrow down to what's relevant, and drill into details only when necessary. OpenAPI MCP Server implements this pattern by creating a conversational interface between AI assistants and API specifications, using the Model Context Protocol to expose API discovery as a set of tool calls the LLM can orchestrate itself.
Technical Insight
The architecture revolves around a three-tier progressive disclosure system that mirrors how experienced developers actually learn new APIs. Instead of loading everything at once, the MCP server exposes three distinct tools that the LLM can call sequentially.
First, the find_openapi_by_name tool queries the oapis.org registry—a crowdsourced collection of OpenAPI specifications for popular services. When you ask Claude "How do I create a Stripe customer?", the LLM first calls this tool to locate the correct spec. The registry search supports fuzzy matching, so "stripe," "Stripe API," or even "payment processing stripe" will find the same specification. This eliminates the need for developers to track down and provide exact spec URLs.
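How the registry actually matches names isn't covered here, so treat the following as a minimal sketch under stated assumptions: a tiny in-memory list stands in for the registry, and a simple token-overlap score stands in for fuzzy matching (which is cruder than typo-tolerant search). The entries, the `RegistryEntry` shape, and `findSpecByName` are all hypothetical.

```typescript
// Hypothetical in-memory stand-in for the registry; the real oapis.org
// index and its matching logic are not described in this article.
interface RegistryEntry {
  id: string;
  name: string;
  keywords: string[];
}

const registry: RegistryEntry[] = [
  { id: "stripe", name: "Stripe API", keywords: ["stripe", "payment", "billing"] },
  { id: "twilio", name: "Twilio API", keywords: ["twilio", "sms", "voice"] },
  { id: "github", name: "GitHub REST API", keywords: ["github", "git", "repository"] },
];

// Split a free-form query into lowercase word tokens.
function tokenize(query: string): string[] {
  return query.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Score each entry by how many query tokens match its id or keywords,
// and return the best-scoring entry, or undefined if nothing matches.
function findSpecByName(query: string): RegistryEntry | undefined {
  const tokens = tokenize(query);
  let best: RegistryEntry | undefined;
  let bestScore = 0;
  for (const entry of registry) {
    const score = tokens.filter(
      (t) => t === entry.id || entry.keywords.includes(t)
    ).length;
    if (score > bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  return best;
}
```

With this sketch, "stripe," "Stripe API," and "payment processing stripe" all resolve to the same entry, while a query with no overlapping tokens returns `undefined` instead of a guess.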
Once identified, the get_openapi_summary tool retrieves a radically simplified version of the specification. Here's where the "simple language" translation happens. Instead of returning the raw OpenAPI JSON with its nested schemas, $ref pointers, and technical jargon, the server transforms it into a high-level overview:
{
  "api_name": "Stripe API",
  "summary": "Payment processing with support for subscriptions, invoices, and customer management",
  "available_endpoints": [
    {
      "path": "/v1/customers",
      "methods": ["GET", "POST"],
      "purpose": "List or create customer records"
    },
    {
      "path": "/v1/customers/{id}",
      "methods": ["GET", "POST", "DELETE"],
      "purpose": "Retrieve, update, or delete a specific customer"
    }
  ]
}
This compressed representation gives the LLM enough information to reason about which endpoint is relevant without burning thousands of tokens on schema definitions it doesn't need yet. The model can scan dozens of endpoints quickly and identify the one or two that matter for the task at hand.
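To make the compression step concrete, here is one way such an overview could be derived from a parsed spec. This is a sketch, not the server's actual code: `summarizeSpec` and the trimmed-down `OpenAPIDoc` type are hypothetical, and it assumes each operation carries a `summary` field that can be reused as the endpoint's purpose.

```typescript
// Minimal slice of an OpenAPI document; real specs carry far more fields.
interface Operation {
  summary?: string;
  description?: string;
}
interface OpenAPIDoc {
  info: { title: string; description?: string };
  paths: Record<string, Record<string, Operation>>;
}
interface EndpointSummary {
  path: string;
  methods: string[];
  purpose: string;
}

const HTTP_METHODS = ["get", "put", "post", "delete", "patch", "head", "options"];

// Hypothetical helper: collapse a spec into the overview shape shown above.
function summarizeSpec(spec: OpenAPIDoc) {
  const available_endpoints: EndpointSummary[] = Object.entries(spec.paths).map(
    ([path, item]) => {
      // Path items can also hold non-operation keys such as "parameters",
      // so keep only real HTTP methods.
      const methods = Object.keys(item).filter((m) => HTTP_METHODS.includes(m));
      // Join the per-operation summaries into a one-line purpose.
      const purpose = methods
        .map((m) => item[m].summary ?? "")
        .filter((s) => s.length > 0)
        .join("; ");
      return { path, methods: methods.map((m) => m.toUpperCase()), purpose };
    }
  );
  return {
    api_name: spec.info.title,
    summary: spec.info.description ?? "",
    available_endpoints,
  };
}

// Usage with a tiny hand-written spec fragment (illustrative data only).
const exampleSpec: OpenAPIDoc = {
  info: { title: "Stripe API", description: "Payment processing" },
  paths: {
    "/v1/customers": {
      get: { summary: "List customers" },
      post: { summary: "Create a customer" },
    },
  },
};
const overview = summarizeSpec(exampleSpec);
```

Note that schemas, parameters, and responses never enter the output; that omission is what keeps the token cost of the overview small.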
Finally, when the LLM determines it needs detailed information about a specific endpoint (say, what parameters the POST /v1/customers endpoint accepts), it calls get_endpoint_details with the exact path and method. Only then does the server return the full parameter schemas, validation rules, response structures, and authentication requirements. This lazy-loading approach means the LLM processes only the schema for the endpoints it actually asks about, rather than the whole specification.
The implementation is deployed on Cloudflare Workers, which offers architectural advantages beyond simple hosting. Workers run at the edge, so requests are handled close to the developer and spec lookups stay low-latency across regions. The serverless model also means there's no infrastructure to manage, because the MCP server scales automatically with usage. Here's a simplified version of how the endpoint detail fetching might be implemented:
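// Simplified sketch. fetchOpenAPISpec, simplifySchema, extractAuthRequirements,
// and generateCurlExample are helpers assumed to be defined elsewhere.
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // The request body identifies the spec and the endpoint to look up
    // (specId is assumed to arrive alongside method and path)
    const { specId, method, path } = (await request.json()) as {
      specId: string;
      method: string;
      path: string;
    };

    // Fetch full spec from oapis.org
    const spec = await fetchOpenAPISpec(specId);

    // Extract just the requested endpoint
    const endpoint = spec.paths?.[path]?.[method.toLowerCase()];
    if (!endpoint) {
      return new Response(JSON.stringify({ error: "Endpoint not found" }), {
        status: 404,
        headers: { "Content-Type": "application/json" },
      });
    }

    // Transform technical OpenAPI to simple language
    const simplified = {
      description: endpoint.description,
      parameters: endpoint.parameters?.map((p) => ({
        name: p.name,
        required: p.required ?? false, // OpenAPI treats a missing flag as optional
        type: simplifySchema(p.schema),
        explanation: p.description,
      })),
      authentication: extractAuthRequirements(spec, endpoint),
      example_request: generateCurlExample(path, method, endpoint),
    };

    return new Response(JSON.stringify(simplified), {
      headers: { "Content-Type": "application/json" },
    });
  },
};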
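The handler above leans on a `simplifySchema` helper that isn't shown. As an illustration only, here is one plausible shape for it, assuming the goal is a short human-readable type label; the `SchemaObject` subset and the label formats are my own choices, and `$ref` targets are reported by name rather than resolved.

```typescript
// A small subset of JSON Schema as used in OpenAPI parameter definitions.
interface SchemaObject {
  type?: string;
  format?: string;
  enum?: string[];
  items?: SchemaObject;
  $ref?: string;
}

// Hypothetical take on the simplifySchema helper used in the handler above:
// reduce a schema to a short, human-readable type label.
function simplifySchema(schema?: SchemaObject): string {
  if (!schema) return "unknown";
  // For references, report only the final path segment (the schema name).
  if (schema.$ref) return schema.$ref.split("/").pop() ?? "unknown";
  if (schema.enum) return `one of: ${schema.enum.join(", ")}`;
  if (schema.type === "array") return `list of ${simplifySchema(schema.items)}`;
  if (schema.type && schema.format) return `${schema.type} (${schema.format})`;
  return schema.type ?? "unknown";
}
```

Anything the label doesn't carry, such as patterns, numeric bounds, or conditional requirements, is dropped at this step, which is the trade-off of returning a compact description instead of the raw schema.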
The Model Context Protocol integration is what makes this genuinely useful rather than just another API proxy. MCP defines a standard way for AI assistants to discover and invoke tools. When you configure OpenAPI MCP Server in Claude Desktop or Cursor, the client lists the server's tools and passes their names, descriptions, and argument schemas to the model. The LLM doesn't need fine-tuning or special prompting: it can follow the three-step discovery pattern because the tool definitions explain what each function does.
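For a sense of what those tool definitions look like, MCP tools are described by a name, a description, and a JSON Schema for their input. The sketch below uses the three tool names from this article, but the description wording and the argument names (`query`, `spec_id`, `path`, `method`) are illustrative assumptions, not the server's real definitions.

```typescript
// Shape of a tool descriptor as listed by an MCP server: a name, a
// description the model reads, and a JSON Schema for the arguments.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
}

// Tool names are the ones described in this article; descriptions and
// argument names are assumptions made for illustration.
const tools: ToolDefinition[] = [
  {
    name: "find_openapi_by_name",
    description: "Search the registry for an API by name. Call this first.",
    inputSchema: {
      type: "object",
      properties: {
        query: { type: "string", description: "API or provider name" },
      },
      required: ["query"],
    },
  },
  {
    name: "get_openapi_summary",
    description: "Return a plain-language overview of an API's endpoints.",
    inputSchema: {
      type: "object",
      properties: {
        spec_id: { type: "string", description: "Identifier returned by the search" },
      },
      required: ["spec_id"],
    },
  },
  {
    name: "get_endpoint_details",
    description: "Return parameters, responses, and auth for one endpoint.",
    inputSchema: {
      type: "object",
      properties: {
        spec_id: { type: "string", description: "Identifier returned by the search" },
        path: { type: "string", description: "Endpoint path, e.g. /v1/customers" },
        method: { type: "string", description: "HTTP method, e.g. POST" },
      },
      required: ["spec_id", "path", "method"],
    },
  },
];
```

The descriptions do the steering here: phrases like "Call this first" are what nudge a model toward the search, summary, details order without any custom prompting.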
What makes this approach particularly clever is that it offloads the complexity of API exploration to a system designed for it (the MCP server) while keeping the LLM focused on what it does best: understanding user intent and generating code. The model isn't trying to parse OpenAPI schemas while simultaneously writing TypeScript—it's asking a specialized tool for the information it needs, then using that clean data to generate accurate code.
Gotcha
The entire system has one hard dependency: oapis.org. If an API isn't registered in that directory, OpenAPI MCP Server can't help you. This is fine for popular public APIs like Stripe, Twilio, GitHub, and Shopify, which are all there. But if you're working with a corporate internal API, a newer startup's offering, or a specialized industry service, you're out of luck. The server doesn't currently support providing your own OpenAPI spec URLs directly, which feels like an obvious missing feature for enterprise adoption.
The "simple language" translation is also a double-edged sword. While it helps LLMs avoid getting lost in nested schema definitions, it necessarily abstracts away details that might matter. Subtle validation rules, format constraints, or edge cases documented in the original OpenAPI spec can get lost in translation. I've seen cases where the simplified summary indicated a parameter was optional, but the actual API rejected requests without it due to conditional validation rules that didn't survive the simplification process. For exploratory work and common use cases, this isn't a problem. But if you're implementing something complex or dealing with unusual API behavior, you'll still need to reference the original documentation.
Client compatibility is currently limited to tools that support MCP—primarily Claude Desktop and Cursor. If your team uses VS Code with GitHub Copilot, WebStorm with built-in AI, or any other development environment, you can't use this. The MCP ecosystem is growing, but it's still early days, and betting on it means accepting that your toolchain options are constrained.
Verdict
Use if: You're working in Claude Desktop or Cursor and regularly integrate third-party APIs from well-known providers. The progressive disclosure pattern genuinely improves how AI assistants help with API exploration, and the time saved on discovery and documentation reading adds up quickly. It's particularly valuable when prototyping or learning a new API where you're not sure which endpoints you need yet, since letting the LLM navigate the spec conversationally is significantly faster than manual documentation diving.
Skip if: You need to work with private APIs, internal corporate services, or anything not in the oapis.org registry. Also skip if you're using development tools outside the MCP ecosystem, or if your API integration work involves complex edge cases where you can't afford to lose the nuance of the full OpenAPI specification. For those scenarios, you're better off with traditional tools like Postman or direct OpenAPI spec reading until MCP support broadens and custom spec support gets added.