Building API Documentation from Network Traffic: Inside Postman's Observability CLI
Hook
What if your API documentation wrote itself by watching production traffic? Postman's Observability CLI (formerly the Akita CLI) does exactly that: it uses packet sniffing to reverse-engineer OpenAPI specs from the network layer, bypassing your application code entirely.
Context
API documentation drift is a silent killer. You launch with pristine OpenAPI specs, but six months later, the reality diverges: endpoints added during urgent hotfixes, query parameters that appeared mid-sprint, response schemas that evolved organically. Traditional solutions require developers to remember to update specs alongside code—a discipline that rarely survives contact with production incidents.
Postman's Observability CLI (originally Akita) takes a radically different approach: treat the network as the source of truth. Instead of instrumenting your code with SDKs or requiring developers to annotate endpoints, it sits at the packet level, passively watching HTTP/HTTPS traffic flow by. It reconstructs API behavior from actual usage patterns, generating OpenAPI specifications that reflect reality rather than outdated intentions. This zero-instrumentation model is particularly powerful for legacy systems, for microservices architectures where maintaining consistent documentation across dozens of services becomes untenable, and for undocumented codebases you've inherited and need to understand.
Technical Insight
The Akita CLI's architecture centers on libpcap-based packet capture through the gopacket library. When you run akita apidump, it opens a network interface in promiscuous mode, capturing raw TCP packets and reconstructing HTTP streams from fragmented data. This is fundamentally different from proxy-based tools like mitmproxy—there's no man-in-the-middle, no certificate management, and no application configuration changes required.
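To make "reconstructing HTTP streams from fragmented data" concrete, here is a minimal sketch of the idea in Python. It is not gopacket's or the CLI's implementation: the Segment type and reassemble function are hypothetical names, and the sketch assumes no partially overlapping segments and no sequence-number wraparound.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int        # sequence number of the first payload byte
    payload: bytes  # TCP payload carried by this segment


def reassemble(segments, initial_seq):
    """Return the contiguous byte stream that starts at initial_seq.

    Simplified illustration: out-of-order segments are buffered until the
    gap in front of them is filled, and full retransmissions are dropped.
    """
    pending = {}          # seq -> payload, waiting for earlier bytes
    next_seq = initial_seq
    stream = bytearray()
    for seg in segments:
        if seg.seq + len(seg.payload) <= next_seq:
            continue      # already delivered: a pure retransmission
        pending[seg.seq] = seg.payload
        # Drain every buffered segment that now lines up with the stream.
        while next_seq in pending:
            data = pending.pop(next_seq)
            stream += data
            next_seq += len(data)
    return bytes(stream)
```

Feeding it the three pieces of a request line in the wrong order, plus a duplicate, still yields the original bytes, which an HTTP parser can then consume.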
Here's a basic capture workflow:
# Capture traffic on port 8080 (this command sets no time limit; stop the capture once you have enough traffic)
akita apidump --filter "port 8080" --out my-api-trace
# Generate OpenAPI spec from captured traffic
akita apispec --trace my-api-trace --out openapi.yaml
# Compare two traces to detect behavioral drift
akita apidiff --old baseline-trace --new current-trace
Under the hood, the CLI performs multi-stage processing. First, gopacket's TCP stream reassembly reconstructs full HTTP exchanges from packets that may arrive out of order or across multiple buffers. Then HTTP parsers extract method, path, headers, and body from both requests and responses. The tool identifies patterns across multiple requests to the same endpoint, clustering similar paths (/users/123 and /users/456 become /users/{id}) and inferring parameter types.
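The path clustering step can be sketched roughly as follows. This is an illustrative heuristic, not the CLI's actual algorithm: template_path and cluster_paths are hypothetical names, and the assumption is that purely numeric or UUID-shaped segments are identifiers.

```python
import re
from collections import Counter

# Assumption for this sketch: these two shapes mark a path segment as an ID.
_INT = re.compile(r"[0-9]+")
_UUID = re.compile(r"[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}")


def template_path(path):
    """Replace segments that look like identifiers with an {id} placeholder."""
    parts = []
    for segment in path.strip("/").split("/"):
        if _INT.fullmatch(segment) or _UUID.fullmatch(segment):
            parts.append("{id}")
        else:
            parts.append(segment)
    return "/" + "/".join(parts)


def cluster_paths(paths):
    """Count observed requests per templated path."""
    return Counter(template_path(p) for p in paths)
```

With this heuristic, /users/123 and /users/456 both count toward /users/{id}, while a literal segment such as /users/me stays a separate endpoint.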
The OpenAPI generation logic is where it gets interesting. The CLI doesn't just dump raw observations—it performs statistical analysis across captured requests. If it sees /api/users?limit=10 fifty times with different numeric values for limit, it infers this is an integer query parameter. If response bodies consistently contain {"id": 123, "name": "John"} structures, it builds a schema definition. This means you need representative traffic volume; capturing three requests won't produce useful specs.
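A simplified version of that inference for query parameters might look like the sketch below. The function names and the min_samples threshold are assumptions made for illustration; the real tool's rules and thresholds are not documented here.

```python
import re
from urllib.parse import parse_qsl, urlsplit

_INTEGER = re.compile(r"-?[0-9]+")
_NUMBER = re.compile(r"-?[0-9]+(?:\.[0-9]+)?")


def infer_scalar_type(values):
    """Pick the narrowest OpenAPI type that fits every observed value."""
    if all(_INTEGER.fullmatch(v) for v in values):
        return "integer"
    if all(_NUMBER.fullmatch(v) for v in values):
        return "number"
    if all(v in ("true", "false") for v in values):
        return "boolean"
    return "string"


def infer_query_params(urls, min_samples=5):
    """Map each query parameter to an inferred type.

    Parameters seen fewer than min_samples times are left out, mirroring
    the point that a handful of requests is not enough evidence.
    """
    observed = {}
    for url in urls:
        query = urlsplit(url).query
        for name, value in parse_qsl(query, keep_blank_values=True):
            observed.setdefault(name, []).append(value)
    return {
        name: infer_scalar_type(values)
        for name, values in observed.items()
        if len(values) >= min_samples
    }
```

Note that a single non-numeric value for limit is enough to demote it to string in this sketch; a production system would likely tolerate some noise.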
For HTTPS traffic, the situation is more nuanced:
# Option 1: Capture at your service before TLS termination
akita apidump --interfaces lo --filter "port 8080"
# Option 2: Provide TLS keys for decryption
akita apidump --filter "port 443" --tls-key-log /path/to/sslkeylog.txt
Most production deployments terminate TLS at a load balancer or reverse proxy, so capturing unencrypted traffic between the load balancer and your application service is the practical approach. The TLS key log option works for local development but requires configuring your application to export session keys, which isn't suitable for production.
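As one example of what "exporting session keys" involves on the application side, a Python client can ask its TLS context to write session secrets to a file in the NSS key log format that tools such as Wireshark read. This is a sketch of the application-side change only; make_keylogging_context is a hypothetical helper name, and whether a given capture tool accepts the resulting file is something to verify against its own documentation.

```python
import ssl


def make_keylogging_context(keylog_path):
    """Build a client TLS context that appends session secrets to keylog_path.

    Development use only: anyone who can read this file can decrypt the
    captured sessions.
    """
    context = ssl.create_default_context()
    # keylog_filename requires Python 3.8+ built against OpenSSL 1.1.1+.
    context.keylog_filename = keylog_path
    return context
```

Connections made with this context (for instance by passing it to urllib.request.urlopen via the context argument) will then log their secrets as they are negotiated.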
The apidiff command showcases the tool's behavioral analysis capabilities. It compares two traffic captures and identifies changes: new endpoints, modified response schemas, altered query parameters. This is invaluable for pre-deployment validation—capture traffic in staging with the new version, compare against production baseline, and detect unintended API contract changes before they break clients.
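The comparison itself reduces to a set difference over observed endpoints plus a schema comparison for the ones that appear on both sides. The sketch below assumes each capture has already been summarized as a dictionary keyed by (method, path template); that data shape and the diff_endpoints name are illustrative, not the CLI's internal format.

```python
def diff_endpoints(old, new):
    """Compare two {(method, path): response_schema} summaries.

    Returns the endpoints that were added, removed, or whose response
    schema differs between the two captures.
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(
        key for key in old.keys() & new.keys() if old[key] != new[key]
    )
    return {"added": added, "removed": removed, "changed": changed}
```

In a pre-deployment check, a non-empty removed or changed list is the signal to stop and look before clients see the new version.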
One architectural decision worth noting: the CLI is designed as a data collector for Postman's cloud console, not a standalone analysis suite. While you can generate OpenAPI specs locally, much of the value—endpoint discovery dashboards, automatic alerting on schema changes, collaborative API catalogs—lives in the SaaS platform. The open-source repository handles packet capture and basic parsing, but advanced features like inferring complex data types (is this string field actually a UUID? An ISO date?) remain in the proprietary binary. This hybrid model means you're adopting both a tool and a platform ecosystem.
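To give a sense of what "is this string a UUID or an ISO date?" means in practice, here is a deliberately simple pattern-based sketch. It says nothing about how the proprietary component works; the patterns, the format names borrowed from OpenAPI, and the infer_string_format name are all assumptions for illustration.

```python
import re

# Checked in order; each pattern must match the whole value.
_FORMATS = [
    ("uuid", re.compile(
        r"[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}")),
    ("date-time", re.compile(
        r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}"
        r"(?:\.[0-9]+)?(?:Z|[+-][0-9]{2}:[0-9]{2})")),
    ("email", re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")),
]


def infer_string_format(values):
    """Return an OpenAPI-style format name if every sample matches it."""
    if not values:
        return None
    for name, pattern in _FORMATS:
        if all(pattern.fullmatch(v) for v in values):
            return name
    return None  # plain string: no narrower format fits all samples
```

Even this crude version shows why sample volume matters: one malformed value among the samples drops the field back to a plain string.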
Gotcha
The packet capture approach introduces operational friction that proxy-based alternatives avoid. You need libpcap installed system-wide, which means apt-get install libpcap-dev or equivalent: simple on developer machines, more involved in containerized deployments. A Docker container runs in its own network namespace by default, so a capture started inside it sees only that container's traffic; to observe another service you'll need --net=host or a sidecar that shares the target's network namespace. Kubernetes requires privileged pods or specific capabilities (CAP_NET_RAW, CAP_NET_ADMIN). Many security teams rightfully balk at granting packet capture permissions in production.
The dual licensing model creates a capability gap. The open-source version can capture traffic and generate basic specs, but type inference is limited. It might identify a field as a string when you know it's specifically an email or timestamp. The proprietary binary includes the ML-based type detection that makes generated specs genuinely useful rather than just technically accurate. You're essentially getting a teaser of capabilities that require buying into the full Postman platform. For teams wanting self-hosted observability without external dependencies, this is a dealbreaker. Additionally, the cloud integration means your API traffic patterns—endpoints, parameters, potentially sensitive path information—leave your infrastructure, which requires careful evaluation against data governance policies.
Verdict
Use if: You're reverse-engineering undocumented APIs, validating that implementation matches specification across microservices, or need passive monitoring in development/staging environments where zero code changes is a hard requirement. It's particularly powerful when you control the deployment environment enough to handle packet capture permissions but don't want to modify application code.
Skip if: You need fully self-hosted observability without SaaS dependencies, require production-grade monitoring in containerized environments where packet capture is operationally complex, or already have comprehensive API documentation workflows that your team actually maintains. If you need the advanced type inference features, evaluate whether you're comfortable committing to Postman's platform ecosystem long-term, not just adopting an open-source CLI.