Steel Browser: The Open-Source Browser API That Lets AI Agents See the Web
Hook
Most AI agents fail at web automation not for lack of intelligence but for lack of infrastructure: managing browser sessions, rotating proxies, and getting past bot detection take more engineering than the automation logic itself.
Context
If you've built an AI agent that needs to interact with websites, you've likely hit the infrastructure wall. Tools like Puppeteer and Playwright handle browser automation, but they leave you to solve the hard problems: managing long-running sessions across API calls, persisting authentication state, rotating proxies, defeating bot detection, and debugging headless browsers when things go wrong. Commercial services like Browserless solve these problems but lock you into proprietary platforms with usage-based pricing that scales painfully.
Steel Browser emerged from this gap—a self-hosted, open-source browser automation API designed specifically for AI agents and applications. It wraps Chrome with Puppeteer and Chrome DevTools Protocol (CDP) behind a RESTful API, handling session lifecycle, state persistence, and anti-detection out of the box. Instead of wrestling with infrastructure, developers can focus on teaching their agents to navigate websites. The project positions itself as "batteries-included," meaning the tedious operational concerns that typically consume weeks of development time are solved by default.
Technical Insight
Steel's architecture revolves around persistent browser sessions that behave like long-lived resources rather than ephemeral processes. When you create a session via the REST API, Steel spins up a dedicated Chrome instance, assigns it a unique identifier, and manages its lifecycle independently. This design maps naturally to AI agent workflows where a single task might require multiple API calls—logging into a site, navigating several pages, filling forms, and extracting data—without losing authentication state or context between steps.
Here's how you create and interact with a session:
// Requires Node 18+ (built-in fetch) and an ES module context (top-level await).
// Endpoint paths and field names are the ones used in this article; check them
// against the API reference for the Steel version you deploy.
import puppeteer from 'puppeteer-core';

const STEEL_URL = 'http://localhost:3000';
const jsonHeaders = { 'Content-Type': 'application/json' };

// Create a persistent session
const sessionRes = await fetch(`${STEEL_URL}/sessions`, {
  method: 'POST',
  headers: jsonHeaders,
  body: JSON.stringify({
    useProxy: true,
    userAgent: 'custom-agent',
    sessionTimeout: 3600000, // 1 hour, in milliseconds
  }),
});
const { sessionId, cdpUrl } = await sessionRes.json();

// Option 1: Use CDP directly with Puppeteer
const browser = await puppeteer.connect({ browserWSEndpoint: cdpUrl });
const page = await browser.newPage();
await page.goto('https://example.com');

// Option 2: Use Steel's quick actions API
const screenshot = await fetch(`${STEEL_URL}/sessions/${sessionId}/screenshot`, {
  method: 'POST',
  headers: jsonHeaders, // a JSON body needs an explicit content type
  body: JSON.stringify({ url: 'https://example.com', fullPage: true }),
});

// Option 3: Get page content as markdown for LLM processing
const markdown = await fetch(`${STEEL_URL}/sessions/${sessionId}/markdown`, {
  method: 'POST',
  headers: jsonHeaders,
  body: JSON.stringify({ url: 'https://example.com' }),
});
The multi-protocol approach is Steel's killer feature for AI applications. Your agent can use high-level APIs like /markdown to quickly extract structured content for LLM consumption, then drop down to full Puppeteer control when it needs to handle complex interactions like multi-step forms or single-page apps. This flexibility means you're not locked into Steel's abstractions—if a quick action doesn't cover your use case, you still have the full power of CDP.
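The "quick action first, CDP when needed" pattern can be sketched as a small helper. This is a minimal illustration, not part of Steel: `extractContent`, the `cdpFallback` callback, the `markdown` response field, and the `minLength` threshold are all assumptions, and the endpoint path is the one used in this article's examples.

```javascript
// Sketch: prefer the quick action, fall back to a caller-supplied CDP routine.
// Hypothetical helper; response shape and threshold are assumptions.
const STEEL_URL = 'http://localhost:3000';

async function extractContent(sessionId, url, options = {}) {
  const { fetchImpl = globalThis.fetch, cdpFallback, minLength = 200 } = options;
  try {
    const res = await fetchImpl(`${STEEL_URL}/sessions/${sessionId}/markdown`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ url }),
    });
    if (res.ok) {
      const { markdown = '' } = await res.json();
      // Heuristic (assumption): very short output suggests the page needs
      // interaction before its content is available.
      if (markdown.length >= minLength) {
        return { source: 'quick-action', content: markdown };
      }
    }
  } catch (err) {
    // Network or parsing failure: fall through to the CDP path.
  }
  if (!cdpFallback) {
    throw new Error('quick action failed and no CDP fallback was provided');
  }
  return { source: 'cdp', content: await cdpFallback(url) };
}
```

The `cdpFallback` callback is where full Puppeteer control would go, for example a function that connects with `puppeteer.connect`, drives the form or single-page app, and returns `page.content()`.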
Steel's anti-detection layer is particularly relevant for AI agents scraping modern websites. It bundles puppeteer-extra-plugin-stealth and configures Chrome flags to minimize detection signals. The system randomizes viewport sizes, manages WebGL fingerprints, and handles navigator properties that sites check for bot behavior. Combined with proxy rotation (Steel manages proxy chains automatically if you provide credentials), this makes sessions significantly harder to flag than vanilla Puppeteer instances.
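To make the viewport randomization idea concrete, here is a rough sketch of the general technique. It is not Steel's implementation: `COMMON_VIEWPORTS`, `pickViewport`, the list of sizes, and the 16-pixel jitter are all hypothetical choices made for illustration.

```javascript
// Illustrative only: this is not Steel's implementation.
// COMMON_VIEWPORTS and pickViewport are hypothetical names.
const COMMON_VIEWPORTS = [
  { width: 1920, height: 1080 },
  { width: 1536, height: 864 },
  { width: 1440, height: 900 },
  { width: 1366, height: 768 },
];

// Pick a common base size, then subtract a few pixels so that sessions
// do not all report an identical viewport.
function pickViewport(random = Math.random) {
  const base = COMMON_VIEWPORTS[Math.floor(random() * COMMON_VIEWPORTS.length)];
  const jitter = (max) => Math.floor(random() * (max + 1));
  return {
    width: base.width - jitter(16),
    height: base.height - jitter(16),
  };
}
```

With a Puppeteer connection like the one in Option 1, the result could be applied with `await page.setViewport(pickViewport())`. Passing the random source as a parameter keeps the function deterministic under test.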
The debugging experience deserves attention. Steel exposes a Chrome DevTools endpoint on port 9223 and includes a web UI for inspecting active sessions. When an AI agent's web interactions fail, and they will, you can connect Chrome DevTools to the live session, watch it navigate in real time, and inspect the exact DOM state your code is interacting with. That visibility makes failures far easier to diagnose than reading console logs from a headless browser.
Session state persistence works through explicit save/load mechanisms. You can serialize a session's cookies, localStorage, and even page state, then restore it later:
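// Save session state (export path as used in this article)
const stateRes = await fetch(`http://localhost:3000/sessions/${sessionId}/export`);
const stateData = await stateRes.json();

// Later, restore to a new session
const newSessionRes = await fetch('http://localhost:3000/sessions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ state: stateData }),
});
const { sessionId: restoredSessionId } = await newSessionRes.json();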
This enables patterns like pre-warming authenticated sessions, running parallel tasks with the same logged-in state, or recovering from crashes without re-authenticating. For AI agents handling workflows like "monitor this dashboard and alert on changes," persistent sessions mean you authenticate once and maintain access across invocations.
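The "parallel tasks with the same logged-in state" pattern might look like the sketch below. `spawnSessionsFromState` is a hypothetical helper, and the `state` request field and `sessionId` response field simply follow this article's examples, so treat them as assumptions to verify against your Steel version.

```javascript
// Sketch: fan one exported state out into several sessions.
// Hypothetical helper; request and response fields follow this article's examples.
const STEEL_URL = 'http://localhost:3000';

async function spawnSessionsFromState(stateData, count, fetchImpl = globalThis.fetch) {
  const requests = Array.from({ length: count }, async () => {
    const res = await fetchImpl(`${STEEL_URL}/sessions`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ state: stateData }),
    });
    if (!res.ok) {
      throw new Error(`session creation failed with status ${res.status}`);
    }
    const { sessionId } = await res.json();
    return sessionId;
  });
  // Resolves to an array of session IDs; rejects if any creation fails.
  return Promise.all(requests);
}
```

Since every session is a full Chrome instance, `count` is best kept small. Authenticating once, exporting the state, and then calling `spawnSessionsFromState(stateData, 3)` would give three workers that share one login.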
Gotcha
Steel is in public beta, and it shows. The API surface is stabilizing but not stable—GitHub issues reveal occasional breaking changes between minor versions. Pin your dependencies aggressively and test updates in staging before production. The documentation, while improving, still has gaps around advanced CDP usage and error handling patterns. You'll spend time reading source code to understand behavior that should be documented.
Resource consumption is the bigger concern. Each Steel session runs a full Chrome instance with its own V8 heap and renderer process. A modest workload of 10 concurrent sessions can easily consume 4-8 GB of RAM. This isn't Steel's fault (it's inherent to browser automation), but it means you'll hit scaling limits faster than with HTTP-based scraping. The Docker deployment helps with resource isolation, but if you're building an agent that needs to scrape hundreds of sites concurrently, you'll need serious infrastructure or a hybrid approach where Steel handles complex interactions and lighter tools handle simple data extraction.
Bot detection is an arms race; Steel's anti-detection works well against standard checks, but sophisticated systems (Cloudflare Turnstile, PerimeterX) can still detect automation. No tool wins this battle permanently.
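To put the memory figures above into planning terms: 4-8 GB across 10 sessions works out to roughly 400-800 MB per session. The sketch below turns that into a back-of-the-envelope capacity estimate. `estimateSessionCapacity` is a hypothetical helper, and the default reserve for the OS and the Steel API process is an assumption rather than a measured figure.

```javascript
// Per-session range derived from the estimate above (4-8 GB for 10 sessions).
const PER_SESSION_MB = { low: 400, high: 800 };

// Rough planning aid only; measure real workloads before relying on it.
// reservedMb (OS plus the API process) is an assumed default.
function estimateSessionCapacity(totalRamMb, reservedMb = 2000) {
  const usableMb = Math.max(0, totalRamMb - reservedMb);
  return {
    optimistic: Math.floor(usableMb / PER_SESSION_MB.low),
    conservative: Math.floor(usableMb / PER_SESSION_MB.high),
  };
}
```

For a host with 16,000 MB of RAM and the default reserve, this gives between 17 and 35 concurrent sessions, which shows how far "hundreds of sites concurrently" is from a single machine.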
Verdict
Use if: You're building AI agents or automation tools that need to interact with modern, JavaScript-heavy websites, you want session management and anti-detection without building custom infrastructure, you value being able to debug browser automation visually, or you need to avoid vendor lock-in to commercial browser services while still getting managed session lifecycle. Steel shines when your agents need to handle authenticated workflows, navigate SPAs, or maintain state across multiple operations.
Skip if: You're doing simple HTTP scraping (use httpx or curl), your automation needs are basic enough that raw Puppeteer suffices and you have the expertise to manage it, you require battle-tested enterprise stability (this is beta software), or you need to run hundreds of concurrent sessions on limited infrastructure (the memory footprint will crush you). Also skip if you're targeting sites with advanced bot detection and can't afford the cat-and-mouse game; sometimes the answer is partnering with the site, not automating around it.