Building on Notion's Hidden API: How notionapi Unlocks Headless CMS Capabilities

Hook

While Notion's official API took years to materialize and still can't handle nested blocks properly, thousands of developers have been quietly using a reverse-engineered alternative to build production sites, migrate content, and treat Notion as a true headless CMS.

Context

Notion exploded in popularity as a workspace tool, but for years it remained a walled garden with no programmatic access. Developers who wanted to use Notion as a content management system—to pull articles for blogs, generate documentation sites, or export data—had no official path forward. When Notion finally released their public API in 2021, it arrived with severe limitations: restricted access to certain block types, incomplete nested content support, and missing read capabilities for complex page hierarchies.

This gap spawned an ecosystem of unofficial clients that reverse-engineered Notion's internal endpoints—the same ones the web application uses. The kjk/notionapi library represents one of the most successful attempts in the Go ecosystem. Created by Krzysztof Kowalczyk, who needed to power his own blog and programming books platform from Notion content, this library doesn't wait for official features. It mimics authenticated browser requests to Notion's private API, parsing their internal block-based content model into usable Go structures. The result is a battle-tested tool that handles the full complexity of Notion pages, enabling use cases like static site generation that the official API still struggles with.

Technical Insight

At its core, notionapi replays the API calls that Notion's web client makes. When you load a Notion page in your browser, the frontend sends requests to internal endpoints such as /api/v3/loadPageChunk and /api/v3/getRecordValues. The library issues the same requests, attaches the authentication token when one is configured, and parses the JSON responses into Go structs that represent Notion's block hierarchy.
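To make the idea of replaying a web-client call concrete, here is a minimal sketch that builds (but never sends) a request for the loadPageChunk endpoint using only the standard library. The payload struct is an assumed, trimmed-down shape: Notion does not document this endpoint, so the field names, the helper newLoadPageChunkRequest, and the limit of 50 are illustrative rather than a description of what the library sends.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// loadPageChunkBody is an ASSUMED payload shape for the private endpoint.
// The endpoint is undocumented, so these field names may be incomplete
// or change without notice.
type loadPageChunkBody struct {
	PageID          string `json:"pageId"`
	Limit           int    `json:"limit"`
	ChunkNumber     int    `json:"chunkNumber"`
	VerticalColumns bool   `json:"verticalColumns"`
}

// newLoadPageChunkRequest (a hypothetical helper) builds a POST request
// for /api/v3/loadPageChunk. It does not send anything over the network.
func newLoadPageChunkRequest(baseURL, pageID string) (*http.Request, error) {
	payload, err := json.Marshal(loadPageChunkBody{PageID: pageID, Limit: 50})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v3/loadPageChunk", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newLoadPageChunkRequest("https://www.notion.so", "7c3c0a5f-8e5e-4f9c-9c7e-9e9e9e9e9e9e")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

A real client would pass the request to an http.Client and decode the JSON response; that part is left out because the response format is equally undocumented.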

The library's primary entry point is the Client type, which manages authentication and request handling. Here's a basic example of fetching a page and traversing its blocks:

package main

import (
    "fmt"
    "log"

    "github.com/kjk/notionapi"
)

func main() {
    // An empty client can read public pages; set AuthToken for private ones.
    client := &notionapi.Client{}

    // Fetch a page by its ID (extracted from the Notion URL)
    pageID := "7c3c0a5f8e5e4f9c9c7e9e9e9e9e9e9e"
    page, err := client.DownloadPage(pageID)
    if err != nil {
        log.Fatal(err)
    }

    // Visit every block on the page. ForEachBlock walks the tree from the
    // root. Field names can differ between library versions, so check the
    // API docs for the release you pin.
    page.ForEachBlock(func(block *notionapi.Block) {
        switch block.Type {
        case notionapi.BlockText:
            fmt.Println("Text:", block.Title)
        case notionapi.BlockHeader:
            fmt.Println("Header:", block.Title)
        case notionapi.BlockCode:
            fmt.Println("Code:", block.Code)
            fmt.Println("Language:", block.CodeLanguage)
        case notionapi.BlockImage:
            fmt.Println("Image URL:", block.ImageURL)
        }
    })
}
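The page ID in the example comes from the page's URL. As a minimal sketch of that step, the helpers below pull the trailing 32 hex characters out of a Notion-style URL and reformat them in the dashed 8-4-4-4-12 layout. extractPageID and toDashedID are hypothetical names written for this article, and the assumption that the ID is always the last 32 hex characters of the final path segment is based on how shared page URLs typically look, not on any documented guarantee.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// hex32 matches exactly 32 lowercase hexadecimal characters.
var hex32 = regexp.MustCompile(`^[0-9a-f]{32}$`)

// extractPageID returns the 32-character ID at the end of the last path
// segment of a Notion-style URL. Query strings and fragments are ignored.
func extractPageID(notionURL string) (string, error) {
	if i := strings.IndexAny(notionURL, "?#"); i >= 0 {
		notionURL = notionURL[:i]
	}
	notionURL = strings.TrimRight(notionURL, "/")
	segment := notionURL[strings.LastIndex(notionURL, "/")+1:]

	// IDs may appear with or without dashes, so strip them before checking.
	compact := strings.ToLower(strings.ReplaceAll(segment, "-", ""))
	if len(compact) < 32 {
		return "", fmt.Errorf("no page ID found in %q", notionURL)
	}
	id := compact[len(compact)-32:]
	if !hex32.MatchString(id) {
		return "", fmt.Errorf("no page ID found in %q", notionURL)
	}
	return id, nil
}

// toDashedID formats a 32-character ID in the 8-4-4-4-12 UUID layout.
func toDashedID(id string) string {
	return id[0:8] + "-" + id[8:12] + "-" + id[12:16] + "-" + id[16:20] + "-" + id[20:32]
}

func main() {
	id, err := extractPageID("https://www.notion.so/My-Page-7c3c0a5f8e5e4f9c9c7e9e9e9e9e9e9e?v=1")
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
	fmt.Println(toDashedID(id))
}
```

Validating the ID up front turns a mistyped URL into a clear local error instead of an opaque failure from the remote endpoint.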

What makes this library particularly powerful is how it handles Notion's block-based content model. Everything in Notion is a block—paragraphs, headers, images, databases, even pages themselves. Blocks can be nested arbitrarily deep, creating parent-child relationships. The library exposes this as a graph structure where each block can have Content (child block IDs) that you traverse recursively.

The choice to index blocks by ID (BlockByID) rather than hand you only a pre-rendered tree is deliberate. It gives you the flexibility to traverse the content hierarchy however your application needs. For a static site generator, you might walk the tree depth-first to render HTML. For a content migration tool, you might flatten everything to markdown. The library doesn't impose those opinions.
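A depth-first walk over blocks indexed by ID can be sketched in a few lines. The Block type here is a simplified, hypothetical stand-in and not the library's own struct; only the idea of child IDs pointing into a shared map is taken from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// Block is a simplified, hypothetical stand-in for a Notion block.
type Block struct {
	ID         string
	Type       string
	Text       string
	ContentIDs []string // IDs of child blocks, in display order
}

// walk visits the block with the given ID and then its descendants,
// depth-first. Unknown IDs are skipped, and seen guards against cycles.
func walk(blocks map[string]*Block, id string, depth int, seen map[string]bool, visit func(b *Block, depth int)) {
	b, ok := blocks[id]
	if !ok || seen[id] {
		return
	}
	seen[id] = true
	visit(b, depth)
	for _, childID := range b.ContentIDs {
		walk(blocks, childID, depth+1, seen, visit)
	}
}

// outline renders the hierarchy as an indented plain-text outline.
func outline(blocks map[string]*Block, rootID string) string {
	var sb strings.Builder
	walk(blocks, rootID, 0, map[string]bool{}, func(b *Block, depth int) {
		sb.WriteString(strings.Repeat("  ", depth) + b.Type + ": " + b.Text + "\n")
	})
	return sb.String()
}

func main() {
	blocks := map[string]*Block{
		"root": {ID: "root", Type: "page", Text: "Home", ContentIDs: []string{"h1", "list"}},
		"h1":   {ID: "h1", Type: "header", Text: "Intro"},
		"list": {ID: "list", Type: "bulleted_list", Text: "Item", ContentIDs: []string{"sub"}},
		"sub":  {ID: "sub", Type: "text", Text: "Nested"},
	}
	fmt.Print(outline(blocks, "root"))
}
```

Swapping the visit callback is all it takes to go from an outline to HTML or markdown output, which is the flexibility the flat layout buys you.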

A more advanced pattern is using the tohtml package that the author built on top of the core client. It shows the real-world application: converting Notion's proprietary block format into renderable HTML.

// Continues from the previous example. Assumes these extra imports:
//   "os"
//   "github.com/kjk/notionapi/tohtml"

// Create a converter that knows how to render blocks to HTML
converter := tohtml.NewConverter(page)

// Walk the page and generate HTML
html, err := converter.ToHTML()
if err != nil {
    log.Fatal(err)
}

// The result is HTML with headings, code blocks, images, etc.
// os.WriteFile replaces the deprecated ioutil.WriteFile (Go 1.16+).
if err := os.WriteFile("output.html", []byte(html), 0o644); err != nil {
    log.Fatal(err)
}

The HTML converter handles the complexities of Notion's rich text formatting—inline code, bold, italics, links, mentions—and maps them to appropriate HTML tags. It also manages block-level elements like numbered lists, bullet lists, toggle lists, and callout boxes.
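The mapping from rich-text attributes to tags can be illustrated with a small stand-alone sketch. The Span type and the render functions are hypothetical simplifications written for this article; the library's own representation of inline formatting is richer and is not reproduced here.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// Span is a hypothetical, simplified run of rich text with its formatting.
type Span struct {
	Text   string
	Bold   bool
	Italic bool
	Code   bool
	Link   string // empty when the span is not a link
}

// renderSpan escapes the text and wraps it in one tag per attribute,
// innermost first: code, then em, then strong, then the link.
func renderSpan(s Span) string {
	out := html.EscapeString(s.Text)
	if s.Code {
		out = "<code>" + out + "</code>"
	}
	if s.Italic {
		out = "<em>" + out + "</em>"
	}
	if s.Bold {
		out = "<strong>" + out + "</strong>"
	}
	if s.Link != "" {
		out = `<a href="` + html.EscapeString(s.Link) + `">` + out + "</a>"
	}
	return out
}

// renderParagraph joins the rendered spans inside a single <p> element.
func renderParagraph(spans []Span) string {
	var sb strings.Builder
	sb.WriteString("<p>")
	for _, s := range spans {
		sb.WriteString(renderSpan(s))
	}
	sb.WriteString("</p>")
	return sb.String()
}

func main() {
	fmt.Println(renderParagraph([]Span{
		{Text: "Run "},
		{Text: "go build", Code: true},
		{Text: " before you ", Italic: true},
		{Text: "deploy", Bold: true, Link: "https://example.com/deploy"},
	}))
}
```

Escaping both the text and the link target matters here, because page content is user-authored and ends up inside markup.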

Authentication works differently from the official API. Instead of integration tokens, notionapi uses the token_v2 cookie taken from a logged-in browser session when it needs to read private pages. This is the session token Notion's web app uses, so requests from the library carry the same credentials and permissions as your own account:

client := &notionapi.Client{
    AuthToken: "your_token_v2_cookie_value",
}

This approach is both the library's strength and weakness. It provides full access to everything the web interface can do, but it also means you're bound to Notion's terms of service for automated access, and your authentication can expire like any user session.
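Because the token is effectively your account's session, it is worth keeping it out of source code. The sketch below reads it from an environment variable and attaches it to a request as a token_v2 cookie using only net/http. The variable name NOTION_TOKEN_V2 and the helper newAuthedRequest are assumptions made for illustration; the library's Client handles the cookie itself once AuthToken is set.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"os"
)

// newAuthedRequest (a hypothetical helper) builds a POST request that
// carries the session token as a token_v2 cookie. Nothing is sent.
func newAuthedRequest(url, token string) (*http.Request, error) {
	if token == "" {
		return nil, errors.New("empty token_v2 value")
	}
	req, err := http.NewRequest(http.MethodPost, url, nil)
	if err != nil {
		return nil, err
	}
	req.AddCookie(&http.Cookie{Name: "token_v2", Value: token})
	return req, nil
}

func main() {
	// NOTION_TOKEN_V2 is an assumed variable name, not a convention
	// defined by Notion or by the library.
	token := os.Getenv("NOTION_TOKEN_V2")
	req, err := newAuthedRequest("https://www.notion.so/api/v3/loadPageChunk", token)
	if err != nil {
		fmt.Println("cannot build request:", err)
		return
	}
	// Report only whether the cookie is present; never log the token.
	fmt.Println("cookie attached:", req.Header.Get("Cookie") != "")
}
```

The same environment variable can feed the AuthToken field shown earlier, so the secret lives in your deployment configuration rather than in the repository.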

The codebase itself is relatively compact—under 5,000 lines—because it focuses on the core problem: making requests and parsing responses. The author has documented the reverse-engineering process, including how to inspect network traffic to discover new endpoints as Notion evolves. This transparency makes the library more maintainable when Notion inevitably changes their internal API.

Gotcha

The elephant in the room is sustainability. This library is reverse-engineered from private API endpoints that Notion never intended for public consumption. When Notion updates their backend—which they do regularly—the library can break without warning. There's no deprecation notice, no migration guide, just sudden failures in production. The author has been responsive to issues historically, but you're ultimately depending on community maintenance and goodwill.
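One way to soften that risk in a publishing pipeline is to keep the last successful result and fall back to it when a fetch fails. The sketch below shows the pattern with an in-memory map; fetchFunc, fetchWithFallback, and errUpstream are hypothetical names, and a real build would likely persist snapshots to disk instead.

```go
package main

import (
	"errors"
	"fmt"
)

// errUpstream simulates a failure caused by a changed private endpoint.
var errUpstream = errors.New("upstream response changed")

// fetchFunc is a hypothetical hook for any call that downloads and
// renders a page, for example a wrapper around the unofficial client.
type fetchFunc func(pageID string) ([]byte, error)

// fetchWithFallback returns fresh data when fetch succeeds and refreshes
// the snapshot. When fetch fails it returns the last snapshot, with the
// bool set to true to signal that the content is stale.
func fetchWithFallback(fetch fetchFunc, cache map[string][]byte, pageID string) ([]byte, bool, error) {
	data, err := fetch(pageID)
	if err == nil {
		cache[pageID] = data
		return data, false, nil
	}
	if cached, ok := cache[pageID]; ok {
		return cached, true, nil
	}
	return nil, false, fmt.Errorf("fetch %s failed with no snapshot to fall back on: %w", pageID, err)
}

func main() {
	cache := map[string][]byte{}
	working := func(string) ([]byte, error) { return []byte("<h1>Hello</h1>"), nil }
	broken := func(string) ([]byte, error) { return nil, errUpstream }

	// The first call populates the snapshot; the second one falls back to it.
	for _, fetch := range []fetchFunc{working, broken} {
		data, stale, err := fetchWithFallback(fetch, cache, "home")
		fmt.Println(string(data), stale, err)
	}
}
```

Surfacing the stale flag lets a build log a warning and keep the site online while you wait for a library fix.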

Write operations are explicitly limited compared to read capabilities. While you can fetch and parse virtually any Notion page structure, creating or modifying content is more constrained. The library supports some write operations like creating blocks and updating properties, but it's not comprehensive. If your use case requires bidirectional sync or complex content updates, you'll find yourself in unsupported territory.

Authentication is another friction point. Extracting the token_v2 cookie from your browser isn't developer-friendly, and these tokens can expire, requiring manual renewal. For production systems, you'll need to implement token refresh logic or accept that your automation might require periodic manual intervention. The official API's integration tokens are far more suitable for long-running services.

Verdict

Use if: You need deep read access to Notion content that the official API doesn't provide, especially nested blocks, complex page hierarchies, or specific block types that aren't yet supported officially. This library excels for static site generation, content migration projects, or treating Notion as a headless CMS where you pull content periodically and render it elsewhere. The author's production usage for his blog and books platform demonstrates it's stable enough for real projects.

Skip if: You're building mission-critical systems that can't tolerate API breakage, need comprehensive write operations, or want vendor support and stability guarantees. For straightforward integrations like creating database entries or simple page updates, the official API is now mature enough and offers the peace of mind of being supported. Also skip if you're just starting a new project and can work within the official API's constraints; it is better to avoid the technical debt of an unofficial library unless you genuinely need capabilities the official API doesn't provide.
