
Building Custom ChatGPT Knowledge Bases: Inside OpenAI's Official Retrieval Plugin



Hook

OpenAI’s retrieval plugin supports 16 different vector databases—yet ChatGPT already has native file upload. When does this extra complexity actually pay off?

Context

ChatGPT launched with a critical limitation: it couldn’t remember your documents across conversations or search through organizational knowledge bases. While OpenAI eventually added native file upload to ChatGPT and the Assistants API, these built-in solutions offer no control over how documents are chunked, embedded, or retrieved. The ChatGPT Retrieval Plugin emerged as OpenAI’s official reference implementation for developers who need granular control over their retrieval pipeline—choosing specific embedding models, configuring chunk sizes, or deploying on-premises vector databases. Originally designed for the now-deprecated ChatGPT Plugins ecosystem, the project has evolved into a production-ready backend for Custom GPTs and function calling. It’s positioned at the intersection of simplicity and control: more configurable than native ChatGPT file upload, but more focused than general-purpose RAG frameworks like LangChain.

Technical Insight

[System architecture (auto-generated diagram): a client or ChatGPT sends an upload/query request to the FastAPI server, which passes through an authentication layer. On upload, the document chunker splits documents into text segments, the OpenAI Embedding API converts them to vector embeddings, and the vectors land in the configured vector database (Pinecone, Weaviate, Qdrant, etc.). On query, the query text is embedded into a query vector and matched against stored vectors to produce similarity results. An optional services layer handles PII detection, metadata enrichment, and webhooks.]

The plugin’s architecture centers on a FastAPI server that exposes API endpoints for document ingestion and retrieval. The core workflow involves three stages: chunking text into smaller segments, generating vector embeddings via OpenAI’s API, and storing those vectors in a database backend. What distinguishes this implementation is its datastore abstraction layer—a unified interface that lets you swap between Pinecone, Weaviate, Qdrant, Redis, Postgres, and other vector databases (16 total supported) without changing application code.
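The datastore abstraction can be pictured as a small interface that every adapter implements. The class and method names below are illustrative, not copied from the plugin's /datastore code; the in-memory adapter uses brute-force cosine similarity purely to show the contract a real adapter (Pinecone, Qdrant, pgvector, etc.) would fulfill with an actual index:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Chunk:
    id: str
    text: str
    embedding: list[float]


class DataStore(ABC):
    """Sketch of the unified datastore interface the adapters share."""

    @abstractmethod
    def upsert(self, chunks: list[Chunk]) -> list[str]: ...

    @abstractmethod
    def query(self, embedding: list[float], top_k: int = 3) -> list[Chunk]: ...


class InMemoryDataStore(DataStore):
    """Toy adapter: a dict plus brute-force cosine similarity."""

    def __init__(self) -> None:
        self._chunks: dict[str, Chunk] = {}

    def upsert(self, chunks: list[Chunk]) -> list[str]:
        for c in chunks:
            self._chunks[c.id] = c
        return [c.id for c in chunks]

    def query(self, embedding: list[float], top_k: int = 3) -> list[Chunk]:
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            na = sum(x * x for x in a) ** 0.5
            nb = sum(x * x for x in b) ** 0.5
            return dot / (na * nb)

        ranked = sorted(
            self._chunks.values(),
            key=lambda c: cosine(c.embedding, embedding),
            reverse=True,
        )
        return ranked[:top_k]
```

Because application code only talks to the abstract interface, swapping a hosted index for a self-managed one is a configuration change rather than a rewrite.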

The embedding configuration demonstrates this flexibility. OpenAI’s newer text-embedding-3-large model supports variable output dimensions, letting you trade retrieval accuracy for storage and compute cost:

export EMBEDDING_MODEL=text-embedding-3-large
export EMBEDDING_DIMENSION=256  # Smaller dimension = lower cost
# Full quality uses 3072 dimensions
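Under the hood, requesting fewer dimensions from a text-embedding-3 model is equivalent to truncating the full vector and renormalizing it to unit length (the models are trained Matryoshka-style so the leading components carry the most information). A minimal sketch of that operation, useful if you need to shorten already-stored embeddings yourself:

```python
import math


def shorten_embedding(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and renormalize to unit length.

    This mirrors what the `dimensions` request parameter does server-side
    for text-embedding-3 models.
    """
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]
```

Shorter vectors mean smaller indexes and cheaper similarity search, at the cost of some retrieval accuracy.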

Document ingestion and retrieval are handled through API endpoints. The chunking service splits documents using configurable strategies—you can define chunk sizes and overlap to balance context preservation with retrieval precision. The repository includes scripts for processing and uploading documents from different data sources.
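The size/overlap trade-off is easiest to see in code. The plugin chunks on token counts; this sketch uses characters for brevity, but the sliding-window idea is the same: each chunk repeats the tail of its predecessor so that a sentence split at a boundary still appears whole in at least one chunk:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size windows, with `overlap` characters shared
    between neighboring chunks to preserve context across boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```

Larger chunks preserve more context per embedding but dilute the similarity signal; more overlap reduces boundary losses but inflates storage and embedding cost.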

The repository structure reflects production-grade concerns. The /services directory includes PII detection, metadata extraction for enriching document context, and webhook support for triggering external workflows on document updates. Authentication supports both simple bearer tokens and OAuth flows, with examples in /examples/authentication. The /datastore adapters follow a consistent interface, making it straightforward to extend support to additional vector databases—each adapter implements standard methods with provider-specific optimization.
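The bearer-token flow reduces to validating an `Authorization` header against a configured secret. This is a sketch of the check, not the plugin's handler; using `hmac.compare_digest` for the comparison avoids leaking information through timing differences:

```python
import hmac


def check_bearer(auth_header: str, expected_token: str) -> bool:
    """Validate an 'Authorization: Bearer <token>' header value."""
    scheme, _, token = auth_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    return hmac.compare_digest(token, expected_token)
```

In a FastAPI app this check would typically sit behind a dependency so every endpoint enforces it uniformly.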

One clever design choice: the plugin separates the embedding generation from storage. This means you can use Azure OpenAI embeddings while storing vectors in an on-premises Postgres instance with pgvector, maintaining data sovereignty while leveraging cloud embedding models. The configuration uses environment variables to toggle between OpenAI and Azure endpoints without code changes:

export OPENAI_API_BASE=https://your-azure-instance.openai.azure.com/
export OPENAI_API_TYPE=azure
export OPENAI_EMBEDDINGMODEL_DEPLOYMENTID=your-deployment-name
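Resolving that toggle in code amounts to branching on the environment once at startup. The variable names follow the snippet above; the returned dict shape is an illustrative sketch, not the plugin's actual config object:

```python
import os


def embedding_endpoint() -> dict:
    """Select embedding provider settings from the environment.

    Defaults to the standard OpenAI API; switches to Azure when
    OPENAI_API_TYPE=azure is set.
    """
    if os.environ.get("OPENAI_API_TYPE") == "azure":
        return {
            "provider": "azure",
            "base_url": os.environ["OPENAI_API_BASE"],
            "deployment": os.environ["OPENAI_EMBEDDINGMODEL_DEPLOYMENTID"],
        }
    return {
        "provider": "openai",
        "model": os.environ.get("EMBEDDING_MODEL", "text-embedding-3-large"),
    }
```

Because the branch lives entirely in configuration, the same container image can point at either provider per deployment.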

The /scripts directory provides preprocessing utilities for common data sources, handling documents that arrive as PDFs, Word files, or web pages.
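Such preprocessing typically starts with dispatching on file type before any text extraction runs. The mapping below is a hypothetical sketch (the strategy labels are illustrative, not names from the plugin's /scripts utilities), just to show the shape of that first step:

```python
from pathlib import Path


def extractor_for(path: str) -> str:
    """Map a file extension to a text-extraction strategy label."""
    suffix = Path(path).suffix.lower()
    return {
        ".pdf": "pdf_text_extraction",
        ".docx": "docx_text_extraction",
        ".html": "html_to_text",
        ".md": "plain_text",
        ".txt": "plain_text",
    }.get(suffix, "unsupported")
```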

Gotcha

The plugin’s focused scope creates constraints. Document format support appears to require preprocessing—the /scripts directory provides utilities for processing documents from different data sources, suggesting text extraction happens before ingestion. For organizations with varied content types, this likely means building a separate ingestion pipeline.

The ChatGPT Plugins integration that originally motivated this project is deprecated. While Custom GPTs and function calling are stable, the plugin’s value proposition has narrowed. Native ChatGPT file upload now handles simple use cases, leaving the retrieval plugin relevant primarily for scenarios requiring database choice, embedding customization, or on-premises deployment—use cases that might also justify a more comprehensive framework.

Verdict

Use this plugin if you’re building ChatGPT-integrated applications where you need specific control over the retrieval pipeline—choosing your vector database for compliance reasons, tuning chunk sizes for your document types, or using Azure OpenAI embeddings with on-premises storage. It’s particularly valuable as a reference implementation, showing OpenAI’s recommended patterns for semantic search integration. The broad database support (16 providers) prevents vendor lock-in while maintaining production-ready authentication and webhook capabilities. Skip it if you’re building standalone RAG applications without ChatGPT integration (LangChain or LlamaIndex offer richer ecosystems), if native ChatGPT file upload meets your needs (simpler and free), or if you need capabilities beyond what the plugin’s focused scope provides. This is OpenAI’s official blueprint for retrieval backends—excellent for learning their patterns or when those patterns match your constraints.
