Scira: Building a Self-Hosted Perplexity with 28 Tools and Agentic Planning

Hook

While Perplexity raised $500M to build a closed AI search engine, one developer created an open-source alternative that supports dozens of models, 17 specialized search modes, and runs on your own infrastructure.

Context

AI-powered search engines like Perplexity have proven that LLM-augmented web search can deliver better results than traditional keyword matching. But they come with a catch: you’re locked into their model choices, their data sources, and their interpretation of what constitutes a good answer. If you disagree with a citation, want to use a different LLM provider, or need to search specialized sources like academic papers or prediction markets, you’re out of luck.

Scira (formerly MiniPerplx) emerged as an open-source answer to this problem. Built on the Vercel AI SDK and licensed under AGPL-3.0, it’s a Next.js application that gives you the full Perplexity experience—AI-planned research, grounded retrieval, inline citations—but with complete control over your model providers, data sources, and deployment environment. With 11,550 stars on GitHub and backing from Vercel’s OSS program, it’s become the de facto self-hosted alternative for developers who want AI search without vendor lock-in.

Technical Insight

[Architecture diagram: a User Query enters the Agentic Planning Layer, which decomposes the query, selects models, and routes tasks to the Vercel AI SDK Orchestrator. The orchestrator fans out parallel searches across the Tool Suite (Exa semantic search, Tavily real-time, Firecrawl extraction, Supermemory context) and the LLM Providers (xAI Grok, OpenAI GPT/o3/o4, Claude 4.5/4.6, Gemini 2.5/3.x). Merged results flow through the Deduplication Engine into Response Synthesis, which performs analysis, attaches citations, and emits the Final Answer. Upstash Redis provides rate limiting and caching throughout.]

System architecture — auto-generated

At its core, Scira implements an agentic planning architecture that transforms user queries into multi-step research workflows. When you ask a complex question, the system doesn’t just fire off a single search and call it done. Instead, it breaks your question into sub-tasks, selects the optimal models and tools for each step, executes searches in parallel with deduplication, and synthesizes responses with cross-verified citations.
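That loop can be sketched in TypeScript, the project's own stack. The function names and the naive keyword-based decomposition below are illustrative assumptions, not Scira's actual code (its planner is LLM-driven):

```typescript
// Hypothetical sketch of the agentic research loop: decompose the query,
// fan sub-queries out to providers in parallel, then deduplicate by URL
// before synthesis.
interface SearchResult {
  url: string;
  snippet: string;
}

type SearchProvider = (query: string) => Promise<SearchResult[]>;

// Step 1: split the question into sub-tasks. Scira uses an LLM for this;
// a trivial conjunction splitter stands in here.
function decomposeQuery(query: string): string[] {
  return query.split(/\band\b/i).map((q) => q.trim()).filter(Boolean);
}

// Steps 2-3: run every sub-query against every provider concurrently,
// then drop results whose URL has already been seen.
async function research(
  query: string,
  providers: SearchProvider[],
): Promise<SearchResult[]> {
  const subQueries = decomposeQuery(query);
  const batches = await Promise.all(
    subQueries.flatMap((q) => providers.map((p) => p(q))),
  );
  const seen = new Set<string>();
  return batches.flat().filter((r) =>
    seen.has(r.url) ? false : (seen.add(r.url), true),
  );
}
```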

The architecture leverages the Vercel AI SDK as its orchestration layer, which provides a unified interface to multiple model providers: xAI’s Grok models (Grok 3, Grok 4, and specialized variants like Grok Code), OpenAI’s GPT-4.1 and GPT-5.x series plus o3 and o4-mini for reasoning, Anthropic’s Claude 4.5 and 4.6 models for analysis, Google’s Gemini 2.5 and 3.x for multimodal tasks, and cost-optimized options like DeepSeek (v3, v3.1, R1) and Qwen 3. The SDK handles streaming responses, tool calling, and provider failover automatically.
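A minimal sketch of what task-to-model routing in that style could look like; the routing table, policy, and model ID strings here are assumptions for illustration, not Scira's configuration:

```typescript
// Hypothetical routing table mapping a task kind to a provider/model ID.
// The provider names come from the article; the exact ID strings and the
// routing policy are invented for this sketch.
const MODEL_ROUTES: Record<string, string> = {
  reasoning: "openai/o4-mini",          // o-series for multi-step reasoning
  analysis: "anthropic/claude-4.5",     // Claude for analysis
  multimodal: "google/gemini-2.5-pro",  // Gemini for multimodal tasks
  cheap: "deepseek/deepseek-v3",        // cost-optimized inference
};

// Pick a model for a task, falling back to a default provider when the
// task kind is unknown.
function routeModel(task: string, fallback = "xai/grok-4"): string {
  return MODEL_ROUTES[task] ?? fallback;
}
```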

The real power comes from Scira’s 28 specialized tools, each designed for a specific retrieval task. For web search, it orchestrates multiple providers in parallel: Exa AI for semantic search, Firecrawl for content extraction, Tavily for real-time data, and Parallel for Reddit queries. The system performs multi-query parallel web search with deduplication across these providers, ensuring comprehensive coverage while avoiding duplicate sources in the final answer.
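Cross-provider deduplication might look like the following sketch. The URL-normalization rules (dropping fragments, tracking parameters, trailing slashes, and scheme differences) are assumptions, not Scira's actual logic:

```typescript
// Normalize a URL so the same page fetched by, say, Exa and Tavily
// collapses to one key.
function normalizeUrl(raw: string): string {
  const u = new URL(raw);
  u.hash = "";
  // Drop common tracking params so link variants count once.
  ["utm_source", "utm_medium", "utm_campaign", "ref"].forEach((p) =>
    u.searchParams.delete(p),
  );
  // Ignore scheme and trailing-slash differences.
  return (u.host + u.pathname.replace(/\/$/, "") + u.search).toLowerCase();
}

// Keep the first result for each normalized URL, in arrival order.
function dedupeByUrl<T extends { url: string }>(results: T[]): T[] {
  const seen = new Set<string>();
  return results.filter((r) => {
    const key = normalizeUrl(r.url);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```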

The system’s 17 search modes are purpose-built for different research contexts. The Extreme mode is particularly sophisticated: it runs a multi-step LLM-driven research agent that plans its own search strategy, executes code for data analysis in sandboxed Daytona environments, and stores research artifacts in R2 storage for later retrieval. The Stocks mode combines real-time OHLC data from Valyu with news from Tavily and Exa, generating interactive charts alongside fundamental analysis. The Academic mode routes queries to scholarly sources using Exa and Firecrawl, while the XQL mode (Pro) implements an advanced query language for precise X/Twitter filtering.
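One way such modes could be wired is a registry mapping a mode name to its tool set. The mode and tool names below are taken from the description above, but the registry shape itself is hypothetical:

```typescript
// Hypothetical mode registry: each search mode declares which tools the
// planner may call, and whether the mode is gated as Pro.
const SEARCH_MODES: Record<string, { tools: string[]; pro?: boolean }> = {
  extreme: { tools: ["exa", "tavily", "firecrawl", "daytona-sandbox", "r2-store"] },
  stocks: { tools: ["valyu-ohlc", "tavily", "exa"] },
  academic: { tools: ["exa", "firecrawl"] },
  xql: { tools: ["x-search"], pro: true },
};

// Resolve the tool set for a mode, rejecting unknown modes early.
function toolsForMode(mode: string): string[] {
  const entry = SEARCH_MODES[mode];
  if (!entry) throw new Error(`unknown mode: ${mode}`);
  return entry.tools;
}
```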

Scira uses Upstash Redis for serverless data management and rate limiting. Pro features appear to add Supermemory for persistent cross-session memory, allowing the agent to remember previous conversations and build context over time. The Lookouts feature schedules recurring research agents that monitor topics, track changes, and email updates—essentially turning Scira into a personalized research assistant that works continuously.
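The rate-limiting idea can be illustrated with an in-memory sliding-window counter. Scira keeps the counts in Upstash Redis so limits survive across serverless invocations; this sketch (all names hypothetical) shows only the counting logic:

```typescript
// Sliding-window rate limiter: allow at most `limit` requests per key
// within the trailing `windowMs` milliseconds.
class RateLimiter {
  private hits = new Map<string, number[]>();

  constructor(
    private limit: number,     // max requests per window
    private windowMs: number,  // window length in milliseconds
  ) {}

  allow(key: string, now = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only timestamps still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

A Redis-backed version would replace the Map with a sorted set keyed per user, trimming old entries on each request.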

Every response includes inline citations linking back to source material. Unlike black-box AI search tools that expect you to trust their answers, Scira’s citation-first design lets you click through to verify claims independently, making it trivial to audit evidence and catch hallucinations.
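A citation-first response can be modeled as claims that carry source indices, rendered with inline markers and a numbered source list. The shapes below are hypothetical, not Scira's actual schema:

```typescript
// Hypothetical response shapes: each claim points into the source list.
interface Source { url: string; title: string; }
interface Claim { text: string; sourceIdx: number[]; }

// Render claims with inline [n] markers, followed by the numbered sources
// so every statement is independently verifiable.
function renderWithCitations(claims: Claim[], sources: Source[]): string {
  const body = claims
    .map((c) => `${c.text} ${c.sourceIdx.map((i) => `[${i + 1}]`).join("")}`)
    .join(" ");
  const refs = sources
    .map((s, i) => `[${i + 1}] ${s.title} <${s.url}>`)
    .join("\n");
  return `${body}\n\nSources:\n${refs}`;
}
```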

Gotcha

The elephant in the room is API dependency. Scira’s comprehensive tool ecosystem requires subscriptions to Exa, Firecrawl, Tavily, Parallel, Supadata, Valyu, CoinGecko, OpenWeatherMap, Google Maps, TMDB, Spotify, and potentially other services depending on which modes you want to use. While the README doesn’t specify exact costs, running the full feature set could require substantial monthly API fees before you account for LLM inference costs. The project doesn’t appear to offer a simplified “starter” configuration with free-tier-only tools, so new users may face an overwhelming API signup process just to get basic functionality working.

Pro features create another barrier. The README indicates that Connectors (for searching Google Drive, Notion, and OneDrive), Memory (for persistent context), Voice (for conversational AI), and XQL (for advanced X analysis) are marked as Pro features. This appears to limit the practical utility of self-hosting, since you may still be paying for certain features on top of infrastructure and API costs. The AGPL-3.0 license means you can fork and modify these restrictions, but you’d need to understand and potentially rebuild certain functionality yourself. For users expecting a truly self-contained open-source solution, this model may be limiting.

Verdict

Use Scira if you’re a power user or enterprise team that already maintains subscriptions to multiple research APIs and wants a unified, customizable interface with multi-provider LLM support. It’s ideal for developers building research workflows that need specialized modes (financial analysis, academic search, social media monitoring) and want the flexibility to swap models or add custom tools. The agentic planning and citation-first design genuinely deliver on the promise of transparent, auditable AI research. Skip it if you want simple deployment without API orchestration, need budget-conscious self-hosting (the API costs appear non-trivial), or just want basic AI search without configuration overhead—Perplexity’s SaaS may be simpler and possibly cheaper at low volumes. Also consider carefully if you expect completely unrestricted open source: certain features are marked as Pro in the codebase, which may affect the self-hosting value proposition unless you’re willing to fork and extend the project.
