Scira: Building a Self-Hosted Perplexity with 28 Tools and Agentic Planning
Hook
While Perplexity raised $500M to build a closed AI search engine, one developer created an open-source alternative that supports dozens of models and 17 specialized search modes, and runs on your own infrastructure.
Context
AI-powered search engines like Perplexity have proven that LLM-augmented web search can deliver better results than traditional keyword matching. But they come with a catch: you’re locked into their model choices, their data sources, and their interpretation of what constitutes a good answer. If you disagree with a citation, want to use a different LLM provider, or need to search specialized sources like academic papers or prediction markets, you’re out of luck.
Scira (formerly MiniPerplx) emerged as an open-source answer to this problem. Built on the Vercel AI SDK and licensed under AGPL-3.0, it’s a Next.js application that gives you the full Perplexity experience—AI-planned research, grounded retrieval, inline citations—but with complete control over your model providers, data sources, and deployment environment. With 11,550 stars on GitHub and backing from Vercel’s OSS program, it’s become the de facto self-hosted alternative for developers who want AI search without vendor lock-in.
Technical Insight
At its core, Scira implements an agentic planning architecture that transforms user queries into multi-step research workflows. When you ask a complex question, the system doesn’t just fire off a single search and call it done. Instead, it breaks your question into sub-tasks, selects the optimal models and tools for each step, executes searches in parallel with deduplication, and synthesizes responses with cross-verified citations.
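In miniature, that plan-then-execute loop looks something like the sketch below. This is a simplification under my own assumptions, not Scira's actual code: in the real system the `plan` step is LLM-driven, whereas here the decomposition is hardcoded for illustration.

```typescript
// Sketch of a plan → parallel-execute → dedupe pipeline.
// SubTask, plan, and execute are hypothetical names for illustration.
type SubTask = { query: string; tool: string };

// In Scira the plan comes from an LLM call; hardcoded here.
function plan(question: string): SubTask[] {
  return [
    { query: `${question} overview`, tool: 'web_search' },
    { query: `${question} recent news`, tool: 'news_search' },
  ];
}

// Fan out all sub-tasks in parallel, then merge and dedupe results.
async function execute(
  tasks: SubTask[],
  run: (t: SubTask) => Promise<string[]>,
): Promise<string[]> {
  const results = await Promise.all(tasks.map(run));
  return [...new Set(results.flat())];
}
```

The synthesis step (feeding deduplicated results back to the LLM with citation instructions) would follow `execute`.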
The architecture leverages the Vercel AI SDK as its orchestration layer, which provides a unified interface to multiple model providers. This includes xAI's Grok models (including Grok 3, Grok 4, and specialized variants like Grok Code), OpenAI's GPT-4.1 and GPT-5.x series plus o3 and o4-mini for reasoning, Anthropic's Claude 4.5 and 4.6 models for analysis, Google's Gemini 2.5 and 3.x for multimodal tasks, and specialized models like DeepSeek (v3, v3.1, R1), Qwen 3, and others for cost-optimized inference. The SDK handles streaming responses, tool calling, and provider failover automatically.
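One consequence of routing everything through a unified SDK is that model selection reduces to a lookup table. The routing table below is a hypothetical sketch of that idea (the prefixes and provider names are my guesses at the shape, not Scira's actual registry):

```typescript
// Hypothetical model-id → provider routing; illustrative only.
const PROVIDER_PREFIXES: Record<string, string> = {
  grok: 'xai',
  gpt: 'openai',
  o3: 'openai',
  'o4-mini': 'openai',
  claude: 'anthropic',
  gemini: 'google',
  deepseek: 'deepseek',
  qwen: 'alibaba',
};

function providerFor(modelId: string): string {
  const hit = Object.keys(PROVIDER_PREFIXES).find((p) => modelId.startsWith(p));
  if (!hit) throw new Error(`unknown model: ${modelId}`);
  return PROVIDER_PREFIXES[hit];
}
```

With the Vercel AI SDK, the string returned here would map to a provider package, and the downstream streaming/tool-calling code stays identical regardless of which provider is chosen.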
The real power comes from Scira’s 28 specialized tools, each designed for a specific retrieval task. For web search, it orchestrates multiple providers in parallel: Exa AI for semantic search, Firecrawl for content extraction, Tavily for real-time data, and Parallel for Reddit queries. The system performs multi-query parallel web search with deduplication across these providers, ensuring comprehensive coverage while avoiding duplicate sources in the final answer.
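The deduplication step is the glue that makes multi-provider fan-out usable. A minimal sketch, assuming results are keyed by normalized URL (first provider wins; the normalization rules here are my own, not Scira's):

```typescript
type SearchResult = { url: string; title: string; snippet: string };

// Merge result batches from several providers, keeping the first hit
// per normalized URL (query string, fragment, trailing slash stripped).
function dedupe(batches: SearchResult[][]): SearchResult[] {
  const seen = new Set<string>();
  const merged: SearchResult[] = [];
  for (const batch of batches) {
    for (const r of batch) {
      const key = r.url.replace(/[?#].*$/, '').replace(/\/+$/, '').toLowerCase();
      if (!seen.has(key)) {
        seen.add(key);
        merged.push(r);
      }
    }
  }
  return merged;
}
```

Ordering the batches by provider priority means the most trusted provider's metadata survives when two providers return the same page.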
The system’s 17 search modes are purpose-built for different research contexts. The Extreme mode is particularly sophisticated: it runs a multi-step LLM-driven research agent that plans its own search strategy, executes code for data analysis in sandboxed Daytona environments, and stores research artifacts in R2 storage for later retrieval. The Stocks mode combines real-time OHLC data from Valyu with news from Tavily and Exa, generating interactive charts alongside fundamental analysis. The Academic mode routes queries to scholarly sources using Exa and Firecrawl, while the XQL mode (Pro) implements an advanced query language for precise X/Twitter filtering.
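Conceptually, each mode is just a different tool bundle handed to the agent. The map below is illustrative only; the mode and tool identifiers are my invented placeholders, not Scira's actual names:

```typescript
// Hypothetical mode → tool routing; names are placeholders.
const MODE_TOOLS: Record<string, string[]> = {
  web:      ['exa_search', 'tavily_search', 'firecrawl_extract'],
  academic: ['exa_search', 'firecrawl_extract'],
  stocks:   ['valyu_ohlc', 'tavily_search', 'exa_search'],
  extreme:  ['exa_search', 'tavily_search', 'firecrawl_extract', 'code_sandbox'],
};

// Unknown modes fall back to plain web search.
function toolsFor(mode: string): string[] {
  return MODE_TOOLS[mode] ?? MODE_TOOLS.web;
}
```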
Scira uses Upstash Redis for serverless data management and rate limiting. Pro features appear to add Supermemory for persistent cross-session memory, allowing the agent to remember previous conversations and build context over time. The Lookouts feature schedules recurring research agents that monitor topics, track changes, and email updates—essentially turning Scira into a personalized research assistant that works continuously.
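In production this rate limiting is delegated to Upstash's serverless Redis, but the underlying idea fits in a few lines. A minimal in-memory sketch of a fixed-window limiter (my own simplification, not Scira's implementation, which also has to work across serverless instances, hence Redis):

```typescript
// Fixed-window rate limiter: at most `limit` calls per `windowMs` per key.
class FixedWindowLimiter {
  private hits = new Map<string, { windowStart: number; count: number }>();

  constructor(private limit: number, private windowMs: number) {}

  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    // Start a fresh window if none exists or the old one expired.
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count++;
    return true;
  }
}
```

The reason to push this into Redis rather than process memory is exactly the serverless constraint: each Next.js invocation may land on a different instance, so the counter has to live in shared storage.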
Every response includes inline citations linking back to source material. Unlike black-box AI search tools that expect you to trust their answers, Scira’s citation-first design lets you click through to verify claims independently, making it trivial to audit evidence and catch hallucinations.
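The citation-first shape can be sketched as a pairing of answer fragments with numbered sources. This is a simplified standalone illustration (the function and field names are hypothetical; the real system streams annotations alongside the response):

```typescript
type Source = { url: string; title: string };

// Attach numeric inline markers to answer fragments and emit a
// matching reference list, so every claim links back to a source.
function cite(
  sentences: { text: string; sourceIdx: number }[],
  sources: Source[],
): { body: string; refs: string[] } {
  const body = sentences.map((s) => `${s.text} [${s.sourceIdx + 1}]`).join(' ');
  const refs = sources.map((s, i) => `[${i + 1}] ${s.title} (${s.url})`);
  return { body, refs };
}
```

Because every fragment carries a source index, auditing an answer is a matter of following the marker to the reference list and clicking through.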
Gotcha
The elephant in the room is API dependency. Scira’s comprehensive tool ecosystem requires subscriptions to Exa, Firecrawl, Tavily, Parallel, Supadata, Valyu, CoinGecko, OpenWeatherMap, Google Maps, TMDB, Spotify, and potentially other services depending on which modes you want to use. While the README doesn’t specify exact costs, running the full feature set could require substantial monthly API fees before you account for LLM inference costs. The project doesn’t appear to offer a simplified “starter” configuration with free-tier-only tools, so new users may face an overwhelming API signup process just to get basic functionality working.
Pro features create another barrier. The README indicates that Connectors (for searching Google Drive, Notion, and OneDrive), Memory (for persistent context), Voice (for conversational AI), and XQL (for advanced X analysis) are marked as Pro features. This appears to limit the practical utility of self-hosting, since you may still be paying for certain features on top of infrastructure and API costs. The AGPL-3.0 license means you can fork and modify these restrictions, but you’d need to understand and potentially rebuild certain functionality yourself. For users expecting a truly self-contained open-source solution, this model may be limiting.
Verdict
Use Scira if you’re a power user or enterprise team that already maintains subscriptions to multiple research APIs and wants a unified, customizable interface with multi-provider LLM support. It’s ideal for developers building research workflows that need specialized modes (financial analysis, academic search, social media monitoring) and want the flexibility to swap models or add custom tools. The agentic planning and citation-first design genuinely deliver on the promise of transparent, auditable AI research. Skip it if you want simple deployment without API orchestration, need budget-conscious self-hosting (the API costs appear non-trivial), or just want basic AI search without configuration overhead: Perplexity’s SaaS may be simpler and cheaper at low volumes. Also consider carefully if you expect completely unrestricted open source: certain features are marked as Pro in the codebase, which may affect the self-hosting value proposition unless you’re willing to fork and extend the project.