Scira: Building a Multi-Provider AI Search Engine with Agentic Tool Orchestration

Hook

While Perplexity AI charges $20/month for cited search, Scira lets you self-host comparable cited search across 50+ models with one crucial difference: you control exactly which APIs foot the bill and which models answer your questions.

Context

The AI search landscape has bifurcated into two camps: polished commercial products like Perplexity AI that charge monthly subscriptions for black-box architectures, and hobbyist projects that barely scratch the surface of what's possible. Scira occupies the underserved middle ground—sophisticated enough for serious research workflows but open enough to modify, self-host, and audit.

The tool emerged from a fundamental tension in AI-powered search: users want Google's breadth with ChatGPT's synthesis, but neither hallucination-prone LLMs nor citation-free answers suffice for decisions that matter. Traditional search engines return links; pure LLMs fabricate plausible nonsense. Scira's architecture addresses this by implementing an agentic workflow where language models don't just answer—they plan research strategies, invoke specialized tools, retrieve grounded information, and synthesize responses with inline citations that let readers verify every claim. Originally named MiniPerplx, the project has evolved into a full-fledged research platform with scheduled monitoring, voice interaction, and knowledge connector integrations that position it beyond simple question-answering into the realm of automated intelligence gathering.

Technical Insight

Scira's architecture centers on the Vercel AI SDK's tool-calling capabilities, which orchestrate a 28-tool arsenal spanning web search (Exa, Tavily), financial data (CoinGecko, Finnhub), academic papers (arXiv), code execution (Daytona sandboxes), and productivity connectors. When a query arrives, the system employs a planning-retrieval-citation pipeline where the LLM first decomposes complex questions into sub-tasks, selects appropriate tools, executes them in parallel where possible, and synthesizes grounded responses.

The multi-model architecture is particularly elegant. Rather than hard-coding a single provider, Scira supports 50+ models across xAI (Grok), OpenAI (GPT-4/GPT-4o), Anthropic (Claude), Google (Gemini), and others. Here's how tool orchestration works in practice:

import { openai } from '@ai-sdk/openai';
import { generateText, tool } from 'ai';
import { z } from 'zod';

const result = await generateText({
  model: openai('gpt-4o'),
  messages: [{ role: 'user', content: 'What are the latest developments in quantum computing?' }],
  tools: {
    searchWeb: tool({
      description: 'Search the web for current information',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
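        const response = await fetch('https://api.exa.ai/search', {
          method: 'POST',
          headers: {
            'Authorization': `Bearer ${process.env.EXA_API_KEY}`,
            'Content-Type': 'application/json' // the body is JSON, so declare it
          },
          body: JSON.stringify({ query, numResults: 5, useAutoprompt: true })
        });
        // Surface HTTP failures instead of handing an error payload to the model
        if (!response.ok) throw new Error(`Exa search failed: ${response.status}`);
        return response.json();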
      }
    }),
    searchAcademic: tool({
      description: 'Search arXiv for academic papers',
      parameters: z.object({ query: z.string(), maxResults: z.number() }),
      execute: async ({ query, maxResults }) => {
        const arxivQuery = `http://export.arxiv.org/api/query?search_query=all:${encodeURIComponent(query)}&max_results=${maxResults}`;
        const response = await fetch(arxivQuery);
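        // parseArxivXML is a helper not shown here; it is assumed to convert arXiv's Atom XML feed into plain objects
        return parseArxivXML(await response.text());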
      }
    })
  },
  maxSteps: 5
});

This code demonstrates the agentic pattern: the model autonomously decides whether quantum computing questions need web search, academic papers, or both. The maxSteps parameter allows multi-turn tool use—the model might first search the web, discover recent IBM announcements, then query arXiv for the underlying research papers, and finally synthesize findings with citations.
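
To make the step budget concrete, here is a minimal sketch of a bounded tool loop. It is not the AI SDK's internals: `runLoop`, `Step`, and `stubDecide` are hypothetical names, and the "model" is a stub function that picks the next step from the tool results gathered so far.

```typescript
// Hypothetical types: a step is either a tool call or a final answer.
type Step =
  | { kind: 'tool'; name: string; input: string }
  | { kind: 'final'; text: string };

type Tools = Record<string, (input: string) => string>;

// `decide` stands in for the model: it sees prior tool results and picks the next step.
function runLoop(
  decide: (results: string[]) => Step,
  tools: Tools,
  maxSteps: number
): { text: string; stepsUsed: number } {
  const results: string[] = [];
  for (let step = 1; step <= maxSteps; step++) {
    const next = decide(results);
    if (next.kind === 'final') {
      return { text: next.text, stepsUsed: step };
    }
    const toolFn = tools[next.name];
    // Unknown tools are reported back to the model rather than thrown.
    results.push(toolFn ? toolFn(next.input) : `unknown tool: ${next.name}`);
  }
  // Step budget exhausted: return whatever was gathered so far.
  return { text: results.join('\n'), stepsUsed: maxSteps };
}

// Stub tools and a stub "model" that searches the web, then arXiv, then answers.
const stubTools: Tools = {
  searchWeb: (q) => `web:${q}`,
  searchAcademic: (q) => `arxiv:${q}`,
};

const stubDecide = (results: string[]): Step => {
  if (results.length === 0) return { kind: 'tool', name: 'searchWeb', input: 'quantum' };
  if (results.length === 1) return { kind: 'tool', name: 'searchAcademic', input: 'quantum' };
  return { kind: 'final', text: `Based on ${results.length} sources` };
};

const loopResult = runLoop(stubDecide, stubTools, 5);
console.log(loopResult.text, loopResult.stepsUsed);
```

With a budget of 5 the stub finishes on its third step; with a budget of 2 the loop stops after both tool calls and never reaches a final answer, which is the failure mode to watch for when the budget is set too low.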

The citation mechanism leverages structured outputs to ensure every factual claim links to sources. After tool execution, the system prompts the model to generate responses with inline citation markers that map to a sources array:

import { generateObject } from 'ai';

// generateObject validates the model's reply against the Zod schema below;
// the parsed result is available as citedResponse.object.
const citedResponse = await generateObject({
  model: openai('gpt-4o'),
  system: `Generate responses with inline citations [1], [2], etc. Always include a sources array with URLs.`,
  messages: conversationHistory,
  schema: z.object({
    answer: z.string(),
    sources: z.array(z.object({
      id: z.number(),
      title: z.string(),
      url: z.string(),
      snippet: z.string()
    }))
  })
});
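
A schema guarantees the shape of the response, but not that every marker in the answer actually points at a source. A small post-check can catch that. The sketch below is an illustration rather than Scira's own code; `checkCitations` and `CitationCheck` are hypothetical names, and the `Source` shape mirrors the sources array described above.

```typescript
interface Source {
  id: number;
  title: string;
  url: string;
  snippet: string;
}

interface CitationCheck {
  missingSources: number[]; // markers in the answer with no matching source
  unusedSources: number[];  // sources that the answer never cites
}

function checkCitations(answer: string, sources: Source[]): CitationCheck {
  // Collect every numeric marker of the form [n] from the answer text.
  const cited = new Set<number>();
  const pattern = /\[(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    cited.add(Number(match[1]));
  }

  const known = new Set<number>(sources.map((s) => s.id));
  const ascending = (a: number, b: number) => a - b;
  return {
    missingSources: Array.from(cited).filter((id) => !known.has(id)).sort(ascending),
    unusedSources: Array.from(known).filter((id) => !cited.has(id)).sort(ascending),
  };
}

// Example: marker [3] has no source, and source 2 is never cited.
const check = checkCitations('Qubit counts are rising [1]. Error rates fell [3].', [
  { id: 1, title: 'A', url: 'https://example.com/a', snippet: '' },
  { id: 2, title: 'B', url: 'https://example.com/b', snippet: '' },
]);
console.log(check);
```

A non-empty `missingSources` is the more serious result, since it means the answer cites something the reader cannot verify.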

The 17 specialized search modes (Academic, Code, Financial, Social, News, etc.) pre-configure tool selection and system prompts. When you select "Financial" mode, Scira automatically prioritizes CoinGecko and Finnhub tools while adjusting the system prompt to emphasize numerical accuracy and temporal relevance. The "Code" mode enables execution in Daytona sandboxes—isolated environments where the LLM can write and run Python or JavaScript to verify algorithmic claims or generate visualizations.
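
One plausible way to express this kind of per-mode setup is a lookup table that pairs each mode with a tool allowlist and a system prompt. The sketch below assumes that design; the tool names, prompts, and `selectTools` helper are hypothetical and cover only three of the modes.

```typescript
type SearchMode = 'academic' | 'financial' | 'code';

interface ModeConfig {
  tools: string[];      // names of the tools this mode may call
  systemPrompt: string; // prompt emphasis for this mode
}

// Hypothetical tool names and prompts, for illustration only.
const MODE_CONFIGS: Record<SearchMode, ModeConfig> = {
  academic: {
    tools: ['arxivSearch', 'webSearch'],
    systemPrompt: 'Prefer peer-reviewed sources and cite each paper.',
  },
  financial: {
    tools: ['coinGecko', 'finnhub'],
    systemPrompt: 'Report exact figures together with their timestamps.',
  },
  code: {
    tools: ['runCode', 'webSearch'],
    systemPrompt: 'Verify claims by running code where possible.',
  },
};

// Narrow a full tool registry down to the tools the selected mode allows.
function selectTools<T>(mode: SearchMode, registry: Record<string, T>): Record<string, T> {
  const selected: Record<string, T> = {};
  for (const name of MODE_CONFIGS[mode].tools) {
    if (name in registry) selected[name] = registry[name];
  }
  return selected;
}

// Stand-in registry; real entries would be tool definitions, not strings.
const registry: Record<string, string> = {
  arxivSearch: 'arxiv',
  webSearch: 'web',
  coinGecko: 'cg',
  finnhub: 'fh',
  runCode: 'sandbox',
};

const financialTools = selectTools('financial', registry);
console.log(Object.keys(financialTools), MODE_CONFIGS.financial.systemPrompt);
```

Keeping the table declarative means adding a mode is a data change rather than a new code path.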

Rate limiting uses Upstash Redis with a sliding window approach that tracks both anonymous and authenticated users. Pro features like "Lookouts" implement scheduled jobs via cron that re-run saved queries and email diff reports when new information emerges. The voice interaction mode streams audio to AssemblyAI for transcription, processes queries, then uses Cartesia's text-to-speech API for natural responses—enabling hands-free research workflows.
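
For readers unfamiliar with sliding windows, the sketch below shows the basic idea with an in-memory timestamp log. It is a stand-in for illustration, not Upstash's algorithm or Scira's code: `SlidingWindowLimiter` is a hypothetical class, and the `anon:` and `user:` key prefixes are an assumed way to keep anonymous and authenticated traffic in separate buckets.

```typescript
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed and records it; false if over the limit.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}

// Two requests per second per key; timestamps are passed explicitly for clarity.
const limiter = new SlidingWindowLimiter(2, 1000);
const decisions = [
  limiter.allow('anon:203.0.113.7', 0),    // first request in the window
  limiter.allow('anon:203.0.113.7', 100),  // second request, still allowed
  limiter.allow('anon:203.0.113.7', 200),  // third request inside the window
  limiter.allow('user:42', 200),           // a different key has its own window
  limiter.allow('anon:203.0.113.7', 1001), // the request at t=0 has expired
];
console.log(decisions);
```

Unlike a fixed window, the limit here applies to any rolling one-second span, so a burst straddling a window boundary cannot double the allowed rate. A Redis-backed version would store the same state remotely so that every server instance shares it.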

The connector integrations via Supermemory deserve special attention. By hooking into Google Drive, Notion, and OneDrive, Scira can perform semantic searches across your personal knowledge base using Cohere embeddings. When answering queries, it automatically checks whether relevant information exists in your connected documents before hitting external APIs, reducing costs and surfacing institutional knowledge that public search engines can't access.
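
The "check connected documents first" behavior can be sketched as a similarity search with a confidence threshold. The code below is an assumption-laden illustration: the two-dimensional vectors are toy stand-ins for real embeddings, and `cosine`, `pickSource`, and the threshold value are hypothetical rather than taken from Scira or Supermemory.

```typescript
interface EmbeddedDoc {
  id: string;
  vector: number[]; // toy embedding; real ones have hundreds of dimensions
}

type Decision =
  | { source: 'local'; docId: string; score: number }
  | { source: 'external' };

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Use the best local document if it is similar enough; otherwise go external.
function pickSource(query: number[], docs: EmbeddedDoc[], threshold: number): Decision {
  let best: { docId: string; score: number } | null = null;
  for (const doc of docs) {
    const score = cosine(query, doc.vector);
    if (best === null || score > best.score) {
      best = { docId: doc.id, score };
    }
  }
  if (best !== null && best.score >= threshold) {
    return { source: 'local', docId: best.docId, score: best.score };
  }
  return { source: 'external' };
}

const docs: EmbeddedDoc[] = [
  { id: 'roadmap', vector: [1, 0] },
  { id: 'budget', vector: [0, 1] },
];

const closeMatch = pickSource([1, 0], docs, 0.9); // identical direction to 'roadmap'
const weakMatch = pickSource([1, 1], docs, 0.9);  // about 0.71 similarity to either doc
console.log(closeMatch, weakMatch);
```

The threshold is the cost lever: set it too low and weak local matches suppress better external results, set it too high and every query pays for an external call.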

Gotcha

The most significant limitation is API dependency sprawl. A fully featured Scira deployment depends on Exa, Firecrawl, Tavily, CoinGecko, Finnhub, arXiv, the Reddit API, the YouTube Data API, and potentially a dozen more services depending on which modes you enable. Most require an account and an API key (arXiv's public API is a keyless exception), and each has its own rate limits, pricing tiers, and failure modes. When Exa goes down or you exhaust your Tavily quota mid-research, graceful degradation isn't guaranteed: queries simply fail or return incomplete results. For self-hosters, managing this constellation of API keys and monitoring costs across providers becomes operational overhead that commercial Perplexity users never face.
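
Self-hosters who want graceful degradation can add it themselves with an ordered fallback chain. The sketch below shows the pattern under stated assumptions: `searchWithFallback` is a hypothetical helper, and the `primary` and `backup` providers are stubs standing in for real search clients.

```typescript
type SearchFn = (query: string) => Promise<string[]>;

interface Provider {
  name: string;
  search: SearchFn;
}

interface FallbackResult {
  provider: string | null; // which provider answered, or null if all failed
  results: string[];
  errors: string[];        // one entry per provider that failed before success
}

// Try each provider in order and return the first successful response.
async function searchWithFallback(query: string, providers: Provider[]): Promise<FallbackResult> {
  const errors: string[] = [];
  for (const { name, search } of providers) {
    try {
      const results = await search(query);
      return { provider: name, results, errors };
    } catch (err) {
      errors.push(`${name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  // Every provider failed: return an empty result with the collected errors.
  return { provider: null, results: [], errors };
}

// Stub providers: the first simulates an exhausted quota, the second succeeds.
const failing: SearchFn = async () => {
  throw new Error('quota exhausted');
};
const working: SearchFn = async (q) => [`hit for ${q}`];

const fallbackDemo = searchWithFallback('quantum computing', [
  { name: 'primary', search: failing },
  { name: 'backup', search: working },
]);
fallbackDemo.then((r) => console.log(r.provider, r.results, r.errors));
```

Returning the collected errors alongside the results lets the caller tell the user that an answer came from a backup provider instead of failing silently.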

The AGPL-3.0 license creates commercial friction. Unlike MIT or Apache-licensed alternatives, the AGPL requires you to publish the source of any modified version that you distribute or let users interact with over a network, so it covers hosted services as well as shipped software. This makes it a poor fit for companies wanting to build proprietary research tools on top of its architecture. The Pro features (Voice, Connectors, Memory, XQL query language) are implemented but paywalled in the codebase, meaning self-hosters need to either subscribe to the hosted version or fork and modify the authentication logic, which adds development overhead. Additionally, while 50+ models sounds impressive, intelligent model selection based on task type isn't fully automated; users or developers must manually configure which models handle which query types, which requires familiarity with each model's strengths.

Verdict

Use if: You're a researcher, developer, or power user who needs citation-backed AI search with full control over model selection and data flow. Scira excels when you want to avoid vendor lock-in, experiment with cutting-edge models like Grok or Claude, or integrate proprietary knowledge sources via connectors. It's ideal for teams that are comfortable managing infrastructure and API budgets and that value auditability over turnkey simplicity. The agentic architecture and tool breadth make it genuinely useful for complex research workflows where sequential reasoning and multi-source synthesis matter.

Skip if: You need zero-dependency deployment, want a commercially permissive license for proprietary modifications, or prefer operational simplicity over flexibility. The API cost complexity and AGPL restrictions make it poorly suited for enterprises seeking commercial deployment without open-sourcing changes. If you just want quick web search summaries without managing fifteen API keys, stick with commercial Perplexity or a simpler MIT-licensed alternative like Morphic.
