nanobot: Building a Production AI Assistant in 1% of the Code

Hook

What if you could build a production-ready AI assistant that speaks to Telegram, Slack, Discord, and WeChat—all in fewer lines of code than a typical Express.js backend? nanobot claims to deliver OpenClaw’s core functionality with 99% fewer lines of code, and they’ve even given you a bash script to verify the claim yourself.

Context

The AI agent framework landscape has become bloated. LangChain weighs in with hundreds of abstractions, AutoGPT ships with autonomous task planning you might never use, and OpenClaw—despite being powerful—carries enough code to make customization feel like archaeology. For developers who want to understand their entire stack or ship a lightweight assistant without dragging in a framework larger than their application, this complexity tax is painful.

nanobot emerged from HKUDS as a radical counter-proposal: what if we stripped an AI assistant down to its essential primitives? Inspired by OpenClaw, it’s designed for developers who value transparency over abstraction density. The timing is particularly relevant given the project’s response to a recent supply chain attack: when litellm was hit by supply chain poisoning, the nanobot team removed the dependency entirely and replaced it with native SDKs, shipping v0.1.4.post6 on March 27, 2026. That kind of surgical response is only possible when you actually understand every layer of your stack.

Technical Insight

nanobot’s architecture revolves around three clean abstractions: channels, providers, and an agent runner. Channels handle platform-specific messaging (Telegram’s Bot API, Slack’s Events API, Discord’s Gateway), providers wrap LLM APIs (OpenAI, Anthropic, Gemini, Azure OpenAI, VolcEngine, StepFun, Ollama), and the agent runner orchestrates the conversation loop with session-based memory and tool execution. The core insight is that most agent frameworks over-abstract these layers—nanobot keeps them deliberately thin.
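These three layers can be sketched as thin interfaces. The names below are illustrative only, not nanobot’s actual classes:

```python
from dataclasses import dataclass
from typing import AsyncIterator, Protocol


@dataclass
class Message:
    """Normalized message shape shared by all channels (hypothetical)."""
    channel: str      # e.g. "telegram", "slack"
    session_id: str   # conversation key used for memory lookup
    text: str


class Channel(Protocol):
    """Platform adapter: consumes raw platform events, emits Messages."""
    async def receive(self) -> AsyncIterator[Message]: ...
    async def send(self, session_id: str, text: str) -> None: ...


class Provider(Protocol):
    """Thin wrapper over one LLM API, streaming text deltas."""
    def stream(self, messages: list[dict]) -> AsyncIterator[str]: ...


class AgentRunner:
    """Orchestrates the loop: message in -> memory -> provider -> channel."""
    def __init__(self, provider: Provider, channels: list["Channel"]):
        self.provider = provider
        self.channels = channels
```

The point of keeping each interface this small is that adding a new platform or LLM vendor means implementing two or three methods, not subclassing a framework hierarchy.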

Here’s what a basic setup looks like after running the interactive wizard:

# Generated config snippet
providers:
  - name: anthropic
    type: anthropic
    api_key: ${ANTHROPIC_API_KEY}
    default_model: claude-3-5-sonnet-20241022
    prompt_caching: true  # Automatic prompt cache optimization

channels:
  - type: telegram
    token: ${TELEGRAM_BOT_TOKEN}
  - type: slack
    token: ${SLACK_BOT_TOKEN}
    app_token: ${SLACK_APP_TOKEN}

The system boots with nanobot start and immediately begins polling all configured channels. When a message arrives, the channel adapter normalizes it into a common format, the agent runner appends it to the session’s token-based memory (not conversation-count-based—this matters for context window management), and the response is streamed back in each channel’s required formatting: Slack gets mrkdwn, Telegram gets HTML, Feishu gets CardKit streaming.
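The normalization step is the kind of logic nanobot keeps deliberately thin. A hedged sketch, using field names from the public Telegram Bot API and Slack Events API (nanobot’s internal message shape is an assumption):

```python
def normalize(platform: str, payload: dict) -> dict:
    """Map a platform-specific payload onto one internal shape (illustrative)."""
    if platform == "telegram":
        # Telegram Bot API update: message.chat.id identifies the conversation
        msg = payload["message"]
        return {"session": str(msg["chat"]["id"]), "text": msg.get("text", "")}
    if platform == "slack":
        # Slack Events API callback: event.channel identifies the conversation
        ev = payload["event"]
        return {"session": ev["channel"], "text": ev.get("text", "")}
    raise ValueError(f"unknown platform: {platform}")
```

Everything downstream of this function (memory, provider calls, tool execution) can then be platform-agnostic.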

The MCP (Model Context Protocol) integration is particularly clever. Instead of building yet another plugin system, nanobot uses Anthropic’s emerging standard for tool discovery. You can point it at any MCP server—whether that’s a file system tool, a web search provider, or a custom business logic endpoint—and the agent automatically discovers available tools through the protocol’s standardized schema. This is the kind of architectural bet that pays off: by delegating tool standardization to an ecosystem-wide protocol, nanobot avoids maintaining its own registry while gaining compatibility with the growing MCP ecosystem.
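At the wire level, discovery is a single JSON-RPC call. The tools/list method and the result shape come from the MCP specification; the transport and helper names below are illustrative:

```python
import json


def tools_list_request(request_id: int) -> str:
    """Build the JSON-RPC 2.0 request for MCP tool discovery."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def parse_tools(response: str) -> dict[str, dict]:
    """Index discovered tools by name -> JSON Schema of their arguments."""
    result = json.loads(response)["result"]
    return {tool["name"]: tool["inputSchema"] for tool in result["tools"]}
```

Because every MCP server answers this same call with the same schema, the agent can hand the discovered inputSchema objects straight to the LLM as tool definitions, with no per-tool glue code.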

The litellm removal reveals both a strength and an ongoing challenge. Prior to v0.1.4.post6, nanobot used litellm to abstract over multiple LLM providers with a single interface. When the supply chain poisoning hit, the team had two options: wait for the ecosystem to stabilize or go native. They chose native SDKs—openai and anthropic official clients—and shipped v0.1.4.post6. The codebase is now more secure and more maintainable, but you’ve lost the convenience of litellm’s broad provider support. If you need a niche LLM service that litellm used to wrap, you’re now writing your own provider adapter.
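Such an adapter might look like the following sketch. The Provider base class and method signature are assumptions for illustration, not nanobot’s actual interface, and the network call is stubbed out:

```python
from abc import ABC, abstractmethod
from typing import AsyncIterator


class Provider(ABC):
    """Hypothetical provider contract: stream text deltas for a chat turn."""

    @abstractmethod
    def stream(self, messages: list[dict], model: str) -> AsyncIterator[str]:
        """Yield text deltas for a chat completion."""


class NicheProvider(Provider):
    """Adapter for a hypothetical vendor with an OpenAI-compatible API."""

    def __init__(self, api_key: str, base_url: str):
        self.api_key = api_key
        self.base_url = base_url  # vendor endpoint; real code would POST here

    async def stream(self, messages: list[dict], model: str) -> AsyncIterator[str]:
        # Real code would open a streaming HTTP response and yield each delta;
        # stubbed here so only the adapter's shape is shown.
        yield f"[{model}] echo: {messages[-1]['content']}"
```

The work is mechanical but real: request/response mapping, streaming parsing, and error handling that litellm used to absorb now live in your adapter.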

End-to-end streaming is handled with surprising sophistication for such a minimal codebase. The agent runner uses async generators to stream deltas from the LLM provider, coalesces them at word boundaries (to avoid Slack/Telegram rate limits on rapid edits), and pushes them through the channel’s specific streaming mechanism—Telegram draft updates, Slack message edits, Feishu CardKit progressive rendering. The implementation demonstrates that streaming UX doesn’t require framework magic; it requires understanding each platform’s quirks and buffering intelligently.
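The word-boundary coalescing step can be sketched with a plain async generator. This is a minimal illustration, not nanobot’s implementation, and it assumes deltas arrive as small text fragments:

```python
from typing import AsyncIterator


async def coalesce(deltas: AsyncIterator[str]) -> AsyncIterator[str]:
    """Buffer LLM deltas and flush only at whitespace boundaries,
    so each downstream message edit is a whole word, not a fragment."""
    buf = ""
    async for delta in deltas:
        buf += delta
        if buf and buf[-1].isspace():  # word boundary reached: flush
            yield buf
            buf = ""
    if buf:                            # flush any trailing partial word
        yield buf
```

A production version would also flush on a timer and cap edit frequency per platform, but the core idea is just this buffer-and-flush loop between the provider stream and the channel writer.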

The token-based memory system deserves special attention. Instead of naively keeping the last N messages (which breaks when users send short messages vs. long documents), nanobot tracks cumulative token usage and truncates history when approaching the model’s context window. This is paired with prompt caching hints for Anthropic models—the system marks stable system prompts and conversation history for caching, which can significantly reduce costs for long-running sessions with rich system prompts.
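The truncation logic reduces to a budget walk over the history, newest first. A sketch, with a crude 4-characters-per-token estimate standing in for the real tokenizer counts a provider would report:

```python
def estimate_tokens(text: str) -> int:
    """Rough stand-in estimate; real code would use provider token counts."""
    return max(1, len(text) // 4)


def truncate_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep the most recent messages whose cumulative tokens fit the budget."""
    kept, used = [], 0
    for msg in reversed(messages):        # walk newest -> oldest
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break                         # oldest messages fall off first
        kept.append(msg)
        used += cost
    kept.reverse()                        # restore chronological order
    return kept
```

Counting tokens rather than messages is what keeps behavior stable whether a user sends ten short lines or one pasted document.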

Gotcha

The Python 3.11+ requirement is non-negotiable—nanobot uses match statements and other modern syntax that won’t backport. If you’re in an enterprise environment stuck on Python 3.9 or 3.10, you’re blocked at the infrastructure level.

The rapid development pace is both a feature and a risk: the project saw multiple releases per day throughout March 2026, with architectural changes (litellm removal, agent runner extraction, streaming redesigns) landing in minor versions. Semantic versioning exists, but the frequency suggests the API surface isn’t fully stable. Production deployments should pin exact versions and read release notes carefully before upgrading.

The litellm removal means fewer out-of-box providers than pre-v0.1.4.post6 nanobot. If you need providers beyond the currently supported ones (OpenAI, Anthropic, Azure OpenAI, VolcEngine, StepFun, Ollama, and others mentioned in the release notes), you may need to write a custom provider class. The documentation provides guidance, but it’s still integration work. The security trade-off was correct, but it shifted some complexity back onto users who need provider diversity. The project also lacks some enterprise operational features you’d expect from mature frameworks—observability support includes LangSmith integration (added March 13), but you won’t find built-in distributed tracing or multi-tenancy primitives. You can build these yourself, but the minimalist philosophy means they’re not included.

Verdict

Use nanobot if you’re building a self-hosted AI assistant where you need multi-channel support (especially non-English platforms like WeChat, Feishu, DingTalk, QQ), want to actually understand your entire agent stack, or need to customize core behavior without fighting a framework. It’s ideal for side projects, internal tools, and teams that value code transparency over abstraction density. The MCP support future-proofs your tool integrations, and the minimal footprint means you can audit the entire system in an afternoon.

Skip it if you need enterprise-grade API stability (the architecture is still evolving rapidly), require extensive pre-built LLM provider support without custom integration work, can’t upgrade to Python 3.11+, or need comprehensive production observability features out of the box. Also consider your tolerance for supply chain risk management—while the team handled the litellm incident effectively, it demonstrates the reality of modern Python ecosystems. Use this for projects where you can respond quickly to breaking changes and are willing to contribute fixes upstream. For slower-moving enterprise deployments that need API stability, consider waiting for the architecture to stabilize further.
