Dash: A Self-Learning Data Agent That Remembers Its Mistakes

Hook

OpenAI’s internal data teams don’t just use GPT-4 to write SQL—they’ve built a system that learns from every failed query and accumulates institutional knowledge like a senior analyst who never forgets. Now there’s an open-source version.

Context

Text-to-SQL has a dirty secret: the demos look magical, but production deployments fall apart within weeks. An LLM can generate syntactically valid SQL from “show me Q3 revenue,” but it doesn’t know that the revenue table has a six-month lag in data ingestion, that “active users” excludes internal test accounts via a specific flag, or that joining orders to customers requires a specific date range filter to avoid a Cartesian explosion that killed production last quarter. These aren’t schema problems—they’re institutional knowledge problems. Traditional solutions involve writing exhaustive documentation, building semantic layers, or training analysts. But documentation goes stale, semantic layers become bottlenecks, and analysts are expensive.

Dash approaches this differently by implementing what its creators call “GPU-poor continuous learning”—a system that doesn’t fine-tune models but instead builds a progressively smarter retrieval layer. Inspired by OpenAI’s internal tooling, Dash treats every successful query as a pattern to remember and every failure as a lesson to encode. Over time, it accumulates six distinct layers of context: schema definitions, human annotations about business logic, proven query patterns that worked, institutional documentation pulled via Model Context Protocol, error corrections that fixed past mistakes, and runtime database introspection. The result is a data agent that gets better with use, grounding its SQL generation in the collective experience of your entire data team rather than hoping an LLM magically intuits your business rules.

Technical Insight

[Architecture diagram — auto-generated. A user question flows into the Context Retriever, which runs hybrid search (embeddings + keywords over a vector DB) across six sources: schema introspection, business annotations, query patterns, MCP servers (Notion/Confluence/wikis), error learnings, and data sampling from the PostgreSQL database. The assembled context feeds the LLM SQL generator, which produces the SQL query and returns results.]

Dash’s architecture centers on a hybrid retrieval system that queries multiple context stores before generating SQL. When you ask a question, the system doesn’t immediately prompt an LLM—it first fans out search queries across six context repositories. Table schemas come from runtime introspection of your PostgreSQL database. Human annotations are curated business rules like “revenue excludes returns” or “always filter deleted_at IS NULL.” Query patterns are previously successful SQL statements tagged by intent. Institutional docs are pulled through MCP (Model Context Protocol) servers that can connect to Notion, Confluence, or internal wikis. Error learnings capture the before/after of failed queries and their fixes. Finally, runtime introspection samples actual table data to understand distributions and edge cases.

The retrieval layer uses hybrid search—combining dense embeddings for semantic similarity with keyword matching for precise technical terms. This matters because “revenue” and “income” are semantically similar, but “user_id” and “customer_id” might be completely different columns that keyword search can distinguish. Here’s what the context assembly might look like in practice:

# Simplified context retrieval flow
from typing import Dict, List

class ContextRetriever:
    def gather_context(self, question: str) -> Dict[str, List]:
        contexts = {
            "schema": self.get_relevant_tables(question),
            "annotations": self.search_business_rules(question),
            "patterns": self.find_similar_queries(question),
            "docs": self.mcp_search(question),
            "learnings": self.check_error_history(question),
            "introspection": self.sample_data(question)
        }
        return contexts

    def find_similar_queries(self, question: str) -> List["QueryPattern"]:
        # Hybrid search: dense embeddings for semantic similarity,
        # keyword matching for exact technical terms like column names
        embedding = self.embed(question)
        semantic_matches = self.vector_db.search(embedding, top_k=5)
        keyword_matches = self.keyword_index.search(
            self.extract_entities(question), top_k=5
        )
        # Merge both result sets and rerank into a single ordering
        return self.rerank(semantic_matches + keyword_matches)

Once context is assembled, Dash constructs a grounded prompt that includes all six layers, then asks the LLM to generate SQL. The critical insight is that the LLM sees not just “here’s a schema,” but “here’s a schema, here’s why this column is weird, here’s three similar questions we answered successfully, here’s a doc explaining our fiscal calendar, and here’s an error someone hit last week when they forgot to filter test accounts.”
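The assembly step might look like the following sketch. The section names, ordering, and `build_grounded_prompt` function are illustrative assumptions, not Dash's actual prompt format:

```python
# Hypothetical sketch of grounded prompt assembly -- section titles and
# ordering are assumptions for illustration, not Dash's real format.
def build_grounded_prompt(question: str, contexts: dict) -> str:
    sections = {
        "Relevant schema": contexts["schema"],
        "Business rules": contexts["annotations"],
        "Similar past queries": contexts["patterns"],
        "Documentation excerpts": contexts["docs"],
        "Past errors and fixes": contexts["learnings"],
        "Sample data": contexts["introspection"],
    }
    parts = []
    for title, items in sections.items():
        if items:  # skip empty layers to save prompt tokens
            body = "\n".join(f"- {item}" for item in items)
            parts.append(f"## {title}\n{body}")
    parts.append(f"## Question\n{question}\n\nWrite a PostgreSQL query.")
    return "\n\n".join(parts)
```

Skipping empty layers keeps the prompt compact on questions where, say, no prior errors or documentation are relevant.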

When a query succeeds, Dash stores it as Knowledge—a validated pattern that future queries can reference. When a query fails, the Agno Learning Machine kicks in. It captures the error, prompts the LLM to diagnose and fix it, executes the corrected query, and if successful, stores the before/after pair as a Learning. This creates a self-improving feedback loop:

# Learning capture on error
class LearningMachine:
    def handle_error(self, question: str, failed_sql: str, error: str):
        # Ask the LLM to diagnose and fix based on the error message
        fix_prompt = f"""
        This query failed:
        {failed_sql}

        Error: {error}

        Diagnose the issue and provide corrected SQL.
        """
        corrected_sql = self.llm.generate(fix_prompt)

        # Try the fix
        result = self.execute(corrected_sql)
        if result.success:
            # Store the before/after pair as a Learning
            self.learnings.add({
                "question": question,
                "wrong_sql": failed_sql,
                "error": error,
                "correct_sql": corrected_sql,
                "lesson": self.extract_lesson(error, corrected_sql)
            })
            return result
        # Fix also failed: surface the original error to the caller
        return None
The separation between Knowledge (curated patterns) and Learnings (error corrections) is architecturally significant. Knowledge represents explicit expertise—queries a human validated and tagged. Learnings represent discovered expertise—patterns the system found by making and fixing mistakes. This dual memory prevents the system from blindly trusting its own corrections while still benefiting from automated improvement.
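One natural consequence of the dual memory is that retrieval can weight the two stores differently. Below is a minimal sketch of that idea, assuming a simple score-then-tiebreak policy; the data structures and scoring are illustrative, not Dash's internals:

```python
# Hypothetical sketch of dual-memory retrieval: curated Knowledge is
# trusted more than automated Learnings, so Knowledge wins score ties.
# The item structure and ranking policy are illustrative assumptions.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class MemoryItem:
    text: str
    score: float   # retrieval similarity score
    curated: bool  # True for Knowledge, False for Learnings

def rank_memories(
    knowledge: List[Tuple[str, float]],
    learnings: List[Tuple[str, float]],
    top_k: int = 5,
) -> List[MemoryItem]:
    items = [MemoryItem(t, s, True) for t, s in knowledge]
    items += [MemoryItem(t, s, False) for t, s in learnings]
    # Sort by score; break ties in favor of human-validated patterns
    items.sort(key=lambda m: (m.score, m.curated), reverse=True)
    return items[:top_k]
```

The tiebreak encodes the trust asymmetry the article describes: an automated correction can still outrank a curated pattern, but only when its retrieval score is genuinely higher.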

Dash goes one step further than raw SQL generation by interpreting results. After executing a query, it sends the results back to the LLM with business context and asks for a natural language summary. Instead of returning “Revenue: $1.2M,” it might say “Q3 revenue was $1.2M, down 15% from Q2, primarily due to the product recall documented in our October incident report.” This contextual interpretation is what transforms Dash from a SQL writer into a data analyst proxy.
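A minimal sketch of that interpretation step might look like this; the prompt wording, row cap, and `llm.generate` interface are assumptions for illustration, not Dash's actual API:

```python
# Hypothetical sketch of result interpretation: send query results plus
# business context back to the LLM for a plain-language summary.
# The prompt wording and llm interface are illustrative assumptions.
from typing import Iterable

def interpret_results(question: str, sql: str, rows: Iterable,
                      business_context: str, llm) -> str:
    # Cap the result preview so large result sets don't blow the prompt
    preview = "\n".join(str(r) for r in list(rows)[:20])
    prompt = (
        f"Question: {question}\n"
        f"SQL used: {sql}\n"
        f"Results (first rows):\n{preview}\n"
        f"Business context:\n{business_context}\n\n"
        "Summarize the answer in plain language, noting relevant caveats."
    )
    return llm.generate(prompt)
```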

The system is containerized with Docker and designed to integrate with os.agno.com, a web interface for managing knowledge, reviewing learnings, and monitoring agent performance. The tight integration means you’re not just deploying a library—you’re adopting an operational workflow where curating context becomes part of your data team’s routine.

Gotcha

Dash’s self-learning promise comes with a steep setup cost. The system won’t magically understand your data on day one—it needs you to populate annotations, curate initial query patterns, and connect institutional documentation. Cold-start performance will be mediocre until you’ve seeded enough context. This means Dash is a compound investment: early queries require manual intervention to validate and store as knowledge, but after weeks or months of use, the accumulated patterns make it genuinely smart. If you’re evaluating it with a quick two-day proof of concept, you’ll miss the entire value proposition.

The architecture is also tightly coupled to the Agno ecosystem. While the code is open source, it’s designed to work with os.agno.com for UI and management, and there’s no clear path for fully air-gapped or vendor-independent deployment. The learning machine and knowledge storage are PostgreSQL-based, which is flexible, but the MCP integrations and web interface assume you’re comfortable with Agno’s hosted or self-hosted infrastructure. If you’re looking for a drop-in library you can embed in an existing application without external dependencies, Dash isn’t that. It’s a full-stack data agent with opinions about how knowledge should be managed and queries should be reviewed. Additionally, the system appears PostgreSQL-specific for runtime introspection—if you’re working with MySQL, Snowflake, or non-SQL data sources like APIs or document stores, you’ll hit compatibility walls quickly.

Verdict

Use if: You have a data team that repeatedly answers similar analytical questions, you’re willing to invest in curating domain knowledge upfront, and you need a system that compounds in value as it accumulates institutional memory. Dash shines when your SQL complexity comes from business logic rather than schema complexity—when the hard part isn’t writing joins but remembering that “active subscriptions” has seven edge cases and three historical definition changes. It’s ideal for organizations with significant tribal knowledge where onboarding new analysts takes months because “you just have to know” how the data works.

Skip if: You need quick ad-hoc queries without context investment, you’re working with non-PostgreSQL databases or non-SQL data sources, you require fully vendor-neutral tooling without external dependencies, or you can’t commit to the operational overhead of managing a knowledge base. This isn’t a lightweight SQL autocomplete—it’s an infrastructure investment that pays dividends over quarters, not days.
