Progressive Disclosure for AI Agents: How pydantic-ai-skills Solves the Token Budget Problem

Hook

Your AI agent doesn’t need to read 50,000 tokens of documentation every time it starts—but most frameworks force it to anyway. What if tools could introduce themselves with a handshake before sharing their life story?

Context

As AI agent frameworks mature, a paradox emerges: the more capable you make an agent, the more expensive it becomes to initialize. Give your agent access to 20 tools with comprehensive documentation, and you’re burning through thousands of tokens before the agent even starts working. Load all skills upfront with detailed instructions, and your context window fills with information the agent might never use. This is the “eager loading problem” that plagues modern agent architectures.

The Agent Skills specification (agentskills.io) emerged as a standardized format for packaging reusable agent capabilities—think of it as a package manager for AI tools. Anthropic adopted it for Claude, defining a structure where skills live as directories containing YAML metadata, markdown instructions, and executable scripts. But there’s a catch: most implementations load everything at once. DougTrajano’s pydantic-ai-skills takes a different approach, implementing progressive disclosure specifically for Pydantic AI agents. Instead of dumping full documentation into the context, it gives agents three meta-tools to discover and load skills on-demand, treating skill acquisition as a deliberate agent action rather than an upfront configuration burden.

Technical Insight

[System architecture (auto-generated diagram). The Pydantic AI agent's initial prompt receives only skill names and descriptions from the SkillsToolset, which reads each SKILL.yaml in the skills directory and registers three tools against the filesystem: load_skill reads SKILL.yaml plus INSTRUCTIONS.md and returns the full instructions; read_skill_resource reads resources/* and returns file content; run_skill_script executes scripts/* and returns the execution result.]

The architecture cleverly transforms skill management into a tool-calling workflow. When you initialize a SkillsToolset, it registers three core tools—load_skill, read_skill_resource, and run_skill_script—that the agent calls like any other function. The initial agent prompt receives only lightweight skill names and descriptions (typically 10-50 tokens per skill), while full instructions (often 500+ tokens) remain on disk until explicitly requested.
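The savings compound with skill count. A back-of-envelope comparison (the per-skill token figures below are illustrative assumptions, not measurements of the package):

```python
# Illustrative token budget comparison -- all numbers are assumptions, not benchmarks
NUM_SKILLS = 50
FULL_DOC_TOKENS = 800      # full INSTRUCTIONS.md per skill
SUMMARY_TOKENS = 30        # name + description per skill
SKILLS_ACTUALLY_USED = 3   # skills the agent loads on-demand in a typical run

eager_cost = NUM_SKILLS * FULL_DOC_TOKENS
lazy_cost = NUM_SKILLS * SUMMARY_TOKENS + SKILLS_ACTUALLY_USED * FULL_DOC_TOKENS

print(f"eager loading: {eager_cost} tokens")    # 40000
print(f"progressive:   {lazy_cost} tokens")     # 3900
```

Even under these rough numbers, progressive disclosure cuts the startup cost by roughly an order of magnitude, and the gap widens as skills grow while usage per run stays small.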

Here’s how you set it up:

from pydantic_ai import Agent
from pydantic_ai_skills import SkillsToolset

# Initialize with filesystem-based skills
skills_toolset = SkillsToolset(
    skills_dir="./skills",  # Directory containing SKILL.yaml files
    allowed_scripts=True     # Enable script execution
)

agent = Agent(
    "openai:gpt-4",
    toolsets=[skills_toolset]
)

# The agent now has access to three meta-tools.
# (Run inside an async function; use agent.run_sync(...) in synchronous code.)
result = await agent.run(
    "Help me analyze this CSV file"
)

Under the hood, each skill lives as a directory with a specific structure:

skills/
├── data-analysis/
│   ├── SKILL.yaml          # Metadata: name, description, version
│   ├── INSTRUCTIONS.md     # Full documentation (loaded on-demand)
│   ├── resources/
│   │   └── example.csv     # Accessible via read_skill_resource
│   └── scripts/
│       └── analyze.py      # Executable via run_skill_script

The SKILL.yaml follows the Agent Skills specification:

name: data-analysis
description: Analyze CSV files and generate statistical summaries
version: 1.0.0
author: your-org
tags:
  - data
  - analytics
resources:
  - name: example.csv
    description: Sample dataset for testing
scripts:
  - name: analyze.py
    description: Generates mean, median, mode for numeric columns
    arguments:
      - name: filepath
        description: Path to CSV file
        required: true

When the agent encounters a task requiring data analysis, it first calls load_skill("data-analysis") to retrieve the INSTRUCTIONS.md content, then optionally read_skill_resource("data-analysis", "example.csv") if it needs reference material. This two-phase loading means the agent only pays token costs for skills it actually uses.

The package also supports programmatic skills for dynamic scenarios:

from pydantic_ai_skills import Skill, SkillResource, SkillScript
from dataclasses import dataclass

# Define a skill entirely in Python
analysis_skill = Skill(
    name="runtime-analyzer",
    description="Dynamic analysis tool generated at runtime",
    version="1.0.0",
    instructions="Analyze datasets using pandas. Always check for null values first.",
    resources=[
        SkillResource(
            name="schema.json",
            content='{"columns": ["id", "value"]}'
        )
    ],
    scripts=[
        SkillScript(
            name="quick-stats.py",
            description="Fast statistical summary",
            content="import sys\nimport pandas as pd\nprint(pd.read_csv(sys.argv[1]).describe())"
        )
    ]
)

skills_toolset = SkillsToolset(
    skills=[analysis_skill],  # Pass programmatic skills directly
    allowed_scripts=True
)

This hybrid approach—filesystem skills for stable, version-controlled capabilities plus programmatic skills for runtime generation—creates powerful composition patterns. You could, for instance, generate custom analysis skills based on database schemas discovered at startup, or create user-specific skills that adapt to individual preferences.
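For example, a per-table query skill derived from a schema discovered at startup might be built like this. The sketch uses stand-in dataclasses so it runs standalone; in real code you would import Skill and SkillResource from pydantic_ai_skills as shown above, and skill_from_schema is a hypothetical helper:

```python
import json
from dataclasses import dataclass, field

# Stand-in dataclasses so the sketch runs without pydantic-ai-skills installed;
# in real code, import Skill and SkillResource from the package instead.
@dataclass
class SkillResource:
    name: str
    content: str

@dataclass
class Skill:
    name: str
    description: str
    version: str
    instructions: str
    resources: list = field(default_factory=list)

def skill_from_schema(table: str, columns: list[str]) -> Skill:
    """Build a per-table query skill from a schema discovered at startup."""
    return Skill(
        name=f"query-{table}",
        description=f"Query the {table} table ({len(columns)} columns)",
        version="0.1.0",
        instructions=f"Write SQL against {table}. Valid columns: {', '.join(columns)}.",
        resources=[SkillResource(name="schema.json",
                                 content=json.dumps({table: columns}))],
    )
```

Each discovered table costs only a one-line description in the initial prompt; the column details load only when the agent actually queries that table.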

The security model deserves attention. Path traversal attacks are prevented through validation, and script execution happens in a configurable working directory with explicit argument passing (no shell=True). However, this isn’t containerization—untrusted skills still run in your process space. The run_skill_script tool captures stdout/stderr and returns exit codes, allowing agents to debug failed executions, but you’ll want additional sandboxing for production environments handling user-submitted skills.
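Containment checks of this kind typically resolve the requested path and verify it stays under the skill root; a minimal sketch (not the library's actual validation code):

```python
from pathlib import Path

def safe_resource_path(skill_root: str, requested: str) -> Path:
    """Reject path traversal: the resolved path must stay inside the skill root."""
    root = Path(skill_root).resolve()
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"'{requested}' escapes the skill directory")
    return candidate
```

Resolving before comparing is the important step: it collapses `..` segments (and symlinks on existing paths) so a request like `../../etc/passwd` cannot slip past a naive string-prefix check.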

Gotcha

The progressive disclosure model assumes your LLM actually makes good decisions about when to load skills. If you’re using a less capable model or haven’t engineered prompts carefully, the agent might skip load_skill entirely and attempt tasks with only the brief description. I’ve seen agents hallucinate skill capabilities based on names alone, never bothering to read the actual instructions. This means prompt engineering becomes critical—you need to train agents to “read the manual” before attempting complex operations.
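One mitigation is an explicit system-prompt rule that forces the two-phase flow. The wording below is illustrative, not a tested recipe:

```python
# Illustrative system-prompt addition; tune the wording for your model.
SKILL_DISCIPLINE_PROMPT = """\
You have access to skills via load_skill, read_skill_resource, and run_skill_script.
Before using any skill for the first time in a conversation, you MUST call
load_skill(<name>) and follow its instructions. Never act on a skill's name or
one-line description alone.
"""

# Passed as (or appended to) the agent's system prompt, e.g.:
# agent = Agent("openai:gpt-4", system_prompt=SKILL_DISCIPLINE_PROMPT, toolsets=[...])
```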

Script execution is powerful but dangerous. The framework provides basic path safety and working directory isolation, but there’s no resource limiting, no network sandboxing, and no protection against infinite loops or fork bombs. If you’re loading skills from untrusted sources (user uploads, public registries), you absolutely need additional security layers—consider Docker containers, WebAssembly sandboxing, or restricted subprocess environments. The current implementation is perfect for first-party skills you control, risky for anything else. Also worth noting: with only 116 stars, the community skill ecosystem is embryonic. Don’t expect a rich marketplace of pre-built skills like you’d find with LangChain or even Anthropic’s official Agent Skills repository.

Verdict

Use if: You’re building Pydantic AI agents that handle diverse domains where loading all capabilities upfront wastes tokens—think customer service bots with 50+ specialized skills, or research assistants that tap into domain-specific knowledge on-demand. It’s especially valuable when you need Agent Skills specification compatibility (interop with Anthropic’s ecosystem) while staying within Pydantic AI’s architecture, or when you’re implementing multi-tenant systems where different users get different skill sets. The progressive disclosure model shines when token budgets are tight and skills are numerous.

Skip if: You’re working with fewer than 5-10 tools (native Pydantic AI tools are simpler), need enterprise-grade sandboxing without adding your own security infrastructure, or want a mature ecosystem of community-contributed skills (LangChain or official Anthropic routes offer more). Also skip if you’re not using Pydantic AI at all—this is framework-specific. For simple agents with static capabilities, the abstraction overhead outweighs the benefits.
