Progressive Disclosure: How pydantic-ai-skills Minimizes Token Waste in Agent Tool Systems

Hook

Most agent frameworks load all tool documentation into every LLM call, burning tokens on irrelevant context. What if agents could discover capabilities first, then load instructions only when needed?

Context

As AI agents grow more capable, they accumulate dozens or hundreds of specialized tools—API integrations, data processors, domain-specific utilities. Traditional approaches dump all tool descriptions, parameters, and usage instructions into the system prompt, creating a token economics nightmare. A research assistant with many tools might waste substantial tokens on irrelevant documentation when the user has a specific task in mind.

pydantic-ai-skills tackles this through progressive disclosure: agents see lightweight skill summaries (name and one-line description), then explicitly load full instructions only for relevant skills. Built on Anthropic’s open Agent Skills specification, it provides a standardized format for packaging tools as filesystem directories or Python definitions. The framework integrates with Pydantic AI’s architecture, offering both a modern Capabilities API integration (version 1.71+) and a Toolset-based approach for earlier versions.

Technical Insight

The architecture centers on a three-tier discovery pattern. First, agents receive minimal skill metadata—just names and brief descriptions in their system prompt. Second, when a skill becomes relevant, the agent calls load_skill(skill_name) to retrieve full markdown instructions and usage examples. Third, the agent can read additional resources with read_skill_resource() or execute scripts via run_skill_script(). This lazy-loading approach dramatically reduces baseline token consumption.
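As a rough illustration (not the library's actual internals), the three tiers can be sketched over an in-memory registry. The skill content below is made up for the example:

```python
# Illustrative sketch of the three-tier discovery pattern:
# tier 1 exposes summaries only, tier 2 loads full instructions on demand,
# tier 3 reads extra resources (or runs scripts) for an already-loaded skill.

SKILLS = {
    "arxiv-search": {
        "summary": "Search arXiv for papers by keyword.",
        "instructions": "# arXiv Search\nCall the arXiv API with a query string...",
        "resources": {"examples.md": "query='transformer' max_results=5"},
    },
}

def skill_summaries() -> str:
    """Tier 1: the only skill text included in the base system prompt."""
    return "\n".join(f"- {name}: {s['summary']}" for name, s in SKILLS.items())

def load_skill(skill_name: str) -> str:
    """Tier 2: full markdown instructions, fetched only when relevant."""
    return SKILLS[skill_name]["instructions"]

def read_skill_resource(skill_name: str, resource: str) -> str:
    """Tier 3: supporting files referenced by the instructions."""
    return SKILLS[skill_name]["resources"][resource]
```

The agent's base prompt carries only the output of the tier-1 summary list; the heavier tiers cost tokens only when invoked.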

For Pydantic AI 1.71+, integration uses the Capabilities API, which bundles tools and instructions into a cohesive unit:

from pydantic_ai import Agent
from pydantic_ai_skills import SkillsCapability

agent = Agent(
    model='openai:gpt-4o',
    instructions='You are a helpful research assistant.',
    capabilities=[SkillsCapability(directories=['./skills'])]
)

# The agent now has access to all skills in ./skills directory
# Skills are loaded progressively as the agent determines they're needed
# Inside an async context; outside one, use agent.run_sync(...) instead.
result = await agent.run(
    "What are the latest machine learning papers on arXiv?"
)

Under the hood, SkillsCapability scans the filesystem for skill packages—directories containing a skill.json manifest and an instructions.md file. The manifest defines metadata like skill name, description, supported models, and required dependencies. When the agent sees “arXiv” in the user query, it recognizes the arxiv-search skill from the summary list and calls load_skill('arxiv-search') to retrieve the full instructions markdown.
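A minimal skill package for that arxiv-search example might look like the sketch below. The exact manifest field names are assumptions inferred from the description above, not verified against the specification:

```python
import json
import pathlib
import tempfile

# Build a hypothetical skill package:
#   skills/arxiv-search/
#     skill.json        (manifest: name, description, models, dependencies)
#     instructions.md   (full markdown loaded by load_skill)
#     scripts/          (executables for run_skill_script)
root = pathlib.Path(tempfile.mkdtemp()) / "skills" / "arxiv-search"
(root / "scripts").mkdir(parents=True)

manifest = {
    "name": "arxiv-search",
    "description": "Search arXiv for recent papers by keyword.",
    "models": ["openai:gpt-4o"],      # supported models (assumed field name)
    "dependencies": ["arxiv>=2.1"],   # required packages (assumed field name)
}
(root / "skill.json").write_text(json.dumps(manifest, indent=2))
(root / "instructions.md").write_text(
    "# arXiv Search\n\nUse the script in scripts/ with a query string.\n"
)
```

Only the manifest's name and description reach the summary list; instructions.md stays on disk until the skill is loaded.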

For earlier Pydantic AI versions, the SkillsToolset approach requires manual instruction injection but offers more control:

from pydantic_ai import Agent, RunContext
from pydantic_ai_skills import SkillsToolset

skills_toolset = SkillsToolset(directories=["./skills", "./custom_skills"])

agent = Agent(
    model='openai:gpt-4o',
    instructions='You are a helpful research assistant.',
    toolsets=[skills_toolset]
)

@agent.instructions
async def add_skills(ctx: RunContext) -> str | None:
    return await skills_toolset.get_instructions(ctx)

The framework supports both filesystem-based and programmatic skill definitions. Filesystem skills are directories with standardized structure containing instructions, scripts, tools, and resources. Programmatic skills use Python decorators or dataclasses, enabling dynamic skill generation based on runtime conditions or API introspection.
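A programmatic definition might look something like the following sketch; the class and field names here are hypothetical stand-ins, since the real pydantic-ai-skills API may differ:

```python
from dataclasses import dataclass, field

@dataclass
class Skill:
    """Hypothetical programmatic skill shape (illustrative only)."""
    name: str
    description: str
    instructions: str
    resources: dict[str, str] = field(default_factory=dict)

def make_search_skill(api_base: str) -> Skill:
    """Generate a skill dynamically from runtime configuration,
    e.g. after introspecting an internal API."""
    return Skill(
        name="internal-search",
        description=f"Search the internal index at {api_base}.",
        instructions=f"POST queries to {api_base}/search as JSON.",
    )
```

The dynamic-generation angle is the point: a factory like `make_search_skill` can mint skills per tenant, per environment, or per discovered API endpoint at startup.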

Security considerations are baked into the design. Path traversal attacks are prevented through strict validation, and script execution uses subprocess isolation. The framework validates that requested scripts exist within the skill’s designated scripts directory before execution. Type safety comes from Python dataclasses with validation, ensuring skill definitions match the Agent Skills specification schema.
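The path-validation idea can be sketched with the standard library alone; this is the general technique described above, not the framework's actual code:

```python
from pathlib import Path

def validate_script_path(skill_dir: str, script: str) -> Path:
    """Resolve the requested script and confirm it stays inside the
    skill's designated scripts/ directory, rejecting traversal attempts."""
    scripts_root = (Path(skill_dir) / "scripts").resolve()
    candidate = (scripts_root / script).resolve()
    if not candidate.is_relative_to(scripts_root):  # Python 3.9+
        raise ValueError(f"path escapes scripts directory: {script}")
    return candidate
```

Resolving both paths before comparing defeats `../` sequences and absolute-path arguments alike, since the check runs on the final normalized location rather than the raw string.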

The progressive disclosure pattern shines in multi-skill scenarios. An agent with many skills might only load instructions for a few relevant ones per conversation, potentially saving thousands of tokens compared to loading everything upfront. For teams running substantial agent workloads, this compounds into meaningful cost savings and faster response times from reduced context windows.
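A back-of-envelope calculation makes the savings concrete. The numbers below are made up but plausible: 50 skills, roughly 300 tokens of full documentation each versus 15-token summaries, with only 3 skills actually loaded in a typical conversation:

```python
skills, full_doc, summary, loaded = 50, 300, 15, 3

# Everything-upfront: all documentation in every prompt.
upfront = skills * full_doc                          # 15000 tokens

# Progressive disclosure: summaries always, full docs only when loaded.
progressive = skills * summary + loaded * full_doc   # 750 + 900 = 1650 tokens

print(f"saved {1 - progressive / upfront:.0%} per call")  # saved 89% per call
```

The ratio only improves as the skill count grows, since summaries scale linearly while loaded instructions stay roughly constant per conversation.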

Gotcha

The framework’s tight coupling to Pydantic AI is both its strength and limitation. You cannot use these skills with LangChain, AutoGen, Semantic Kernel, or other agent frameworks without substantial adaptation work—the tool-calling mechanisms and integration patterns are Pydantic AI-specific. While the underlying Agent Skills specification is framework-agnostic, this particular implementation doesn’t provide adapters for other ecosystems.

Script execution introduces operational complexity. The subprocess-based approach for running skill scripts means you need proper environment setup, dependency management, and error handling for external scripts. In containerized deployments, you’ll need to ensure script interpreters (Python, Bash, Node.js) are available in the runtime environment. The security mechanisms help, but running agent-triggered code execution requires careful consideration of sandboxing and resource limits. Additionally, with only 207 GitHub stars, the ecosystem of pre-built, community-vetted skills is nascent compared to established tool marketplaces. You’ll likely need to author most skills yourself or adapt existing tools into the Agent Skills format.

Verdict

Use pydantic-ai-skills if you’re building production Pydantic AI agents where token efficiency matters and you need modular, reusable capabilities. It’s particularly valuable for complex agents juggling multiple specialized skills (research, data analysis, API integrations) where loading everything upfront becomes expensive. The progressive disclosure pattern delivers real cost savings at scale, and adopting the Agent Skills specification positions you for future interoperability as the standard gains traction. Teams wanting standardized skill packaging with type safety and security guardrails will appreciate the structured approach.

Skip it if you’re using a different agent framework, building simple single-purpose agents where loading a few tools upfront is negligible, or need a mature ecosystem with hundreds of ready-made skills—you’ll find richer tool libraries in LangChain or OpenAI’s function calling ecosystem. The early-stage nature means you’re betting on both Pydantic AI and the Agent Skills specification gaining adoption, which may or may not materialize.
