Kor: Extracting Structured Data When Your LLM Doesn't Have Function Calling

Hook

In 2024, many developers reaching for structured extraction tools are solving the wrong problem: they're using prompt engineering frameworks when their LLM already has native function calling built in.

Context

Before OpenAI added function calling to its chat API in June 2023, extracting structured data from LLMs was messy. You'd prompt a model to return JSON, cross your fingers, and hope the output was valid. Maybe you'd add "Return your response as JSON" to your prompt. Maybe you'd write brittle regex parsers. Maybe you'd retry three times and give up. The fundamental problem was that language models were trained to generate natural language, not to conform to schemas.

Kor emerged as a solution to this chaos by treating structured extraction as a prompt engineering problem. Instead of hoping the LLM would magically return valid JSON, Kor generates carefully crafted prompts that include your schema definition, examples of correct extractions, and clear instructions. It then parses the LLM's response, can validate it against your Pydantic models, and handles the inevitable edge cases. This was genuinely useful before function calling arrived. But here's the tension: the README itself now tells you to use the tool-calling API instead if your chat model supports one. So why does this library still matter?

Technical Insight

Kor's architecture is built around a deceptively simple idea: if you show an LLM enough examples of the structure you want, it will imitate that structure in its own output. The library gives you two ways to define a schema: Object for declaring one by hand and from_pydantic for converting existing models. Either way, the schema and its examples are compiled into one comprehensive prompt.

Here's what a basic extraction looks like:

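from kor import create_extraction_chain, Object, Text
# ChatOpenAI now lives in the langchain-openai package; older LangChain
# releases exposed it as langchain.chat_models.ChatOpenAI.
from langchain_openai import ChatOpenAI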

# Define your schema using Kor's native objects
person_schema = Object(
    id="person",
    description="Personal information about an individual",
    attributes=[
        Text(id="name", description="The person's full name"),
        Text(id="email", description="Email address"),
        Text(id="company", description="Company name")
    ],
    examples=[
        ("John Smith works at Acme Corp and can be reached at john@acme.com",
         {"name": "John Smith", "email": "john@acme.com", "company": "Acme Corp"}),
        ("Contact Jane Doe (jane.doe@example.org) from Example Inc",
         {"name": "Jane Doe", "email": "jane.doe@example.org", "company": "Example Inc"})
    ]
)

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
chain = create_extraction_chain(llm, person_schema)

result = chain.invoke("Sarah Johnson from DataCo sent me an email at sarah.j@dataco.io")
print(result["data"])

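What happens under the hood is where Kor gets interesting. The library takes your schema and examples and compiles them into one long few-shot prompt. The exact wording depends on the type descriptor and encoder you choose (by default Kor describes the schema in a TypeScript-like notation and encodes example outputs as CSV), so treat the following as a simplified illustration of the idea rather than the literal prompt: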

Your goal is to extract structured information from the user's input.
The extracted information should match the format specified below.

<object id="person">
Personal information about an individual
<attributes>
- name: The person's full name
- email: Email address  
- company: Company name
</attributes>

<examples>
Input: John Smith works at Acme Corp and can be reached at john@acme.com
Output: {"name": "John Smith", "email": "john@acme.com", "company": "Acme Corp"}
...
</examples>
</object>

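Whatever the exact rendering, the prompt carries three things: a description of each field, instructions about the output format (Kor's encoders can wrap the output in XML-like tags so it is easy to locate in the response), and, critically, the examples you provided. These few-shot examples are what make Kor work with older models that don't understand function schemas.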
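To make the idea concrete, here is a minimal sketch of how a schema and its examples could be flattened into a few-shot prompt. The `build_prompt` helper and its argument names are hypothetical and exist only for illustration; this is not Kor's actual implementation, and the wording of the instructions is borrowed from the simplified prompt above.

```python
import json


def build_prompt(object_id, description, attributes, examples, user_text):
    """Render a schema and worked examples into a single few-shot prompt.

    Hypothetical helper for illustration; not Kor's actual implementation.
    """
    lines = [
        "Your goal is to extract structured information from the user's input.",
        "The extracted information should match the format specified below.",
        "",
        f"{object_id}: {description}",
    ]
    # One line per field so the model sees both the key and its meaning.
    for name, field_description in attributes.items():
        lines.append(f"- {name}: {field_description}")
    lines.append("")
    # Each example pairs an input with the exact output encoding we expect back.
    for text, data in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {json.dumps(data)}")
    # End with an open "Output:" slot for the model to complete.
    lines.append(f"Input: {user_text}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_prompt(
    "person",
    "Personal information about an individual",
    {"name": "The person's full name", "email": "Email address"},
    [("Contact Jane Doe (jane.doe@example.org)",
      {"name": "Jane Doe", "email": "jane.doe@example.org"})],
    "Sarah Johnson wrote from sarah.j@dataco.io",
)
print(prompt)
```

Ending the prompt on an open "Output:" line is the design choice that matters here: a plain completion model has no notion of a schema, but it will readily continue a pattern it has just seen repeated.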

Kor also supports Pydantic models directly, which is where it becomes more practical for existing codebases:

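from pydantic import BaseModel, Field
from kor import create_extraction_chain, from_pydantic

class Person(BaseModel):
    name: str = Field(description="The person's full name")
    age: int = Field(description="Age in years")
    skills: list[str] = Field(description="List of professional skills")

# from_pydantic returns both the Kor schema and a matching validator
schema, validator = from_pydantic(Person, examples=[
    ("Alex, 32, knows Python and Rust",
     {"name": "Alex", "age": 32, "skills": ["Python", "Rust"]})
])

# Reuses `llm` from the previous example. The JSON encoder is chosen here
# because the schema contains a list field, which the default CSV encoding
# handles poorly.
chain = create_extraction_chain(
    llm, schema, encoder_or_encoder_class="json", validator=validator
)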

The from_pydantic function performs double duty: it generates the Kor schema from your Pydantic model's field definitions and descriptions, and it returns a validator that checks the LLM's output against your types. This validator catches type mismatches, missing required fields, and constraint violations before they reach your application code. When a validator is supplied, the chain's result reports the validated instances under validated_data and any failures under errors, alongside the unvalidated data.
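The validation step can be sketched without any dependencies. The example below stands in for what a Pydantic-backed validator does: parse the raw model output, check each field, and return either a typed object or a list of errors. The `validate_person` helper and the dataclass are assumptions made for this illustration, not part of Kor or Pydantic.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int
    skills: list


def validate_person(raw_output):
    """Parse model output and check it against Person.

    Returns (person, errors); person is None when validation fails.
    Hypothetical helper standing in for a Pydantic-backed validator.
    """
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["expected a JSON object"]

    errors = []
    for field in fields(Person):
        if field.name not in data:
            errors.append(f"missing field: {field.name}")
        elif not isinstance(data[field.name], field.type):
            errors.append(f"wrong type for {field.name}")
    if errors:
        return None, errors

    # Ignore any extra keys the model invented and keep only declared fields.
    return Person(**{f.name: data[f.name] for f in fields(Person)}), []


good, good_errors = validate_person(
    '{"name": "Alex", "age": 32, "skills": ["Python", "Rust"]}'
)
bad, bad_errors = validate_person('{"name": "Alex", "age": "thirty-two"}')
print(good, good_errors)
print(bad, bad_errors)
```

Returning errors instead of raising keeps the caller in control: a failed record can be logged, retried with a different prompt, or dropped, without one malformed response aborting a whole batch.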

The library handles both Pydantic v1 and v2, which is non-trivial given the breaking changes between versions. It auto-detects which version you're using and adjusts its parsing logic accordingly. This compatibility layer explains some of the code complexity—Kor essentially maintains two parallel validation paths.

One underappreciated feature is nested object extraction. You can define objects within objects, and Kor will generate prompts that guide the LLM through extracting hierarchical data. Nested structures do not fit a flat CSV encoding, so they are normally paired with the JSON encoder. This is where example quality becomes critical: a complex nested structure with poor examples will confuse older models, but good examples can extract surprising amounts of structured information even from basic completion models.
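One way to picture how a hierarchical schema ends up in a prompt is a recursive renderer that indents each nested object. The sketch below uses plain dictionaries and a hypothetical `describe` function; the TypeScript-like output format is an approximation chosen for readability, not a reproduction of Kor's own type descriptor.

```python
def describe(schema, indent=0):
    """Recursively render a nested schema as indented, TypeScript-like lines."""
    pad = " " * indent
    lines = [f"{pad}{schema['id']}: {{ // {schema['description']}"]
    for attribute in schema["attributes"]:
        if "attributes" in attribute:
            # A nested object: recurse one level deeper.
            lines.extend(describe(attribute, indent + 2))
        else:
            lines.append(
                f"{pad}  {attribute['id']}: {attribute['type']}"
                f" // {attribute['description']}"
            )
    lines.append(f"{pad}}}")
    return lines


person = {
    "id": "person",
    "description": "A person mentioned in the text",
    "attributes": [
        {"id": "name", "type": "string", "description": "Full name"},
        {
            "id": "employer",
            "description": "Where the person works",
            "attributes": [
                {"id": "company", "type": "string", "description": "Company name"},
                {"id": "city", "type": "string", "description": "Office location"},
            ],
        },
    ],
}

rendered = "\n".join(describe(person))
print(rendered)
```

The examples supplied alongside such a schema need to mirror the same nesting, because the model copies the shape of the example outputs far more reliably than it interprets the type description.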

Gotcha

The performance characteristics are rough. Every extraction includes your full schema, all your examples, and the input text in a single prompt. With five examples and a detailed schema, you're easily looking at 1,000+ tokens before the actual content. This means two things: slower responses and higher costs. The library's own documentation admits this isn't fast, and recommends using larger, more expensive models for better results—which compounds the cost problem.
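A back-of-the-envelope estimate shows how quickly that fixed overhead adds up. The sketch below assumes roughly four characters per token, a common rule of thumb for English text; real counts depend on the model's tokenizer, and the function names and sample sizes are hypothetical.

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate; real counts depend on the model's tokenizer."""
    return -(-len(text) // chars_per_token)  # ceiling division


def prompt_overhead(schema_text, examples, chars_per_token=4):
    """Tokens spent on schema and examples before any input text is added."""
    example_text = "\n".join(
        f"Input: {source}\nOutput: {output}" for source, output in examples
    )
    return estimate_tokens(schema_text + "\n" + example_text, chars_per_token)


# Hypothetical sizes: a 1,500-character schema and five 500-character examples.
schema_text = "x" * 1500
examples = [("i" * 300, "o" * 200)] * 5

overhead = prompt_overhead(schema_text, examples)
documents = 10_000
print(f"overhead per call: ~{overhead} tokens")
print(f"overhead across {documents} documents: ~{overhead * documents} tokens")
```

Because the overhead is paid on every call, trimming one redundant example saves tokens on every document in the batch, which is usually a better trade than shortening field descriptions.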

Context length is your enemy. Long documents don't fit well in Kor's paradigm because you need room for the schema, examples, and input text within the model's context window. The naive solution—truncating the input—often removes the exact information you're trying to extract. Some users chunk documents, but then you lose cross-chunk context. There's no good solution here except using models with massive context windows, which brings us back to the cost problem.
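The chunking workaround can be sketched as overlapping windows plus a merge step that removes duplicates introduced by the overlap. Both helpers below are hypothetical, work on characters rather than tokens for simplicity, and assume each extracted record is a flat dictionary with hashable values. The overlap softens the cross-chunk problem but does not solve it: a fact split across a wider span than the overlap is still lost.

```python
def chunk_text(text, max_chars, overlap):
    """Split text into windows of at most max_chars characters.

    Each window shares `overlap` characters with the previous one.
    """
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    step = max_chars - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def merge_records(per_chunk_records):
    """Combine records from all chunks, dropping exact duplicates."""
    seen = set()
    merged = []
    for records in per_chunk_records:
        for record in records:
            key = tuple(sorted(record.items()))
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged


chunks = chunk_text("abcdefghij", max_chars=4, overlap=1)
print(chunks)

# In practice each chunk would be sent through the extraction chain;
# these per-chunk results are made up for the illustration.
merged = merge_records([[{"name": "A"}], [{"name": "A"}, {"name": "B"}]])
print(merged)
```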

But the fundamental limitation is stated plainly in the README: for modern models with function calling, you shouldn't use Kor at all. GPT-4, Claude 3, and similar models have native structured output capabilities that are faster, more reliable, and better integrated. Kor was designed to work around a limitation that largely no longer exists. This puts the library in an awkward position—it's a specialized tool for legacy systems, maintained primarily for backwards compatibility rather than active development. The 'half-baked prototype' warning in the documentation isn't false modesty; it's an accurate description of the project's scope and stability expectations.

Verdict

Use if: You're stuck with LLMs that lack function calling (legacy GPT-3 deployments, older open-source models, or cost constraints that force you onto basic completion models) and you still want typed, Pydantic-validated output, or you're researching prompt-based extraction as an alternative to function calling.

Skip if: You have access to any modern chat model with native function calling (GPT-4, Claude 3, Gemini Pro), you need production-grade performance and reliability, or you're starting a new project. In those cases, use LangChain's built-in tool calling, Instructor, or a similar modern alternative instead.

Kor solved a real problem that mostly doesn't exist anymore. For the narrow cases where you're genuinely constrained to older models, it remains one of the more mature options.
