Kor: When Your LLM Doesn’t Have Function Calling (And Why You Probably Don’t Need It)
Hook
The GitHub README for Kor opens with a warning telling you not to use it. That’s either refreshingly honest or a massive red flag—turns out, it’s both.
Context
In 2023, the LLM landscape split into two eras: before and after function calling. Modern chat models like GPT-4, Claude, and Gemini can now return structured JSON through native tool calling APIs, making extraction tasks feel almost trivial. But what about the dozens of older models still in production? What about open-source LLMs that lack these features? What about teams locked into legacy systems or air-gapped environments?
Kor emerged as a bridge solution for this exact problem. Built on top of the LangChain framework, it takes a fundamentally different approach: instead of relying on model-native features, it generates elaborate prompts containing schema definitions and examples, sends them to any capable LLM, and parses the structured output from the raw text response. It’s prompt engineering weaponized for data extraction, a technique that works with virtually any language model, not just the latest flagship releases. The tradeoff? Performance, reliability, and token costs all suffer compared to native tool calling. Kor’s own documentation describes it as a ‘half-baked prototype’ and actively steers users toward modern alternatives. Yet it persists, with roughly 1,690 GitHub stars, serving a specific niche where compatibility trumps cutting-edge features.
Technical Insight
Kor’s architecture centers on a schema-driven prompt generation system. You define what you want to extract using either Kor’s custom DSL (Object, Text, Number, etc.) or standard Pydantic models, and the library constructs a detailed prompt that includes schema documentation, field descriptions, and critically—examples. These examples aren’t just nice-to-haves; they’re the primary quality lever in Kor’s extraction pipeline.
Here’s a complete extraction chain using Kor’s native schema syntax:
from langchain.chat_models import ChatOpenAI
from kor import create_extraction_chain, Object, Text

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

schema = Object(
    id="player",
    description="User is controlling a music player to select songs, pause or start them",
    attributes=[
        Text(
            id="song",
            description="User wants to play this song",
            many=True,
        ),
        Text(
            id="artist",
            description="Music by the given artist",
            examples=[("Songs by paul simon", "paul simon")],
            many=True,
        ),
        Text(
            id="action",
            description="Action to take: play, stop, next, previous",
            examples=[
                ("Please stop the music", "stop"),
                ("play something", "play"),
                ("next song", "next"),
            ],
        ),
    ],
)

chain = create_extraction_chain(llm, schema, encoder_or_encoder_class="json")
result = chain.invoke("play songs by paul simon and led zeppelin")
print(result["data"])
# Output: {'player': {'artist': ['paul simon', 'led zeppelin']}}
Under the hood, Kor transforms this schema into a verbose prompt that instructs the LLM on extraction expectations, provides format specifications, and embeds the examples as few-shot demonstrations. The LLM generates a response, and Kor’s parser extracts structured data from the output—no function calling required.
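To make that mechanism concrete, here is a minimal, self-contained sketch of the same idea: a schema description and few-shot examples rendered into one big prompt, and a JSON payload parsed back out of a raw, chatty completion. This is an illustration of the technique, not Kor's actual implementation; the function names and prompt wording are invented for the sketch.

```python
import json
import re


def build_prompt(schema_doc: str, examples: list[tuple[str, str]], text: str) -> str:
    """Render a schema description and few-shot examples into a single prompt.

    Illustrative only; Kor's real prompt template is more elaborate.
    """
    shots = "\n".join(f"Input: {inp}\nOutput: {out}" for inp, out in examples)
    return (
        "Extract data matching this schema and reply with JSON only.\n\n"
        f"{schema_doc}\n\n{shots}\n\nInput: {text}\nOutput:"
    )


def parse_completion(raw: str) -> dict:
    """Pull the first JSON object out of a raw LLM completion."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        return {}  # the model ignored the format instructions entirely
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return {}  # the model produced something JSON-ish but invalid


prompt = build_prompt(
    "player: {artist: list of strings, action: play|stop|next}",
    [("Songs by paul simon", '{"player": {"artist": ["paul simon"]}}')],
    "play songs by led zeppelin",
)

# Pretend this came back from the LLM, wrapped in chatty text:
raw = 'Sure! Here is the extraction:\n{"player": {"artist": ["led zeppelin"]}}'
print(parse_completion(raw))  # {'player': {'artist': ['led zeppelin']}}
```

Everything the model needs to know travels inside the prompt on every single call, which is exactly why the approach works with any capable LLM and why it costs so many tokens.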
For teams already invested in Pydantic for validation, Kor supports that too. You can define schemas using standard Pydantic models with Field descriptions and examples, gaining type safety and validation hooks:
import enum
from typing import List, Optional

from pydantic import BaseModel, Field


class Action(enum.Enum):
    play = "play"
    stop = "stop"
    next_ = "next"


class MusicRequest(BaseModel):
    song: Optional[List[str]] = Field(
        default=None,
        description="The song(s) that the user would like to be played.",
    )
    artist: Optional[List[str]] = Field(
        default=None,
        description="The artist(s) whose music the user would like to hear.",
        examples=[("Songs by paul simon", "paul simon")],
    )
    action: Optional[Action] = Field(
        default=None,
        description="Action to take",
    )
This Pydantic-first approach became fully supported in Kor 1.0.0, which achieved compatibility with both Pydantic v1 and v2, a non-trivial feat given the breaking changes between the two major versions. The library handles validation through optional Pydantic validators, catching malformed extractions before they propagate through your application.
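The payoff of the Pydantic route is that a malformed extraction fails loudly at the validation boundary rather than surfacing later as a mystery bug. A minimal sketch of that validation step, run by hand on a parsed payload outside of Kor (the model here is a condensed version of MusicRequest, with action as a plain string to keep it short):

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class MusicRequest(BaseModel):
    """Condensed version of the model above."""
    song: Optional[List[str]] = None
    artist: Optional[List[str]] = None
    action: Optional[str] = Field(default=None, description="play, stop, or next")


# A well-formed extraction validates cleanly:
ok = MusicRequest(artist=["paul simon"], action="play")
print(ok.artist)  # ['paul simon']

# A malformed one (artist must be a list of strings) is rejected
# before it can propagate through the application:
try:
    MusicRequest(artist=42)
except ValidationError as exc:
    print(f"rejected: {len(exc.errors())} validation error(s)")
```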
Kor integrates tightly with LangChain, which means extraction chains can be composed with other LangChain components—retrievers, memory modules, agents. For developers already building in the LangChain ecosystem, Kor slots in naturally.
The extraction quality hinges almost entirely on example quality and model capability. Unlike function calling, where the model has been fine-tuned to follow structured output instructions, prompt-based extraction relies on the LLM’s ability to pattern-match from examples and follow natural language instructions. This means you’ll burn more tokens on longer prompts, experience slower response times, and encounter higher error rates—especially with complex nested schemas or ambiguous inputs.
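The token overhead is easy to see with a back-of-the-envelope comparison. The sketch below contrasts the size of a Kor-style prompt, which repeats schema documentation and few-shot examples on every request, with the bare user message a native tool-calling API would send. The schema text and example formatting are invented stand-ins, and the four-characters-per-token ratio is a rough heuristic, not a real tokenizer.

```python
def rough_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)


user_message = "play songs by paul simon and led zeppelin"

# A prompt-based extractor resends the schema docs and examples every time:
schema_docs = (
    "player: object\n"
    "  song: text, many  # User wants to play this song\n"
    "  artist: text, many  # Music by the given artist\n"
    "  action: text  # Action to take: play, stop, next, previous\n"
)
few_shot = (
    "Input: Songs by paul simon\nOutput: <json>{...}</json>\n"
    "Input: Please stop the music\nOutput: <json>{...}</json>\n"
    "Input: play something\nOutput: <json>{...}</json>\n"
)
kor_style_prompt = (
    "Your goal is to extract structured information matching the schema "
    "below. Respond only in the requested format.\n\n"
    + schema_docs + "\n" + few_shot + "\nInput: " + user_message + "\nOutput:"
)

overhead = rough_tokens(kor_style_prompt) - rough_tokens(user_message)
print(f"bare message: ~{rough_tokens(user_message)} tokens")
print(f"prompt-based extraction: ~{rough_tokens(kor_style_prompt)} tokens "
      f"(~{overhead} extra on every single call)")
```

With native tool calling, the schema rides along once as a tool definition; with prompt-based extraction, you pay for it again on every request, and the bill grows with every example you add to improve quality.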
Gotcha
The biggest gotcha is right there in the README: ‘If you’re using a chat model that supports a tool calling API, you should probably be using the chat models’ tool calling API instead of Kor!’ This isn’t false modesty—it’s a technical reality. Kor’s prompt-based approach is fundamentally slower and less reliable than native function calling. Every extraction requires sending a verbose prompt filled with schema definitions and examples, which consumes more tokens, costs more money, and takes longer to process.
Kor is also explicitly labeled a ‘half-baked prototype’ by its creator. API stability isn’t guaranteed, production use comes with warnings, and the documentation acknowledges the system ‘makes mistakes’ and ‘is slow.’ You’re not building on a polished, battle-tested foundation—you’re adopting a workaround that exists because better options weren’t available when it was created. For modern LLM applications, those better options now exist.
Context length limitations hit harder with Kor than with function calling. Because examples and schema definitions inflate prompt size, you’ll run into token limits faster when processing long documents or using many examples. The library doesn’t magically solve the extraction problem; it shifts complexity into prompt engineering, which means debugging becomes a matter of tweaking examples and descriptions rather than adjusting API parameters. If you’re accustomed to the reliability of structured outputs from function calling, Kor’s probabilistic nature will feel like a step backward.
Verdict
Use if: You’re stuck with legacy LLMs that genuinely lack function calling support (older open-source models, proprietary internal models, air-gapped systems), you need compatibility with non-chat-interface LLMs, or you’re researching prompt-based extraction techniques for academic or experimental purposes. Kor provides a legitimate solution when native structured output APIs aren’t available.
Skip if: You have access to any modern chat model with native tool calling support. The project’s own maintainer recommends using native tool calling instead, and that guidance is correct. You’ll get faster performance, better reliability, and lower token costs with models that support function/tool calling APIs. Kor solves a problem that, for most developers working with modern chat models, no longer exists. Treat it as a compatibility shim for legacy systems or older LLMs, not a first-choice extraction framework.