Yakcc: Content-Addressed TypeScript Blocks with Property-Test Integrity
Hook
Most package registries treat test results as external metadata. Yakcc makes property tests intrinsic to block identity by hashing source code and its verification contract together—if the test changes, the dependency reference breaks.
Context
If you've worked on more than three TypeScript microservices, you've copied the same utilities between codebases. That date parser. That Zod validator. That retry-with-exponential-backoff helper. You know you should extract them to a shared library, but the overhead of publishing to npm, versioning, and maintaining yet another package feels worse than the duplication.
Yakcc attacks this problem with content-addressable storage borrowed from Nix and IPFS, but adds two architectural twists: property tests become the integrity primitive (rather than just source hashes), and IDE hooks intercept LLM code generation to suggest existing blocks before burning tokens on regeneration. The result is a local-first registry where functions are stored as atomic 'blocks' identified by hash, searchable via offline embeddings, and verified by executable contracts. It's ambitious infrastructure for what the author honestly calls 'the most ambitious yak shave in the history of engineering'—a meta-build tool that sits above TypeScript's compiler to manage function-level reuse across projects.
Technical Insight
Yakcc's architecture chains three subsystems: a 'shaving' phase that extracts functions into blocks, a SQLite registry that stores them by content hash, and an IDE watcher that intercepts LLM interactions. The shaving process uses TypeScript's compiler API to traverse your AST and extract function declarations, their dependencies, and associated property tests into atomic units.
Here's what a shaved block looks like in the registry:
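// Block hash: sha256(source + property tests + dependencies)
// block_a3f29c8e.ts
import * as fc from 'fast-check';

export function parseISODate(input: string): Date | null {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(input);
  if (!match) return null;
  const [, year, month, day] = match;
  // Built in local time; out-of-range parts (month 13, day 40) roll over instead of failing
  return new Date(+year, +month - 1, +day);
}

// Embedded property test (part of block identity)
export const properties = {
  roundTrip: fc.property(
    // Keep to valid dates with four-digit years so toISOString() yields YYYY-MM-DD
    fc.date({
      min: new Date('1970-01-01T00:00:00.000Z'),
      max: new Date('9999-12-31T23:59:59.999Z'),
      noInvalidDate: true
    }),
    (d) => {
      const iso = d.toISOString().split('T')[0];
      const parsed = parseISODate(iso);
      // toISOString() reports UTC, so compare against the UTC year
      return parsed?.getFullYear() === d.getUTCFullYear();
    }
  ),
  rejectsInvalid: fc.property(
    fc.string(),
    (s) => {
      // Discard the rare generated strings that happen to be well-formed
      fc.pre(!/^\d{4}-\d{2}-\d{2}$/.test(s));
      return parseISODate(s) === null;
    }
  )
};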
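The hash includes the property tests, so if you modify the behavioral contract, the block gets a new identity. This creates a 'triplet' of source, verification, and address: change either the source or the tests and the address changes with them, so downstream consumers pinned to the old hash can tell the contract shifted. That is a stricter guarantee than semantic versioning, where flagging a breaking change depends on the publisher following convention. The trade-off is that a hash cannot tell a breaking change from a harmless one; every edit produces a new identity.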
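To make the identity rule concrete, here is a minimal sketch of hashing source, tests, and dependencies together. The `BlockParts` shape, the length-prefixed encoding, and the sorting of dependency hashes are all assumptions made for illustration; the article only states that the hash covers these three inputs, not how yakcc serializes them.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of a block's hash inputs; yakcc's real schema may differ.
interface BlockParts {
  source: string;         // function source text
  propertyTests: string;  // fast-check properties, as source text
  dependencies: string[]; // hashes of the blocks this one depends on
}

// Hash a canonical encoding of all three parts. Length-prefixing each field
// keeps ("ab", "c") and ("a", "bc") from colliding after concatenation.
// Sorting dependencies (an assumption) makes their order irrelevant.
function blockHash(parts: BlockParts): string {
  const hash = createHash("sha256");
  const fields = [parts.source, parts.propertyTests, ...[...parts.dependencies].sort()];
  for (const field of fields) {
    const bytes = Buffer.from(field, "utf8");
    hash.update(`${bytes.length}:`);
    hash.update(bytes);
  }
  return hash.digest("hex");
}

const base: BlockParts = {
  source: "export function double(n: number) { return n * 2; }",
  propertyTests: "fc.property(fc.integer(), (n) => double(n) === n + n)",
  dependencies: [],
};

const original = blockHash(base);
// Same source, weaker contract: the identity still changes.
const weakened = blockHash({ ...base, propertyTests: "fc.property(fc.integer(), () => true)" });

console.log(original === weakened); // false
```

The second hash differs even though the function body is untouched, which is the behavior the 'triplet' description relies on: a consumer pinned to `original` never silently receives the weakened contract.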
The IDE hook layer watches filesystem directories where AI coding tools store their configuration: .claude/, .cursor/, .cline/. When it detects activity (a new chat session or code generation request), yakcc performs a semantic search against the local registry using transformers.js embeddings:
// Simplified hook logic; extractIntent, embed, registry, and
// injectBlockSuggestions are yakcc internals not shown here
import chokidar from 'chokidar';

// Watch the directories themselves: chokidar 4 removed glob support
const watcher = chokidar.watch(['.claude', '.cursor', '.cline'], {
  ignoreInitial: true
});
watcher.on('add', async (path) => {
  const intent = await extractIntent(path); // Parse chat context
  const embedding = await embed(intent); // Local transformers.js
  const matches = registry.similaritySearch(embedding, { topK: 5 });
  if (matches.length > 0) {
    // Inject suggestions BEFORE LLM response
    await injectBlockSuggestions(path, matches);
  }
});
This interception happens before the LLM generates code, so users see existing verified blocks as suggestions. If they accept, yakcc inserts an import reference and adds the block hash to their project's manifest. The filesystem watching is fragile—it breaks if Claude Code changes directory structure—but it's vendor-agnostic since it operates below the API layer.
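The search step can be pictured as a brute-force cosine-similarity ranking over stored embeddings. The sketch below is an assumption about how such a lookup might work, not yakcc's implementation: `IndexedBlock`, `similaritySearch`, and the `minScore` cutoff are hypothetical names, and the three-dimensional vectors stand in for real embedding output. The cutoff matters because a plain top-K query returns results whenever the registry is non-empty, so a check like `matches.length > 0` would fire on almost every request without one.

```typescript
// Hypothetical registry row: a block hash plus its precomputed embedding.
interface IndexedBlock {
  hash: string;
  embedding: number[];
}

interface Match {
  hash: string;
  score: number;
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every block, drop weak matches, and keep the best topK.
function similaritySearch(
  query: number[],
  blocks: IndexedBlock[],
  topK: number,
  minScore = 0
): Match[] {
  return blocks
    .map((b) => ({ hash: b.hash, score: cosineSimilarity(query, b.embedding) }))
    .filter((m) => m.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy vectors standing in for real embeddings.
const blocks: IndexedBlock[] = [
  { hash: "a", embedding: [1, 0, 0] },
  { hash: "b", embedding: [0, 1, 0] },
  { hash: "c", embedding: [1, 1, 0] },
];

const results = similaritySearch([1, 0, 0], blocks, 2, 0.5);
console.log(results.map((m) => m.hash)); // [ 'a', 'c' ]
```

A linear scan like this is adequate for a local registry of modest size; a larger one would presumably need an approximate nearest-neighbor index, which the article does not discuss.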
The federation model uses simple HTTP mirroring. When you reference a block not in your local registry, yakcc fetches it from configured mirrors and verifies integrity by recomputing the hash. No blockchain, no DHT, no consensus protocol—just content-addressed HTTP with cryptographic verification:
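// Fetching a remote block (registry is yakcc's local store, not shown here)
import { createHash } from 'node:crypto';

const sha256 = (text: string) => createHash('sha256').update(text).digest('hex');

const blockHash = 'a3f29c8e...';
const mirrors = ['https://registry.yakcc.dev', 'https://alt.mirror.io'];
for (const mirror of mirrors) {
  try {
    const response = await fetch(`${mirror}/blocks/${blockHash}`);
    if (!response.ok) continue; // Mirror lacks the block or errored: try the next one
    const content = await response.text();
    if (sha256(content) === blockHash) {
      registry.store(blockHash, content);
      break;
    }
  } catch {
    // Network failure: fall through to the next mirror
  }
}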
The 'granularity dial' (1-5 setting) controls how aggressively the shaver splits code. At level 1, it extracts only top-level exports. At level 5, it atomizes every pure function, even nested helpers. This exposes the fundamental chunking problem in code reuse: too fine and you get unusable fragments (a three-line helper with more import overhead than value), too coarse and you lose composability (a 500-line utility that does three things). Most systems hide this trade-off; yakcc makes it a first-class configuration parameter.
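One way to picture the dial is as a filter over the tree of functions found in a file. The sketch below is hypothetical: `FunctionNode` and `selectForExtraction` are illustrative names, and only levels 1 and 5 follow the description above. The behavior of levels 2 through 4 (admitting pure helpers one nesting level deeper per step) is an assumption, since the article does not define them.

```typescript
// Hypothetical summary of a function found during shaving.
interface FunctionNode {
  name: string;
  exported: boolean;        // carries an `export` modifier
  pure: boolean;            // judged free of side effects (how is out of scope here)
  children: FunctionNode[]; // functions declared inside this one
}

type Granularity = 1 | 2 | 3 | 4 | 5;

// Level 1: top-level exports only. Level 5: every pure function at any depth.
// Levels 2-4 (assumed): pure functions nested at most `level - 2` deep.
function selectForExtraction(roots: FunctionNode[], level: Granularity): string[] {
  const maxDepth = level === 5 ? Infinity : level - 2; // level 1 gives -1: no helpers
  const selected: string[] = [];
  const visit = (node: FunctionNode, depth: number): void => {
    const isTopLevelExport = depth === 0 && node.exported;
    const isAdmittedHelper = node.pure && depth <= maxDepth;
    if (isTopLevelExport || isAdmittedHelper) selected.push(node.name);
    for (const child of node.children) visit(child, depth + 1);
  };
  for (const root of roots) visit(root, 0);
  return selected;
}

// A retry helper with a nested backoff calculation, plus an impure logger.
const fileFunctions: FunctionNode[] = [
  {
    name: "retry",
    exported: true,
    pure: false,
    children: [
      {
        name: "backoffDelay",
        exported: false,
        pure: true,
        children: [{ name: "clamp", exported: false, pure: true, children: [] }],
      },
    ],
  },
  { name: "logAttempt", exported: false, pure: false, children: [] },
];

console.log(selectForExtraction(fileFunctions, 1)); // [ 'retry' ]
console.log(selectForExtraction(fileFunctions, 5)); // [ 'retry', 'backoffDelay', 'clamp' ]
```

At level 5 the one-line `clamp` becomes its own block, which is exactly the "more import overhead than value" fragment the trade-off warns about.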
The self-hosting demo is architecturally significant. Version 2 uses yakcc to shave yakcc itself, storing the registry engine's functions as blocks in its own registry. That is good evidence the abstraction holds up under recursion, and it doubles as a bootstrapping test that surfaces dependency cycles and registry corruption early.
Gotcha
Yakcc is TypeScript-only with no clear polyglot path. The shaving logic is tightly coupled to TypeScript's AST via the compiler API, so Python, Go, or Rust codebases can't participate. If your team uses multiple languages, you'll have fragmented tooling: yakcc for the TypeScript services, something else for everything written in another language. This is a fundamental ceiling.
The property test coverage is only as good as what you write. Blocks ship with user-authored fast-check properties, and there's no static analysis to detect under-specified tests. A lazy developer can satisfy the 'must include property tests' requirement with trivial assertions that pass everything. The integrity guarantee is 'this code and these tests are immutably paired,' not 'these tests prove correctness.' You're trusting block authors to write meaningful contracts.
The SQLite registry also has concurrency issues. SQLite allows only one writer at a time, and yakcc adds no coordination layer on top, so two processes shaving simultaneously risk lock contention or corruption. Teams need external coordination (CI-based shaving, a designated registry maintainer) to avoid conflicts.
Finally, the local embeddings from transformers.js are far weaker than hosted models such as OpenAI's text-embedding-ada-002. Expect poor recall on abstract or mathematical code where syntactic similarity diverges from semantic intent. A block implementing Dijkstra's algorithm might not surface when you ask for 'shortest path finder' because the local model can't bridge that conceptual gap.
Verdict
Use yakcc if you're a solo developer or small TypeScript-focused team drowning in duplicated utilities across microservices; you value offline-first architecture and can't depend on external APIs for semantic search; or you're intrigued by content-addressed code reuse as a research direction and willing to tolerate alpha-stage tooling.
Skip it if you work in polyglot codebases where TypeScript is one of many languages; you need production-grade semantic search quality and can't accept the limitations of local embeddings; your team does concurrent development and you lack CI infrastructure to coordinate registry writes; or you're already satisfied with Nx/Turborepo workspace libraries and don't need cross-project federation.
The 5-star GitHub count and 'ambitious yak shave' description are honest signals: this is experimental infrastructure for early adopters, not enterprise-ready tooling.