LLM Engineering
139 articles
Harness-1: Training Search Agents with State Externalization
SichGate Methodology: When Healthcare CISOs Need to Red-Team 4-Bit Llama Without Hiring Offensive Security
ModelRegression: Building a Daily LLM Benchmark That Tests What Developers Actually Use
Inside AI Product Bench: Why Two LLMs Disagree on Half Their Product Recommendations
Neuromod-LLM: Treating Language Models Like Brains on Drugs
makemore: Understanding Language Models by Implementing Them Seven Different Ways
FastLLM: How a Single Line of Code Can Sabotage AI Reasoning While Improving Benchmarks
whichllm: Hardware-Aware LLM Selection Using Evidence-Graded Benchmarks
JARVIS: The LLM-Orchestrated AI System That Pioneered Multi-Model Task Automation
Guardrails AI: Building Fail-Safe Layers Around Unpredictable LLMs
MosaicML Composer: The PyTorch Training Framework That Makes Checkpoints Hardware-Agnostic
Building a Codebase Documentation Engine with LLMs: Lessons from auto_llm_codebase_analysis
How NYU Built a Leaderboard to Track LLM Agents Hacking Their Way Through CTF Challenges
Terminal-Bench: Why Evaluating LLM Agents on Real Command-Line Tasks Is Harder Than You Think
Building Threat Models in Draw.io: How XML Libraries Turn Free Diagrams into Security Tools
AWS Lambda Layers: How a GitHub List Solved Serverless Dependency Hell
Building API Documentation from Network Traffic: Inside Postman's Observability CLI
Gepetto: Teaching IDA Pro to Think with Language Models
Inside Hugging Face Tokenizers: How Rust Powers Sub-Second NLP Preprocessing at Scale
How Open-Assistant Built a ChatGPT Alternative with 160,000 Crowdsourced Conversations
Stanford Alpaca: The $500 Experiment That Democratized LLM Fine-Tuning
OpenAI Evals: Building a Declarative Framework for LLM Benchmarking
Dalai: The Time Capsule That Democratized Local LLMs