The Library · The AI Runtime

The library

Every issue, in one place

The AI Runtime library is our own writing: weekly issues on Vertical AI Agents, Model Reliability Engineering, and the lessons from the trenches of shipping AI. Free, every week.

Read every issue at theairuntime.com →

From the newsletter

The AI Runtime, issue by issue

Issue The AI Runtime · Jun 15, 2026

The "Self-Improving" AI Myth (And What 60 Production Deployments Actually Do)

Sixty production deployments converged on a three-layer architecture where the eval surface, not the base model, is the moat.

Issue The AI Runtime · Jun 11, 2026

Dario Amodei’s “Policy on the AI Exponential” Describes a World Banking AI Already Lives In

His new essay asks regulators to build an FAA for AI models. If you ship AI inside a bank, you already work in that regime, and there is a five-minute test that tells you whether your agent is as far

Issue The AI Runtime · Jun 8, 2026

Harness Half-Life: A Field Playbook for Catching Agent Decay

The harness engineering discourse names what to build. The Model Reliability Engineering arc names how long the build lasts, what kills it, and what to do at week six.

Issue The AI Runtime · Jun 8, 2026

The AI Eval Gate Cheat Sheet

Most AI projects die in the gap between "works in the demo" and "works in production."

Issue The AI Runtime · Jun 7, 2026

Two Ways to Shrink an AI Model. Only One Keeps the Output.

Quantization changes the numbers. Lossless compression removes the wasted bits and keeps every output identical, for about 30% less memory.

Issue The AI Runtime · Jun 5, 2026

Two Ways to Shrink an AI Model. Only One Keeps the Output.

Quantization changes the numbers. Lossless compression removes the wasted bits and keeps every output identical, for about 30% less memory.

Issue The AI Runtime · Jun 3, 2026

The Anatomy of an AI Legal Agent

The leading AI legal research tools still hallucinate on up to a third of queries, so the production answer in law is not a better model but a harness built to assume the model is wrong.

Issue The AI Runtime · Jun 3, 2026

The Model Is the Smallest Part: A Free Field Guide to Production AI

Sixteen published deep-dives, four modules, one operating thesis. The harness around the model is the product. Free.

Issue The AI Runtime · Jun 1, 2026

Why Every Browser Harness Wrapper Is on Borrowed Time

Six hundred lines of code, no abstractions, and the argument that every wrapper around the LLM is on borrowed time.

Issue The AI Runtime · May 27, 2026

Context Engineering for Code Agents: A Four-Level Spectrum

Context Engineering for code agents is the discipline of deciding what the model knows about a codebase, its conventions, and the organization at inference time.

Issue The AI Runtime · May 25, 2026

The Complete Field Guide to Browser Harnesses in 2026

Thirty-plus harnesses, four topologies, two billion-dollar valuations, one collapsing abstraction layer. The canonical landscape of how autonomous agents drive the web - and the trade-offs that decide

Issue The AI Runtime · May 21, 2026

Agent Commerce Is in Production. Here’s the Stack, the Code, and the Three Things Already Breaking.

Learnings from the first hundred days of MPP and the year-plus of x402: how Parallel, Browserbase, fal.ai, and AWS are actually running it, where the production failure modes are, and the archite

Issue The AI Runtime · May 19, 2026

The Anatomy of a Production Vertical Agent

Seven layers wrap every LLM that has shipped in healthcare, banking, and insurance. The model itself is the smallest of them — here’s what the other six are doing.

Issue The AI Runtime · May 16, 2026

Agents Can’t Sign Up, Demos Can’t Ship: Lessons from The AI Runtime Meetup

Two talks, one diagnosis — the infrastructure layer between AI capability and enterprise production is the bottleneck, and it isn’t being built by the model labs.

Issue The AI Runtime · May 15, 2026

MCP Servers Are the Next Shadow Surface

Tool descriptions are now executable instructions, the dependency graph for agents runs through hundreds of unvetted servers, and the registry your enterprise needs to govern them does not yet exist.

Issue The AI Runtime · May 14, 2026

The Cost-Per-Completed-Task Era

Per-token pricing was the right unit when API calls were single-shot. Is it still the right unit when your agent runs adaptive thinking, fans out tool calls, spawns sub-agents, and retries on partial failure?

Issue The AI Runtime · May 12, 2026

The Brain Isn’t the LLM: How HockeyStack Built Revenue Agents

HockeyStack just raised $50M to scale a vertical agent platform whose reasoning engine is a custom ML pipeline — not a frontier model. Why that matters for anyone building agents.

Issue The AI Runtime · May 11, 2026

How MIT’s ScienceClaw Runs Hundreds of AI Agents Without a Central Planner

MIT’s open-source agent swarm replaces the orchestrator with an artifact reactor. The architecture is worth studying even if you’ll never build a science swarm.

Issue The AI Runtime · May 10, 2026

The Agent Runtime, Unbundled: A Reference Architecture Built on OpenClaw

OpenClaw isn’t a product to adopt. It’s a reference architecture to decompose. Five primitives, three production-grade use cases that earn real revenue, and a harness audit checklist for anyone build

Issue The AI Runtime · May 9, 2026

Auctor’s Bet: Traceability Is the Architecture, Not a Feature

Enterprise software only creates value when it’s actually deployed, and deployment is overwhelmingly a labor problem, not a software problem.

Questions

Ask us a specific question

Working on something specific? Ask our team directly. We read every question.

Get it in your inbox

One free issue a week on Vertical AI Agents, Model Reliability Engineering, and lessons from the trenches of shipping AI.
