    February 12, 2025 · updated May 8, 2026 · 3 min read

    LLMflation is good news for everyone except AI-health startups.

    by Thomas Jankowski, aided by AI
    Cheap inference arms the incumbent — TJ x AI

    Picture an Epic product manager opening the early-2025 inference-pricing chart on a Monday morning. Inference cost has dropped 10x year over year. The native-feature roadmap items that were uneconomic at $10 per million tokens are suddenly viable at $1 per million. The PM rewrites the 2025-2026 product calendar by Wednesday. By Friday the engineering org has staffing requests for native ambient-scribes, native order-entry-AI, native prior-auth-AI, and native clinical-decision-support modules — every category that an AI-health startup is currently selling into Epic-deployed health systems.

    That's the LLMflation thesis playing out from the EHR vendor's seat. a16z coined "LLMflation" in November 2024. The thesis: inference cost is dropping roughly 10x per year, the cost of running AI features is approaching zero, and builders should ship aggressively because the underlying cost structure will support it.

    The trade press read that piece as good news for AI startups across every category. It is not good news for AI-health startups specifically. _It is the accelerant that lets Epic and Oracle Health move faster than AI-health startups, not the other way around._

    What's the load-bearing condition? Healthcare is a category where workflow integration is the buyer's gating constraint. The clinician-buyer cannot deploy a clinical-AI feature unless it integrates into the EHR workflow, the EHR vendor's authentication apparatus, the EHR-mediated billing surface, and the EHR-vendor-managed compliance documentation. The buyer's procurement gravity is the EHR. Every clinical-AI feature that ships into a hospital flows through the EHR's procurement decision.

    In a model-cost-decline environment, the moat that compresses fastest is the model-quality moat. The moat that compresses slowest is the workflow-integration moat. The category-leader that holds the workflow-integration moat captures the rents from the model-cost decline; the category-leader that holds the model-quality moat watches the rents flow upstream to whichever vendor controls the integration. In healthcare, that's the EHR vendor. Specifically: Epic, Oracle Health, Athenahealth, MEDITECH, plus the long tail of specialty EHRs.

    What LLMflation does for the EHR vendor is enable rapid native-feature development at low marginal cost. Epic could not afford to ship native ambient-scribes against $10-per-million-token inference economics in 2023. By mid-2025, with inference at $1 per million tokens, the native-feature roadmap accelerates by months per category. The features Epic ships in 2025-2026 are features the AI-health startup category was depending on Epic _not_ shipping yet. The startup's product roadmap was calibrated to the assumption that Epic would take three to four years to bundle. Epic, with cheaper inference, can do it in eighteen to twenty-four months. The bundling-clock compresses. The startup's exit window compresses with it.
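    To put rough numbers on that threshold, here is a minimal back-of-envelope sketch in Python. Only the $10 and $1 per-million-token price points come from this post; the volume figures (tokens per encounter, encounters per clinician, clinician count) are hypothetical placeholders for illustration, not data.

```python
# Back-of-envelope: yearly inference spend for one native AI feature.
# Only the $10 and $1 per-million-token prices come from the post;
# every volume figure below is a hypothetical placeholder.

def annual_inference_cost(price_per_million_tokens: float,
                          tokens_per_encounter: int,
                          encounters_per_clinician_per_day: int,
                          clinicians: int,
                          working_days: int = 250) -> float:
    """Rough yearly inference cost for a feature that touches every encounter."""
    tokens_per_year = (tokens_per_encounter
                       * encounters_per_clinician_per_day
                       * clinicians
                       * working_days)
    return tokens_per_year / 1_000_000 * price_per_million_tokens

# Hypothetical deployment: 10k-token ambient-scribe transcripts, 20 encounters
# per clinician per day, 5,000 clinicians on the system.
for price in (10.0, 1.0):
    cost = annual_inference_cost(price, 10_000, 20, 5_000)
    print(f"${price:.0f}/M tokens -> ~${cost:,.0f} per year")
```

    Under those assumed volumes the same feature drops from roughly $2.5M to $250k a year in inference spend, which is the difference between a line item that needs a business case and one that doesn't.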

    The AI-health startup category is, in late 2024 and 2025, structurally short the LLMflation curve. Every dollar of inference-cost decline strengthens the EHR vendor's bundling position. The startup that priced its 2023-2024 strategic plan against a slower bundling curve is the startup whose 2026 exit options are weaker than projected. Operators who recognize the curve and run the exit process actively in 2025 capture the high multiple. Operators who hold out for category-leader pricing are the operators bypassed when Epic ships native.

    The categories that escape the bundling-clock are the ones that don't run through the EHR. Direct-to-consumer (GLP-1 platforms, the wearables stack, OpenEvidence-class direct-to-clinician). Direct-to-payer (some claims-AI, some prior-auth-AI). Pharma-side (the TechBio category). Each of those operates outside the EHR's procurement gravity, which is why each is less exposed to LLMflation's accelerant effect. The healthcare-AI startup that wants to escape the bundling-clock has to build outside the EHR.

    The LLMflation thesis itself is misframed for healthcare. The trade-press read is "AI-builders win because cost falls." The durable read in healthcare is "AI-builders without workflow-integration moats lose because the EHR vendors win." Both can be true at the category level. The startup class is the one that absorbs the cost of the structural shift. The EHR class captures the rents.

    The same dynamic recurs across regulated categories. Finance-AI runs into it with the legacy core-banking systems (Fiserv, FIS) playing the EHR-vendor role. Government-AI runs into it with the legacy ERP and case-management systems. Insurance-AI runs into it with the legacy policy-administration systems. In every regulated category, LLMflation accelerates the integrated-incumbent's bundling cycle and compresses the standalone-startup's window. Healthcare is the cleanest case. Finance and government are next. Insurance is on the same curve.

    What AI-health founders should ask is whether their product depends on the EHR letting them integrate, or whether the product reaches the buyer outside the EHR's gravity. If the answer is the former, the LLMflation thesis is bad news. If the latter, the thesis is the trade-press version of correct.

    The trade press will, of course, write the LLMflation thesis as universally good news for builders for another year. The structural shape will surface in 2026, when AI-health startup exits start landing at lower multiples than the early-stage-investor prospectus modeled. By 2027 the structural shape is visible to the late-stage-investor class. By 2028 it is the consensus read.

    LLMflation is good news for everyone except AI-health startups. The clock the startups are running against is, in operating terms, accelerated by the same cost-decline curve they were betting on.

    —TJ