    May 20, 2024 · updated May 8, 2026 · 3 min read

    The drug commoditizes. The data doesn't.

    By Thomas Jankowski, aided by AI

    Semaglutide goes generic in some markets in 2026. Tirzepatide follows in the back half of the decade. The molecules will commoditize on the schedule set by GLP-1 patent expiry, a schedule the trade press has been tracking with the same attention pattern it applies to any blockbuster drug going off-patent: too early to matter, too early to matter, too early to matter, sudden cliff.

    That is not the interesting story. The interesting story is what the GLP-1 platforms have been doing with the eighteen months between mass adoption and patent cliff. Specifically: what they have been collecting.

    A digital-health GLP-1 platform (Ro, Hims, Noom, Calibrate, Found, plus the platform layer inside Walgreens and CVS) is not a pharmacy. It is a longitudinal-data collection apparatus that happens to dispense semaglutide. The patient onboarding asks twenty questions. The weekly check-in asks ten more. The wearable integration pulls weight, sleep, heart-rate variability, step count. The pharmacy data shows which months the prescription got refilled and which months it lapsed. The behavioral-coaching layer (the part the platform markets as "AI coaching") logs every conversation, every refusal-to-respond, every micro-decision the user reports about food and exercise.

    In aggregate, that is one of the most operator-valuable longitudinal datasets in healthcare. It is structured, dated, behaviorally-annotated, biometrically-anchored, and growing. The platforms have ten to fifty million patient-months of it by 2026, depending on which platform. None of the legacy pharmacy benefit managers have anything close. Neither do the molecules' manufacturers.

    The argument that holds is that the molecule is going to commoditize, of course, and the data will not.

    What the data predicts is more useful than what the molecule does. The molecule produces 15-25% body weight reduction across a defined cohort under controlled adherence. The data predicts which patients will lose adherence in month four, which patients will regain weight after discontinuing at month twelve, which patients will develop comorbid conditions that the platform's specialty offerings can capture, and (the part the platforms are not announcing yet) which behavioral patterns predict treatment response across the rest of the metabolic-disease category that follows GLP-1.

    That last item is the long-term moat.

    When semaglutide is generic and the next-generation oral GLP-1 ships, the platform that has the longitudinal-adherence-and-response data on five million patients can price its services against any competitor that does not. When the next metabolic intervention arrives (orphan diabetes drugs, obesity-comorbidity panels, the cardiovascular adjuvants), the platform that already knows which subpopulations responded to the GLP-1, and how, can target the new intervention with response-prediction accuracy that the manufacturer cannot match. The platform sells the access; the manufacturer sells the molecule. The platform's margin is, of course, the higher one.

    This is the canonical commoditization-vs-moat shape. The molecule is the engine. The data graph is the operating system. Patients pay $300/month at the platform. Some fraction of that pays for the molecule. The rest pays for the data graph that the platform is accumulating and that the patient does not realize is the actual product.

    There is a version of this story where the patient is straightforwardly the customer and the data is straightforwardly a service that supports the patient's outcome. That version is true at the per-patient level. The patient gets weight-loss outcomes that are, by any honest measure, materially better than they would get without the platform. The story is also true at the platform level: the data graph is the platform's strategic asset, the molecule is the platform's commodity input, and the next ten years of the platform's enterprise value are determined by whose data graph compounds fastest in the eighteen months before the patent cliff.

    Both versions are true. The patient and the platform are aligned at the per-patient level and structurally divergent at the strategic level.

    The operator implications:

    For the GLP-1 platforms: the next eighteen months are the data-accumulation window that decides the next decade of the company. Every onboarding decision, every check-in cadence, every wearable integration is an investment in the data graph; every patient who churns is a counterfactual that the next-best platform now has and the original platform does not.

    For the manufacturers (Novo, Lilly): the molecule's pricing power compresses to roughly nothing in the post-patent window, exactly as scheduled. The interesting strategic question is whether the manufacturer can become a platform (own the data, not the molecule), and the answer is structurally no, because the manufacturer's commercial model is product-line extension, and the platform model requires a direct patient relationship that pharmaceutical regulation has spent fifty years preventing.

    For the PBMs and payers: the negotiating leverage you have over the molecule is permanent. The negotiating leverage you have over the platform is roughly zero, because the platform's value to the patient is the longitudinal relationship, not the dispensed product. The PBM toolset is calibrated for the wrong layer.

    For the policymakers (the regulators, of course, are watching this happen and will act on a six-year delay): the data is going to outlive the patent. The structural question of who gets to use the longitudinal-adherence-and-response data, on what terms, for which population, is the question regulation should be answering in 2026. It will not be answered until 2030.

    The durable read for healthtech founders: the GLP-1 platform shape is the template for the next ten years of pharmaceutical-adjacent business models. Pair a commoditizing molecule with a longitudinal data layer; the molecule pays for patient acquisition, the data graph compounds, and by the time the molecule commoditizes the platform is no longer in the molecule business. It is in the platform business, which is the higher-margin business that the molecule subsidized. The operators who get this in 2024 will be running the category in 2030.

    Of course, the patient does not need to know any of this for the story to be true. The patient gets the weight loss. The platform gets the data graph. Both happen at $300/month.

    —TJ