Human-in-the-loop is the only AI promise that Klarna had to honor.

Klarna spent most of 2024 publicly celebrating the AI-driven cost savings it had captured by replacing customer-service headcount with an AI agent. The number was real. The press cycle was, at the time, friendly. Through Q4 2024 and into 2025, Klarna quietly reversed course, hired human agents back, and admitted that the AI deployment had degraded on quality dimensions the marketing announcements had not measured. The reversal got far less coverage than the announcement.
The lesson is not that Klarna's AI deployment was bad. The lesson is that "human-in-the-loop" was the only AI promise the deployment actually had to honor, and the way Klarna had honored it was, in practice, the bad version.
Five tells worth knowing for operator-class diligence on any "human-in-the-loop" claim.
One: is the human in a real workflow, or in an escape hatch? A real workflow has a named escalation path, named SLAs on response time, and named override authority over what the AI did. An escape hatch is a hidden link in the chat interface that says "speak to a human" and routes to a queue nobody is staffed to answer. The marketing copy treats the two as the same. The operating reality is that they are very different.
Two: who decides when the human enters the loop? If the AI decides (the AI flags low-confidence cases for human review), the human-in-the-loop is real but contingent on the AI's calibration. If the customer decides (the customer can request a human), the human is in the loop only when the customer knows to ask, which most do not. If the operator decides (a sampling rate, a percentage of cases routed to humans regardless), the human-in-the-loop is real and auditable. The third structure is the only one a regulator would accept, and almost no production deployment ships it.
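The three structures are easy to see side by side in code. This is a minimal sketch under assumed names and thresholds (`confidence_floor`, `audit_sample_rate` are illustrative, not anyone's production values); the point is that only the third branch is a policy number an auditor can check, rather than a model output or a customer behavior.

```python
import random

def route(ai_confidence, customer_asked_for_human,
          confidence_floor=0.8, audit_sample_rate=0.05):
    """Return 'human' or 'ai' for a support case."""
    # Structure one: the AI decides. Real, but contingent
    # on the model's own calibration.
    if ai_confidence < confidence_floor:
        return "human"
    # Structure two: the customer decides. Only fires when
    # the customer knows to ask.
    if customer_asked_for_human:
        return "human"
    # Structure three: the operator decides. A fixed sampling
    # rate routed to humans regardless of confidence --
    # auditable, because the rate is policy, not model output.
    if random.random() < audit_sample_rate:
        return "human"
    return "ai"
```

Note that the first two branches can both be tuned away (raise the floor, bury the request link); the third cannot be tuned away without changing a number someone has to own.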
Three: what is the cost of the human's intervention? A human who reviews the AI's output but cannot change it is a rubber stamp. A human who can override is a real loop participant. The cost of the override is the cost of the human's time, the cost of the customer's wait, and the cost of the AI vendor's diluted margin. Most "human-in-the-loop" deployments price the human at the rubber-stamp rate and discover, in production, that the override is necessary at a much higher rate than the deployment economics anticipated.
Four: is the human's intervention being used to retrain the AI? A real human-in-the-loop deployment has a feedback path back into the model. The human's override becomes training data. The next generation of the AI does not make the same mistake. An operator deploying without that feedback path has a static AI and a human band-aid, and the band-aid's cost scales with deployment volume in a way the marketing copy does not surface.
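The feedback path is structurally simple, which is what makes its absence a tell. A sketch, with hypothetical names (a real pipeline would persist these records and feed a retraining job, which is where the actual cost lives):

```python
from dataclasses import dataclass, field

@dataclass
class OverrideLog:
    records: list = field(default_factory=list)

    def record(self, case_id, ai_answer, human_answer):
        # Only a genuine override is a training signal; a rubber
        # stamp (human agrees) tells the next model nothing new.
        if ai_answer != human_answer:
            self.records.append(
                {"case": case_id, "rejected": ai_answer, "label": human_answer}
            )

    def training_batch(self):
        # The human's corrections become labeled examples for the
        # next fine-tune; without this path the AI stays static.
        return [(r["case"], r["label"]) for r in self.records]
```

If a vendor cannot show you the equivalent of `training_batch` in their architecture diagram, the human is a band-aid, not a loop.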
Five: what happens at scale? Klarna's customer-service AI worked fine at low volume. The cracks appeared at scale, when the AI's edge cases became frequent enough that the human-in-the-loop staffing model could not absorb them. The operator who tested the deployment at low volume and rolled out at high volume discovered the gap in production. The operator who ran a representative-sample stress test before scaling caught the gap in pilot.
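The scale gap is back-of-envelope arithmetic that the pilot hides. The numbers below are illustrative assumptions, not Klarna's actual figures; the shape of the result is the point.

```python
import math

def human_agents_needed(daily_volume, escalation_rate, cases_per_agent_per_day):
    """Agents required to absorb the cases the AI hands to humans."""
    return math.ceil(daily_volume * escalation_rate / cases_per_agent_per_day)

# At a pilot volume of 1,000 cases/day, a 4% escalation rate and
# 40 cases per agent per day, the loop looks nearly free:
pilot = human_agents_needed(1_000, 0.04, 40)      # 1 agent

# At 200,000 cases/day the same escalation rate is a real org:
scaled = human_agents_needed(200_000, 0.04, 40)   # 200 agents
```

The escalation rate did not change between the two lines; only the volume did. That is the staffing model the low-volume pilot never exercised.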
The honest summary is that "human-in-the-loop" is, in 2024, the only AI promise that almost every vendor will say they have. It is also the AI promise that costs the most to actually deliver. The five questions above are the operator-level diligence that distinguishes the vendors who priced it correctly from the vendors who priced it as marketing copy. Klarna's reversal is the case study. The operators who did not pay attention to the reversal will, of course, be running their own version of it eighteen months from now.
—TJ