5 min read

Microsoft's Metering Pivot

Microsoft's Metering Pivot
Photo by BoliviaInteligente / Unsplash

Programming note: ARPU will return on Friday and take a look at the business economics of neoclouds.

Software Gets an Expense Account

A strange thing happens when software becomes useful enough to act like an employee: it starts demanding an expense account.

At Microsoft's most recent earnings call, Satya Nadella made the direction explicit:

The basic transformation of any per-user business of ours—whether it is productivity, coding, or security—will become a per-user and usage business. That is the best way to think about it.

On the same call, CFO Amy Hood translated that into the language customers will feel:

It'll still have that per-seat license logic, but it'll also have a meter, just like you see in Azure. And it may not all flow through bookings in the same way. You'll just bill for usage.

That is not a minor pricing adjustment. It is Microsoft's attempt to reframe the entire software business for the AI era.

For the last twenty years, the software industry has been built on a single premise: write the code once, and your gross margins scale to 80% because serving the millionth customer costs the same as serving the first.

An AI agent breaks that premise. It doesn't just sit there. It works—reading documents, writing code, monitoring systems around the clock. And work consumes energy. Every thought loop is a tax on a GPU. The software now has an electric bill, and the argument Nadella is advancing is not just about building the agent. It is about who owns the meter.

The Collapse of Certainties

The genius of classic SaaS was not only that it was useful—it was that it was budgetable and predictable. The vendor sold seats at a fixed monthly price: reliable, recurring revenue that scaled with headcount and seldom churned. The buyer got something equally valuable: an IT line item with no surprises. Budget for five hundred seats in January and the December invoice said the same thing. For the CIO, this was the IT expense that never required a conversation with the CFO.

Then agentic AI arrived, and both certainties collapsed.

For the vendor, the problem is marginal cost: an autonomous agent doesn't sleep, and every reasoning loop runs on a power-hungry GPU in a data centre. Flat-fee pricing cannot absorb an input cost that scales with activity.

For the buyer, this means the bill is no longer a function of headcount—something a CIO can easily plan for. In the agentic era, "seats" cease to be a reliable proxy for compute. As Jeetu Patel, product chief at Cisco, explained to the Financial Times:

The amount of infrastructure needed for an agent is meaningfully higher than for a chatbot. For every human you might have 10, 100 or on the aggressive side 1,000 agents . . . They just keep working and that consumes a chunk of [compute].

When a single employee can quietly spawn an army of a thousand bots that never go home for the weekend, budgeting becomes an exercise in guesswork.

The numbers make the problem concrete. The cheapest production models cost around $0.04 per million tokens; the most expensive frontier reasoning models cost upward of $180 per million tokens—a 4,500-times pricing spread.

Two employees on the same seat licence can generate radically different costs: one using AI for basic summarisation might consume 10,000 tokens a day, another using it for code generation might consume 10 million. Same seat. One thousand times the cost. No visibility until the bill arrives.

For the vendor, that is a margin problem. For the customer, it is a budgeting problem–and one made significantly worse by the fact that companies are being asked to underwrite these volatile bills before AI has shown any measurable capacity to actually improve productivity.

The Metered Subscription

The frontier labs are already retreating from the flat-fee trap. In April 2026, OpenAI converted Codex from a fixed per-seat model to pay-as-you-go token billing. Within two weeks, Anthropic restructured its Enterprise plan to a $20 base fee plus metered usage—abruptly ending the flat-rate model that had allowed heavy users to run workloads for a predictable $200 a month.

This is the exact hybrid model Nadella is now preparing to roll out to the rest of the corporate world. Microsoft is changing what a SaaS subscription buys. The seat is becoming a cover charge—it gets you in the door and keeps corporate CFOs sane because they can still budget for a fixed number of licenses. But the consumption meter tucked underneath is where the actual activity gets priced.

It is a fundamental shift. By hiding a variable utility meter underneath a familiar software contract, enterprise software is starting to behave less like predictable rent and more like a teenager with a corporate credit card. It might be doing something highly productive, but you would still like to see the receipts. The uncomfortable question for companies is no longer whether the AI agent works. It is how much the agent works before someone from finance notices.

Signal Stack

The operating reality beneath the headlines.

  • Tokenomics (Citadel Securities) – Economic price theory now applies directly to AI—higher compute and inference costs signal scarcity, create incentives to substitute toward more efficient models, and are rationing scarce capacity toward the domains where AI's marginal productivity actually justifies its marginal cost.
  • The Token Bill Comes Due (TechCrunch) - Engineers who used the most AI tokens were roughly twice as productive as those who used it less—but spent ten times as many tokens to get there, and produced more bugs and rewrites in the process; whether the spend pays off, researchers say, comes down to business value most companies still cannot measure.

📺 On Our Channel

Apple Siri Meets Memory Shortage

The same factories that make memory for iPhones make high-bandwidth memory for Nvidia's AI accelerators—and when Nvidia arrives with multi-year contracts and premium prices, Apple loses the bidding before it starts. The result: Apple's most powerful Siri only runs on three iPhone models, not because of software or chip design, but because standard iPhones don't have enough memory. We broke down how the AI data center buildout is quietly limiting what the world's most valuable consumer hardware company can ship.

Watch ARPU's deep dive on YouTube (8 Mins)


📊 Data > Narrative

We pull key data points to show you the mathematical reality of what's happening in tech.

The Power Math Behind AI Scaling

  • The Data: Today's most demanding frontier training run required 27 megawatts of power, running 16,000 H100 GPUs. Epoch AI estimates sustaining the current 4x-per-year growth rate through 2030 would require a training run roughly 5,000 times larger. Hardware will not grow more power-hungry at the same pace: improvements in chip efficiency, lower-precision training, and longer training durations are expected to deliver a combined 24x efficiency gain by 2030. Even so, 5,000 divided by 24 leaves a roughly 200x increase in raw power demand—taking a single frontier training run from 27 megawatts today to an estimated 6 gigawatts by 2030.
  • The Takeaway: Six gigawatts is not a data center number. It is a power plant number. The largest industrial power consumers today—aluminum smelters, steel mills—peak at around one gigawatt. Epoch AI identifies power as the most likely binding constraint on AI scaling before 2030, more limiting than chip supply or training data—not because the electricity cannot eventually be generated, but because building the grid infrastructure to deliver 6 gigawatts to a single computing cluster requires three to five years of planning and construction.

You received this message because you are subscribed to ARPU newsletter. If a friend forwarded you this message, sign up here to get it in your inbox.