From Matt Levine’s Money Stuff newsletter we read about an interesting transition in the cost model of software:

A lot of the biggest and most successful companies now are enormously capital-intensive. They are artificial-intelligence hyperscalers, and their business model is like “build nuclear power plants and orbital data centers and massive chip fabrication facilities.” After years in which the cutting edge of the economy was nearly zero-marginal-cost software, now the cutting edge of the economy is extremely capital-intensive, uh, software.

All of that capital outlay only makes sense if it is matched by a proportional revenue stream. This is the Sam Altman thesis: “We see a future where intelligence is a utility, like electricity or water, and people buy it from us on a meter.”

An electricity meter

The problem with meters from the point of view of users is that they keep running, as Uber found out when developers blew through Uber’s entire 2026 AI token budget in just four months. A mystery company reportedly spent half a billion dollars on Claude Code in a single month.

Scale to zero — or to the stars

This problem of open-ended cost structures is not a new one that is specific to AI. The most recent iteration of the billing dilemma was in serverless platforms.

Cloud software is roughly divided into three layers:

  • IaaS, or Infrastructure as a Service: basically, virtual servers in the cloud. Apart from them running in someone else’s datacenter, these behave like normal servers: you log into them directly, deploy software, reboot them, and so on. Your bill is for a certain number of servers.
  • SaaS, or Software as a Service: you have no idea where the software is running, you just connect via a web browser and get to work. Your bill is for a certain number of users, or “seats”.
  • PaaS, or Platform as a Service: this is an intermediate level of abstraction, where you are not connecting directly to a server, but to application software running on top of one or more servers. The details are not important or even visible to you; it’s just “compute” (short-hand for “computational power/capacity”), but your bill might still be for a certain number of servers.

Serverless billing applies to that last model. The idea is that, since with PaaS you don’t access the servers directly, you should just pay for however much platform capacity you use, rather than for a fixed pool of capacity, as you would with per-server billing. This approach is pitched as being particularly attractive to startups: you don’t know if your offering will take off, so you don’t want to commit to high up-front costs — and if you do hit the big time and your thing is blowing up, you don’t want to be constrained by capacity.

Here’s the catch: serverless compute is significantly more expensive than pre-purchased units of compute billed as servers. The reason is that the risk of paying for idle capacity doesn’t go away, it just gets transferred from the hopeful startup to the cloud provider. Of course the cloud provider is aggregating demand and betting that no more than a certain portion of its customers will suddenly need a whole lot of compute capacity at once, but they are also charging a risk premium for their trouble. Basically, it’s an insurance model.

This means that if you do know your demand profile, you are better off not using the metered serverless model, but instead pre-purchasing the capacity that you know you need. You may even be able to mix and match, with baseline guaranteed capacity at one price point and a buffer on top that is charged at surge pricing rates if it turns out that you do need it.

But right now that is not how any of the frontier models work. They consume “tokens”, and they do so at a rate that is not always easy to predict, and which can be affected by non-obvious architectural choices.1 In other words, the meter is always running.

What if you don’t have coins to feed the meter?

The problems extend beyond commercial software. At least in that world there is a revenue stream. As long as the token budget is less than the revenue which the token burn enables, the business case still stands up. But what about open-source software, or other non-commercial models? It’s one thing for coders to donate their time to projects, but even then, many projects are in trouble, with volunteer maintainers struggling to pay the bills even for projects that are foundational to many companies’ operations.

The famous XKCD cartoon about all modern digital infrastructure relying on a project some random person in Nebraska has been thanklessly maintaining since 2003 https://xkcd.com/2347/

If widespread use of code-generating AI tools becomes the norm, and those tools burn through enough tokens that working on them comes at substantial financial expense, that equation starts to become impossible. This is one of the problems with Anthropic’s Mythos bug-finding AI model: finding a bug in a piece of open-source software is one thing, but patching it with AI tools would require a bunch of tokens, which a volunteer-run organisation may not have immediately available. And that does not even touch on the problem of deploying a fix once it has been developed, which may be especially hard for open-source components that are embedded deeply in other offerings.

Watching the meter

This is why it is particularly interesting that S&P Dow Jones Indices decided against fast-tracking SpaceX, Anthropic, and OpenAI into the S&P 500. The S&P 500 is what many index funds use; if OpenAI et al are not in the index, they do not have access to funds invested that way. The reverse is also true, of course, but investors always have the option of buying stock in those companies actively; the whole point of index funds is they are passive, and their investors don’t really want to worry about the contents of the fund on a day-to-day basis, or whether some overweight proportion of it is suddenly a massive bet that an unprofitable endeavour can become profitable within a reasonable timespan.

This choice by the S&P has been characterised as a bet against these companies making it; I am far from an investment professional, but I read it instead as a refusal to be bounced into making an exception on the basis of hype. Once these companies have been publicly traded for a year in a process called “seasoning”, they can be considered for inclusion in the S&P 500 index.

A taxi meter

The reason to take this “wait and see” approach is that the thesis of companies and individuals continually topping up the meter on these AI services is far from proven. With more and more stories coming out of spiralling token bills, there is now a drive to manage AI’s runaway costs:

“In April and May, I started hearing from companies: ‘Oh my god, we are 3x over our entire 2026 token budget and it’s only April,’” J.R. Storment, executive director of the FinOps Foundation, a project under the Linux Foundation, told TechCrunch. “We started hearing existential crises, and the whole conversation shifted from tokenmaxxing and ‘go fast’ to ‘we need guardrails, how do we control this?’”

The big AI labs’ problem is that AI has advanced enough that many AI tasks do not need the latest and greatest frontier models. Marco Arment, creator of the Overcast podcast app, built a transcription service for every podcast on Earth using a rack full of Mac Minis — and the base model, at that. Sure, Marco had to come up with the up-front cost of the hardware, but he doesn’t have to worry about huge open-ended costs for using a metered service forever.

This sort of offline usage is a problem for the business model of the frontier labs precisely because it does not generate the ongoing ever-growing token revenue which their stock market valuation is built on.

Offline AI models are also the reason why any attempt at AI regulation that assumes the ability to prevent certain uses, or its use by certain groups, is doomed to failure. That doesn’t mean regulation is not worth doing, mind: it’s perfectly reasonable to say that the Instagram app should not have a built-in feature to “nudify” pictures that people post there. On the other hand, we should also not expect that a ban on built-in features like that will eliminate abuse entirely. The reality is that bad people will continue to find ways to be bad. There probably do need to be controls on AI, but more in the way that we have controls on fertiliser, enforcing regulation and tracking at the point of sale.

Regardless, the lack of clarity around what the future usage patterns for the frontier AI models will be is the reason why the S&P 500 is not taking on the AI bet, or at least, not right now. They want to let things play out for a year, and then see what happens. And if you really do want to buy stock in SpaceX, Anthropic, and OpenAI in the meantime, you still have the choice of going and doing that directly, actively, rather than have it included willy-nilly in a passively-managed index fund, at least while the future outcome is still so uncertain.


🖼️  Photos by Dylan Michaud and Simone Dinoia on Unsplash

  1. Yes yes, you can pre-purchase tokens at some discount, but the mechanism of token consumption is still very opaque, and the use-it-or-lose-it ratchet is much more aggressive than most of the existing cloud pricing models. It’s more of a financial arbitrage than a meaningfully different pricing model.