Value for Money Is All You Need
A reflection on the future of token consumption in artificial intelligence.
Token consumption now sits at the center of the growing use of artificial intelligence by businesses and individuals alike.
The "TokenMaxxing" trap
In the early days, the trend was to maximize token consumption from proprietary LLMs regardless of cost, a practice treated as a marker of performance for the user, the employee, or the company. This phenomenon, known as "TokenMaxxing," reportedly exhausted Uber's entire annual budget before the year was over.
Faced with the enormous cost of TokenMaxxing, many companies and individuals turned to lower-cost LLMs to preserve their budgets. This fueled the rise of Chinese open-source LLMs such as DeepSeek, in line with Harvard professor Clayton Christensen's theory that a disruptive innovation can conquer a market through low prices.
Users thus found themselves facing a dilemma: choose a highly capable but token-expensive proprietary model, or a less capable but more budget-friendly open-source model.
The temptation of dumping
To resolve this dilemma, Sam Altman, CEO of OpenAI, promised to lower the cost of OpenAI's tokens. The aim: stand out from the competition, gain ground in the AI space, and make OpenAI's highly capable models more accessible in terms of token cost.
While commendable, this initiative exposes OpenAI to two major risks:
- A considerable financial risk: this token dumping could negatively impact OpenAI's profitability, making the strategy difficult to sustain over time.
- A market risk: dumping in no way guarantees that OpenAI gains market share against a competitor like Anthropic. Users remain willing to pay a high price if they can afford it and if the expensive tokens they purchase generate returns far superior to those of cheaper tokens.
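The market-risk argument above comes down to simple arithmetic: what matters is the return generated per dollar of tokens, not the price per token. The sketch below illustrates this with two hypothetical options. Every name and figure in it (prices, token counts, value generated) is an assumption made up for illustration, not real pricing from any vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenOption:
    """A hypothetical model option; all figures are illustrative assumptions."""
    name: str
    price_per_million_tokens: float  # USD per 1M tokens (made-up number)
    tokens_needed: int               # tokens consumed to complete the task
    value_generated: float           # USD value of the finished work (made-up number)

    def cost(self) -> float:
        # Total spend = unit price * volume consumed.
        return self.price_per_million_tokens * self.tokens_needed / 1_000_000

    def return_per_dollar(self) -> float:
        # Value obtained for each dollar spent on tokens.
        return self.value_generated / self.cost()


# Hypothetical scenario: the expensive model needs fewer tokens and
# produces a more valuable result than the cheap one.
premium = TokenOption("premium", 15.0, 2_000_000, 900.0)
budget = TokenOption("budget", 1.0, 6_000_000, 120.0)

for option in (premium, budget):
    print(f"{option.name}: cost=${option.cost():.2f}, "
          f"return per dollar={option.return_per_dollar():.1f}")
```

Under these assumed numbers the premium option costs five times more in total yet returns more per dollar spent, which is exactly the situation in which a price cut by a competitor would not move the user.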
Current initiatives around token utilization
To resolve this cost-versus-quality trade-off faced by users, a new philosophy is now emerging: that of cost efficiency. Several interesting initiatives reflect this shift:
- OpenRouter aggregates many models behind a single API in an attempt to reduce costs while still providing access to the most powerful models available, but the operation of its AI agents generates considerable hidden costs.
- Chinese open-source models such as GLM 5.2 are highly capable and cheaper than proprietary models from OpenAI or Anthropic, while still being notably more expensive than other open-source models.
- Ponytail strips everything superfluous from code and keeps only the essential, reducing token cost while preserving quality regardless of the LLM used. However, it risks being too minimalist and too inflexible to understand context: it might judge as superfluous lines of code that the user considers essential.
- Headroom promises to cut token costs by 95% through compression, but the hidden costs of running its AI agent risk undermining this commendable goal.
Ultimately, all of these projects are commendable and worth encouraging, as they help address a problem that still stands in the way of the broader adoption of artificial intelligence.
The real challenge: Value For Money
In my view, the real challenge lies neither in price, nor in quality, nor in a performance-cost trade-off, nor even in cost efficiency. The real challenge lies in Value For Money.
Value For Money rests on three cumulative criteria:
- Cost
- Quality
- Protection against risk(s)
Together, these three criteria deliver the best quality, at the lowest cost, with the least risk.
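To make the three criteria concrete, here is one possible way to combine them into a single comparable number. This is a minimal sketch under stated assumptions, not the author's method: the formula (quality divided by cost plus expected risk loss), the `Offer` type, and all figures are hypothetical and chosen only to show how cost, quality, and risk can be weighed together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """A hypothetical token offer; every figure below is an illustrative assumption."""
    name: str
    cost: float              # USD spent on tokens for the task
    quality: float           # quality score between 0 and 1
    risk_probability: float  # estimated chance of a costly failure (0 to 1)
    risk_impact: float       # USD lost if that failure happens


def expected_risk_loss(offer: Offer) -> float:
    # Risk expressed in money: probability of failure times its impact.
    return offer.risk_probability * offer.risk_impact


def value_for_money(offer: Offer) -> float:
    # Assumed scoring rule: quality obtained per dollar of total exposure,
    # where exposure = token cost + expected loss from risk.
    return offer.quality / (offer.cost + expected_risk_loss(offer))


offers = [
    Offer("A", cost=30.0, quality=0.9, risk_probability=0.05, risk_impact=200.0),
    Offer("B", cost=6.0, quality=0.6, risk_probability=0.30, risk_impact=200.0),
]

best = max(offers, key=value_for_money)
for offer in offers:
    print(f"{offer.name}: value for money = {value_for_money(offer):.4f}")
print(f"Best under these assumptions: {best.name}")
```

With these made-up inputs, offer B is the cheapest in token cost but scores worse once its expected risk loss is counted, which is why the three criteria need to be assessed together rather than one at a time.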
A new philosophy
Value For Money is the new paradigm that should guide AI labs and companies in how they approach token usage. That is why I am working on a project, soon to be available, to help remove the obstacle that token consumption represents, together with anyone willing to join me on this journey.
