
China's Efficiency Hack


The Post-Text Pivot

We talked recently about how the "Scale is All You Need" dogma—the Silicon Valley belief that you just need vastly more compute to process vastly more data—is running headfirst into the laws of physics. It turns out that even if you have infinite money, you cannot bribe a power grid to build itself overnight.

But in China, that dogma ran into a wall years ago, courtesy of the US Department of Commerce.

Because access to the best chips is restricted and capital is tighter, Chinese labs haven't had the luxury of relying on brute force. Their theory has necessarily shifted to "Efficiency is All You Need." Or, perhaps more accurately: "Efficiency is All We Have."

And so we have the arrival of Moonshot AI. Earlier this month, the Beijing-based lab released a new reasoning model, Kimi K2, which has challenged the Western hierarchy. On industry benchmarks, it ranks second globally, sitting just behind OpenAI’s GPT-5.1 and beating the latest models from Anthropic and Elon Musk’s xAI.

That is impressive, but the headline-grabbing number was the price tag. A CNBC report suggested that the model cost roughly $4.6 million to train.

Now, you should squint at that number. As Moonshot's own CEO has clarified, that figure isn't exactly "official." It likely represents the cost of the final, successful training run, while conveniently excluding the massive costs of R&D, failed experiments, salaries, and data licensing that preceded it. It is a classic bit of startup marketing math: quoting the cost of the gas while ignoring the cost of building the car.

But even with that asterisk, the signal is real. The contrast with the West is stark. OpenAI and Microsoft are planning data center projects costing as much as $100 billion. Moonshot is operating in a world of constraint, driven by US export bans that deny it the best hardware. Necessity is the mother of invention, and in this case, hardware starvation is forcing Chinese labs to adopt radical architectural efficiencies that bloated US labs haven't needed to prioritize.

And the most interesting efficiency hack isn't just about math; it's about changing the fundamental atomic unit of AI.

Last month, DeepSeek—the OG Chinese contender known for shocking the market with low-cost training figures—released an open-source model that proposes a radical shift. For years, the standard way AI processed information was by breaking words into "tokens" (little chunks of text). This works, but it is computationally expensive. As conversations get longer, the model suffers from "context rot"—it forgets what you said ten minutes ago because it's drowning in text tokens.

DeepSeek's solution? Stop reading and start looking.

Their new architecture uses "Contexts Optical Compression." Instead of feeding the model thousands of text tokens, it essentially takes a high-resolution picture of the text and feeds it into the model as visual data. The math is striking: for a computer, "looking" at a page of text is about 10x more efficient than "reading" it via traditional tokenization.

It is also a humiliating realization for humanity: apparently, reading really is hard, and even supercomputers would rather just look at the pictures. It’s the difference between transcribing a book by hand versus just photocopying the pages. One takes forever; the other is instant.
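To make the photocopier-versus-transcription intuition concrete, here is a back-of-the-envelope sketch in Python. All the numbers are illustrative assumptions, not DeepSeek's published parameters: we assume a dense page of prose, a rough English tokenizer average, and a square page render split into patches whose grid is then compressed into vision tokens.

```python
# Toy comparison: cost of one page of text as text tokens vs. vision tokens.
# Every constant below is an assumption chosen for illustration.

WORDS_PER_PAGE = 800          # assumed dense page of prose
TEXT_TOKENS_PER_WORD = 1.3    # rough English tokenizer average

IMAGE_SIZE = 1024             # assumed square page render, in pixels
PATCH_SIZE = 32               # assumed patch edge, in pixels
COMPRESSION = 10              # assumed compression of the patch grid into tokens

text_tokens = int(WORDS_PER_PAGE * TEXT_TOKENS_PER_WORD)
vision_tokens = (IMAGE_SIZE // PATCH_SIZE) ** 2 // COMPRESSION

# With these assumed numbers, "looking" costs roughly a tenth of "reading".
print(text_tokens, vision_tokens, round(text_tokens / vision_tokens, 1))
```

With these made-up but plausible inputs, the page costs about 1,000 text tokens but only about 100 vision tokens, which is the rough 10x efficiency gap described above.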

But the visual approach also unlocks a smarter way to handle memory. As DeepSeek researchers explain, the model uses a "tiered compression" system where older or less critical information is stored as a lower-resolution, slightly blurry image. It doesn't need to remember the exact hexadecimal code of every letter; it just needs the fuzzy jpeg of the concept.
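The tiered idea can be sketched in a few lines. The tier boundaries and token budgets here are invented for illustration; the point is only that older pages of context get a smaller vision-token budget, so total memory cost grows far slower than keeping everything sharp.

```python
# Hedged sketch of tiered compression: older context is stored as a
# lower-resolution image, so it costs fewer vision tokens. The specific
# thresholds and budgets are hypothetical, not DeepSeek's parameters.

def tokens_for_page(age_in_turns: int) -> int:
    """Vision-token budget for a remembered page, by how old it is."""
    if age_in_turns < 5:      # recent: keep sharp
        return 256
    if age_in_turns < 20:     # older: downsampled, "slightly blurry"
        return 64
    return 16                 # ancient: just the fuzzy gist

# 30 pages of history under the tiered scheme vs. full resolution.
tiered_cost = sum(tokens_for_page(age) for age in range(30))
full_cost = 30 * 256
print(tiered_cost, full_cost)
```

Under these assumed budgets, 30 remembered pages cost 2,400 tokens instead of 7,680 at full resolution: the model keeps the fuzzy jpeg of old concepts and spends its sharpness where it matters.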

This is getting serious attention from the highest levels of the industry. Andrej Karpathy, a founding member of OpenAI and former AI chief at Tesla, recently tweeted that text tokens might be "wasteful and just terrible."

This is the divergence. The US model is building bigger brains by building bigger power plants. The Chinese model, forced by sanctions and fierce domestic competition, is building better eyes and leaner code. If DeepSeek is right, the future of Large Language Models might not involve much language at all. It might just be a machine looking at pictures of words, learning faster than everyone else because it isn't bogged down by the act of reading.

More on AI Compute

  • China's AI Providers Expected to Invest $70 Billion in Data Centers Amid Overseas Expansion (Goldman Sachs)
  • Mapping the Neocloud Landscape (ARPU)

On Our Radar

Our Intelligence Desk connects the dots across functions—from GTM to Operations—and delivers intelligence tailored for specific roles. Learn more about our bespoke streams.

OpenAI's Physical Layer

  • The Headline: OpenAI Partners With Foxconn to Design Hardware for Data Centers (Bloomberg)
  • ARPU's Take: OpenAI is systematically taking control of its entire physical supply chain. By partnering directly with Foxconn to co-design server racks and power systems, OpenAI effectively bypasses traditional server brands, signaling that standard data center infrastructure is no longer sufficient for the extreme density and power requirements of its next-generation models.
  • The Operations Implication: OpenAI is executing a vertical integration strategy to bypass standard server OEMs (like Dell or HPE). By co-designing racks and power systems directly with Foxconn, OpenAI gains control over physical infrastructure performance and cost, treating data center design as a proprietary advantage rather than a commodity purchase. This mirrors the custom silicon trend, extending it to custom metal, forcing rivals to compete not just on model weights, but on the efficiency of the physical iron running them.

Mandating the Multi-Cloud

  • The Headline: EU Invokes Digital Markets Act to Probe Whether Amazon and Microsoft Must Enable Rival Compatibility (Reuters)
  • ARPU's Take: The EU is targeting the most profitable feature of the cloud business model: vendor lock-in. By potentially labeling AWS and Azure as "gatekeepers," regulators aim to force technical interoperability, effectively trying to turn proprietary cloud stacks into commoditized utilities where moving data is as easy as switching phone carriers.
  • The Go-to-Market Implication: This regulatory action directly targets the "economic moat" of the hyperscalers: high switching costs. If designated as gatekeepers, AWS and Azure face a mandated dismantling of sticky retention tactics like data egress fees and restrictive software licensing bundles. This shifts the market dynamic from "capture and retain" to "open competition," potentially commoditizing the infrastructure layer and creating a massive opportunity for multi-cloud management vendors and smaller challengers to poach customers without penalty.

P.S. Tracking these kinds of complex, cross-functional signals is what we do. If you have a specific intelligence challenge that goes beyond the headlines, get in touch to design your custom intelligence.


You received this message because you are subscribed to ARPU newsletter. If a friend forwarded you this message, sign up here to get it in your inbox.