Why Google Is Fleeing Scale AI After Meta's Deal
For the past two years, the story of the great AI boom has been one of hardware. It’s a race for Nvidia GPUs, for access to TSMC's cutting-edge fabs, for enough electricity to power colossal data centers. But this week, a different kind of scarcity crisis erupted, and it has nothing to do with silicon. It’s about people. Specifically, people who are very good at explaining things to a machine.
The drama centers on Scale AI, a startup whose business is providing high-quality training data for AI models. Its largest customer, Google, is now planning to cut ties after rival Meta took a massive 49% stake in the company. Other major customers, including Microsoft and Elon Musk's xAI, are also reportedly looking to back away. The exodus shows that in the frenzied race to build the smartest AI, access to the best human-annotated data has become a critical, and now fiercely contested, strategic chokepoint.
What is a data-labeling company and why is it so important?
At a basic level, data labeling is how an AI model learns. In the early days, this meant drawing boxes around pictures of cats. But as models have grown more sophisticated, so has the data they need. It's no longer enough to just feed a large language model the entire internet; to make it truly smart—to get it to reason, to perform complex tasks, to not "hallucinate"—it needs to be taught by experts.
This is what Scale AI does. The company built a network of human trainers with specialized knowledge (PhDs, scientists, historians) who annotate complex datasets. Their judgments feed techniques such as Reinforcement Learning from Human Feedback (RLHF), in which a model is tuned toward the responses human reviewers prefer. This "post-training" refinement is the secret sauce. It's how a model like Google's Gemini learns to be a better coder, or how OpenAI's models learn to tackle graduate-level reasoning problems. This work is so valuable that a single, nuanced annotation can reportedly cost as much as $100. Google was planning to spend around $200 million on these services from Scale AI this year alone.
The conflict at the heart of the Meta-Scale AI deal
The problem, for Google and others, is that Meta isn’t just another investor; it’s one of their biggest competitors in the race for AI supremacy. By taking a 49% stake and installing Scale’s CEO in a top role, Meta has effectively absorbed a key supplier to the entire industry. Here’s Reuters on the core concern:
Companies that compete with Meta in developing cutting-edge AI models are concerned that doing business with Scale could expose their research priorities and road map to a rival, five sources said. By contracting with Scale AI, customers often share proprietary data as well as prototype products for which Scale’s workers are providing data-labeling services. With Meta now taking a 49% stake, AI companies are concerned that one of their chief rivals could gain knowledge about their business strategy and technical blueprints.
Essentially, continuing to work with Scale AI would be like Ford letting GM's top engineers and strategists walk around its secret R&D facility for next-generation vehicles. You are handing your chief rival a direct, real-time view into your most sensitive work, your product roadmap, and the very weaknesses you are trying to fix in your models. In a race this intense, that's an unacceptable risk.
What does this mean for the AI supply chain?
Meta’s move has instantly balkanized the data-labeling market. The concept of a neutral, third-party supplier that serves all the major labs may now be a thing of the past. Jonathan Siddharth, CEO of Scale competitor Turing, put it bluntly: "Leading AI labs are realizing neutrality is no longer optional, it's essential."
This forces a major strategic shift for companies like Google, Microsoft, and OpenAI. They now have two choices: find a new, credibly neutral data partner, or bring this critical function in-house. Both are already happening. The fallout from the Meta-Scale deal has created a massive opportunity for smaller, independent rivals. The CEO of Labelbox told Reuters he expects to generate "hundreds of millions of new revenue" from fleeing customers, while Handshake, another competitor, saw its demand from top AI labs "triple overnight." At the same time, many labs are now looking to build their own internal teams of expert data-labelers, seeking the ultimate security of keeping their most valuable data entirely within their own walls.
The AI race has always been about access to scarce resources. For a long time, that meant compute. Now, the battle has expanded. The world's most advanced AI models are in a desperate competition for the world's most advanced human expertise, and the supply chain that provides it has just become another battlefield.
Reference Shelf:
Google, Scale AI's largest customer, plans split after Meta deal, sources say (Reuters)
Google and OpenAI Are Frenemies Now (ARPU)
What's Behind Meta's Big Bet on Superintelligence? (ARPU)