Google's evolving TPU strategy

Google built the first Tensor Processing Unit (TPU) because general‑purpose chips were too slow and too power‑hungry for the surge of machine‑learning inference inside Google Search, Ads, Photos, speech recognition, and other products. As Moore's Law, the long-standing observation of exponential growth in transistor density, began to falter, the need for a more specialized, more efficient chip became pressing. Google Cloud explained the history behind TPUs in a 2017 blog post:

Although Google considered building an Application-Specific Integrated Circuit (ASIC) for neural networks as early as 2006, the situation became urgent in 2013. That’s when we realized that the fast-growing computational demands of neural networks could require us to double the number of data centers we operate.

Usually, ASIC development takes several years. In the case of the TPU, however, we designed, verified, built and deployed the processor to our data centers in just 15 months.

15 months!! While listening to the recent Acquired episode on Alphabet, I was reminded that it was Google that managed to gather the true “Superintelligence” team, likely at a fraction of what Meta spent this year. In fact, we may not see such talent density again for several decades. From Acquired (slightly edited for clarity):

What if I told you that, between 2015 and 2016—the 12 months after the Alphabet transition—all of the following people were Google employees: Alex Krizhevsky of AlexNet (often cited as the dawn of modern machine-learning AI); his PhD advisor, Geoffrey Hinton (the “godfather of AI” and his collaborator on the AlexNet paper); Ilya Sutskever (founding scientist of OpenAI); Dario Amodei (co-founder, with his sister, of Anthropic); Andrej Karpathy (until recently chief AI scientist at Tesla); Chris Olah; Noam Shazeer; Ian Goodfellow; and, of course, the co-founders of DeepMind—acquired by Google in 2014—Demis Hassabis, Shane Legg, and Mustafa Suleyman (who runs AI at Microsoft today); Andrew Ng from Stanford; and, in addition to all of those people, the authors of the Transformer paper, since Google invented the Transformer and published the paper in June 2017.

Of course, most of these people have since left Google. It is easy to blame Google’s culture and whatnot for such an exodus, but few people seem willing to ponder how on earth Google was able to assemble such a talent-dense team in the first place. The reality is that, among all of big tech, Google was the quintessential playground for the most talented technologists through much of the 2000s and 2010s, and while critics aren’t necessarily wrong about the general decadence and increasingly bureaucratic culture at the company over time, it was always going to be nearly impossible to retain such a talent-dense team. Moreover, a two-decade monopoly would probably be corrosive to most companies’ cultures anyway; perhaps the threat that their money-printing machine, i.e., Search, faces in the AI world made this a good moment for a cultural reset. Google’s shipping cadence since some early post-ChatGPT fiascos hints at such a reset.

Anyways, back to TPUs. The initial goal was not to sell chips; it was to keep Google’s products fast and keep power bills sane. Training‑capable TPUs (v2/v3) came later, followed by large, pod‑scale systems (v4/v5) and, most recently, an inference‑first generation (Ironwood).

Unlike Nvidia, which became the largest company in the world by selling its GPUs, Google chose to keep its TPUs in-house. I will discuss the rationale for that strategy, and why it may be changing now, behind the paywall.


In addition to “Daily Doses” (yes, DAILY) like this one, MBI Deep Dives publishes one Deep Dive on a publicly listed company every month. You can find all 62 Deep Dives here.

