Urs Hölzle has a big job. As senior vice president of technical infrastructure at Google, he’s in charge of the hundreds of thousands of servers in data centers spread across the planet that power the company’s ever-widening range of services.
He’s also the person that the company’s engineers turn to when all that computing power turns out not to be enough.
Today at the 2017 Wired Business Conference in New York, Hölzle explained that even with its enormous resources, Google has had to find ways to economize its operations in order to meet its ambitious goals. Most recently, he said, the company was forced to start building its own artificial intelligence chips because its existing infrastructure just wouldn’t cut it.
Around five years ago, Jeff Dean, who ran Google’s artificial intelligence group, realized that his team’s technique for speech recognition was getting really good. So good, in fact, that he thought it was ready to move from the lab to the real world by powering Android’s voice-control system.
But when Dean and Hölzle ran the numbers, they realized that if every Android user in the world used about three minutes of voice recognition time per day, Google would need twice as much computing power to handle it all. The world’s largest computing infrastructure, in other words, would have to double in size.
“Even for Google that is not something you can afford, because Android is free, Android speech recognition is free, and you want to keep it free, and you can’t double your infrastructure to do that,” Hölzle says.
What Google decided to do instead, Hölzle said, was create a whole new type of chip specialized exclusively for machine learning. He likens traditional CPU chips to everyday cars—they have to do a lot of things relatively well to make sure you get where you’re going. An AI chip, on the other hand, has to do just one thing exceptionally well.
“What we built was the equivalent of a drag race car, it can only do one thing, go straight as fast as it can,” he says. “Everything else it is really, really bad at, but this one thing it is super good at.”
Google’s custom chips could handle AI tasks far more efficiently than traditional chips, which meant the company could support not just voice recognition, but a broad range of other tasks as well without breaking the bank.
Pattern Recognition
This pattern has repeated itself again and again during Hölzle’s time at Google. He says that when he started at the company in 1999 (he was somewhere between the seventh and 11th employee hired by Google, depending on how you count), Google only had around 50 servers and was straining to support the number of search queries it received each day. But even with $25 million in venture funding, the company couldn’t afford to buy enough ready-made servers to meet its growing demand.
“If we had done it with the machines, the servers, that people were using, professional servers, real servers, that would have blown our $25 million in an instant,” he says. “It really was not an option, so we were forced to look for other ways to do the same thing more cheaply.”
So Hölzle and company built their own servers out of cheap parts. Each individual server was less powerful and reliable than a professional-grade machine, but together the clusters of computers they assembled were more powerful and reliable than anything they could have purchased otherwise. Google didn’t invent the idea of using big clusters of cheap machines in lieu of more expensive hardware—that honor might go to the nearly forgotten search engine Inktomi—but it did popularize the model by proving that it could work on a massive scale.
Hölzle and his team had to do something similar years later when they found that off-the-shelf networking gear no longer met their needs. So few companies needed switches that could support the number of machines Google had that no established networking company was interested in producing them. So, once again, Hölzle and his team had to build their own gear—something that other companies, like Facebook, now do as well.
“These decisions become a lot easier if all the other alternatives are non-viable,” Hölzle says. “It’s not necessarily that we’re somehow bolder or more insightful, but it’s actually that for many of these things in our history, it was almost a forced choice, you didn’t really have a viable alternative that you could buy.”
But Hölzle probably isn’t giving himself enough credit. Most people, after exhausting all the viable options, would conclude that their task is impossible. When Hölzle ran out of options, he created new ones.