OpenAI Executive Explains the Insatiable Appetite For AI Chips
Peter Hoeschele, who runs OpenAI’s Stargate data center team, said at an event last week that the company’s models are essentially in constant training mode. That’s a change from the past, when OpenAI would stop training a model once it reached a certain point.
“Let’s stop talking about training versus inference,” Hoeschele said after Oracle co-CEO Clay Magouyrk asked him how OpenAI allocates computing resources between training the models and running them (otherwise known as inference) for its 800 million-plus users.
“We are in a new regime now where the models are ideally constantly running, and constantly going through sampling and training, and getting better all the time,” he said.
Hoeschele was referring to the rise of “test-time compute,” the idea that today’s models can improve their responses to customers by using more computational resources at the moment they answer a query, not just during the “pretraining” of models on large clusters of advanced Nvidia graphics processing units.
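To make the idea concrete, here is a minimal sketch of one common form of test-time compute, often called best-of-N sampling: the model produces several candidate answers to the same query and a scorer keeps the best one. It is an illustration only, not a description of how OpenAI's systems work. The names `toy_model`, `score`, and `best_of_n` are hypothetical, and the "model" is a random-number stand-in rather than a real one.

```python
import random


def toy_model(prompt: str, rng: random.Random) -> float:
    """Hypothetical stand-in for a model: a noisy guess at the true answer, 10.0."""
    return 10.0 + rng.gauss(0.0, 2.0)


def score(answer: float) -> float:
    """Hypothetical verifier: higher is better, so answers closer to 10.0 score higher."""
    return -abs(answer - 10.0)


def best_of_n(prompt: str, n: int, seed: int = 0) -> float:
    """Sample n candidate answers and return the highest-scoring one.

    Larger n means more compute spent while answering the query.
    """
    rng = random.Random(seed)
    candidates = [toy_model(prompt, rng) for _ in range(n)]
    return max(candidates, key=score)


if __name__ == "__main__":
    for n in (1, 4, 16):
        answer = best_of_n("What is 3 + 7?", n)
        print(f"n={n:2d}  answer={answer:.3f}  error={abs(answer - 10.0):.3f}")
```

Because each run with the same seed draws the same sequence of candidates, raising `n` can only match or improve the selected answer in this toy setup. The cost is that every query consumes `n` times as much computation, which is the trade-off behind the demand for chips described above.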