A month ago, reports emerged that Sequoia Capital had written off its stake in Graphcore, a semiconductor company based in Bristol. Graphcore has raised $682M to compete with US-based firms such as Nvidia Corp and Advanced Micro Devices (AMD), which dominate the market for AI chips.
Unable to compete with US rivals that were offering GPUs at ‘low cost’ owing to government subsidies, the company had to go through a broad restructuring. As a result, it had to shut down its Oslo office. Soon afterwards, a number of highly skilled engineers from the Oslo team joined Meta to design and develop supercomputing systems that support AI and machine learning at scale in Meta’s data centers.
Struggling to keep up with its US competition, this chip startup faced mounting problems, including ballooning losses, sluggish sales, and sweeping job cuts. The deal to supply processors to Microsoft for its cloud computing platform, first announced in 2019, also eventually came to an end. “They seem to have developed good asset but it is going nowhere,” said Sravan Kundojjala, an independent semiconductor analyst, in explaining Graphcore’s predicament.
While Graphcore’s struggles have been unfortunate, the situation presents a timely opportunity for someone else. Despite significant investments, Meta has been slow to catch up on AI hardware and software systems for its main business, hindering its ability to keep pace with innovation at scale. Scooping up people who worked on AI-specific networking technology at Graphcore gives Meta a huge boost.
Currently, Meta has teamed up with Nvidia to build a massive AI supercomputer called the AI Research SuperCluster (RSC). The RSC is powerful enough to train AI models with more than one trillion parameters, and it is billed as the largest installation of Nvidia DGX A100 systems by a customer.
But the latest news around it suggests that the chip won’t be complete until 2025. The reason is that the company has been reworking a number of under-construction data center projects for a new design. Meta’s Nordic communications manager told DCD that supporting AI workloads at scale requires a different type of data center than those built for regular online services, and the current site is inadequate for these needs.
Additionally, according to reports, this pairing with Nvidia is costing Meta billions of dollars. By contrast, Graphcore’s updated IPU-POD multi-processor computers, which run on the Bow chip, are claimed to be five times faster than comparable Nvidia DGX machines at half the price.
This means we could see an acquisition, or Meta and Graphcore teaming up to build in-house chips, leveraging Graphcore’s talent and Meta’s planned investment in AI research.
On its earnings call last week, CEO Mark Zuckerberg said “A.I.” more than 20 times during his opening presentation, and CFO Susan Li said the company expects to spend about $30 billion to $33 billion this year, with a focus on increased investment in capacity for its generative AI initiatives.
According to two sources who spoke with Reuters, Meta has an in-house unit that designs various chips to accelerate and optimize AI operations, including a network chip that performs a sort of air traffic control function for servers. This network chip is essential for systems that run large AI models across multiple chips strung together to distribute the workload.
The Fundamental AI Research (FAIR) team at Meta AI has been diligent about open-sourcing its models, such as LLaMA, to the academic community. Meanwhile, Meta has also been releasing models for accurate speech translation and object segmentation. Despite this, the company has not released any generative AI products for consumers.
Meta’s lagging position was notably apparent when the White House invited CEOs to discuss AI, but Zuckerberg was conspicuously excluded. According to CNN, a Biden administration official explained that the meeting focused on companies currently leading in the space, especially on the consumer-facing product side.
Meta is eager to demonstrate its leadership in the AI industry and prove that it deserves a place at the table among the top companies in the space. This is why a potential collaboration with Graphcore can prove to be extremely beneficial for the two companies.
Winter is coming
AI is at its peak, and there is a proliferation of chip companies. “At some point, there were about 35 GPU companies, and 40 to 50 network processors, and right now there’s about 50 or a 100 AI companies,” said Jim Keller, CEO of Tenstorrent Inc. According to him, there is an exploration phase, in which a lot of companies are formed, followed by a consolidation phase.
“It doesn’t mean only the winner survives. A lot of times what happens is some two or three companies will group together, or maybe few will go bankrupt. It is a good spot to repurpose something else,” said Keller.
“So I think we’re gonna see a proliferation. During high rates of change, technology areas that have more proliferation can explore more space. So the probability of a winter comes out of the area where more people are doing stuff that’s hot, the place where people are doing the best thing,” he concluded.