Cohere’s Command Beta model gained the top spot in the Stanford HELM (Holistic Evaluation of Language Models) benchmark earlier this month. The startup’s generative model, which is conditioned to respond well to single-statement commands, stood out among 36 LLMs, including Meta’s Galactica, OpenAI’s Davinci, Google’s Flan, Bloom and others.
Despite the accolade, Ed Grefenstette, the Head of Machine Learning at Cohere, remained admirably modest, describing the achievement as a “nice marketing moment”.
“Leaderboards are always things that you should take with a grain of salt. We do not want to be complacent and imagine that just because we topped this leaderboard, we will be better than other models close behind us,” the NLP pundit said. He is more excited about the week-on-week progress of their models than the leading position.
Additionally, he stated that OpenAI’s latest, GPT-4—undeniably an extremely strong model—was not benchmarked at the time. So, he is under no illusion and is sure that the result will be different in the next run if Stanford benchmarks the current models against GPT-4.
The Philosopher’s Toolkit
“My interest in artificial intelligence piqued through studying philosophy of mind during my undergraduate studies, and reading science fiction and cyberpunk novels in my teens. I decided to pursue computer science due to the lack of job opportunities in philosophy,” said the AI stalwart.
As a budding philosopher, Grefenstette was fascinated by the question of what makes humans and other species intelligent. He wanted to understand how we use intelligence to reason about not just the physical world but also about concepts and metaphysics. However, he soon realised that this was a difficult task and began to think about how we could use artificial intelligence to aid in reasoning.
While completing his doctoral work in natural language processing at Oxford, Grefenstette discovered that the approach he and his colleagues were pioneering was similar to rudimentary neural networks. In 2014, they formed ‘Dark Blue Labs’ to commercialise their ideas but were acquired by Google within a few months.
Six months prior to the acquisition, the tech giant had also purchased the British AI company DeepMind, so Grefenstette merged his team into DeepMind and helped establish the NLP group, as well as a program synthesis and understanding group.
Quenching the Startup Spirit
During his time at DeepMind, the NLP expert witnessed the rapid growth of the company firsthand, from 80 employees to over 1,000 in only four years. This growth, he notes, can make it difficult for individual voices to be heard and can change the culture and dynamics of a workplace.
Despite considering the idea of launching a startup, he ultimately decided to join Facebook AI Research in London, where he helped build a new research lab. “That seemed like a nice compromise between the complete control of entrepreneurship and the idea to start something small and grow it into something big,” he said.
However, after three years, he was once again drawn to “the earlier stages of an organisation or smaller groups because that’s usually where there’s the most sort of potential to push the shape, the direction of things”. Cohere, with its focus on conversational AI and potential for growth, proved to be the right fit for Grefenstette’s entrepreneurial spirit.
OpenAI Steals The Thunder
Grefenstette lauded OpenAI’s fantastic progress. However, he also said, “With no disrespect intended to OpenAI, who have set a new technical ceiling, a common trope in ML is that no one is more than a few weeks behind anyone else (when methods are published), or perhaps a few months behind (when they are not). A lot of stakeholders in the field are now looking to close the gap created by OpenAI’s head-start”.
Meanwhile, he says he’s interested in thinking about the next ChatGPT moment. Something that is disruptive, delightful, and helpful. “If I give you the specifics, we will give our competitors an edge at this point,” he said with a chuckle.
We, as humans, use language not only to communicate but also to plan, transact, negotiate and perform several activities that characterise intelligence. It’s how we explain concepts and reason with ourselves. Giving computers even fragments of these abilities would allow them to be integrated into a significantly wider range of activities, Grefenstette believes.
This feat was becoming more feasible. However, it wasn’t immediately clear to Cohere how to connect this technology to the broader market. “Fortunately, OpenAI did that for the sector with ChatGPT,” he said. “The important thing was connecting to the non-technical users and that was supremely beneficial for OpenAI. Now, we have South Park episodes about ChatGPT. We wouldn’t have guessed that a year ago but it was a fantastic moment for both them and their competitors,” Grefenstette added.
Engineering and science are not domains where ethics is orthogonal. Considering the ethical and societal implications of technology is not a dialogue that should be happening in silos, he believes. “These issues are also a matter of education for the broader population, and for regulation from the government. It cannot be left entirely in the hands of the technologists to resolve these issues,” Grefenstette added.
Addressing concerns about language models being used for medical applications, he said, “People should be educated that the risk of hallucination is an intrinsic aspect. These models are trained not to tell the truth, but rather to say something that’s plausible given the data it was trained on. However, plausibility is different from truth, although sometimes these are one and the same. This occasional overlap has created a vulnerability in users, who come to expect truth as a result, but understanding that that is not a guarantee, that’s fundamental,” he added. “Healthy scepticism needs to be ingrained within the broader population”.
“I qualify myself as a sceptic,” said Grefenstette, when asked about his views on the race among AI companies to achieve human-level intelligence, commonly referred to as artificial general intelligence (AGI). “Humans are very general, but not completely. We’re tailored to what has helped us survive under the constraints of the environment. So, we’re not the ceiling of what is possible in physics and biology,” he explained.