“Technical artefacts, foundation models included, are inherently political, so the research about them has a socio-political context, not solely a technical one.”
Google’s foundation model BERT powers the search engine used by billions of people across the world. OpenAI’s GPT-3 is a powerful language model that has forayed into downstream tasks such as building low-code platforms. In the era of such large-scale foundation models that directly impact many real-world applications, what risks come along with them? To answer this, researchers from across Stanford University have released a survey.
In this report, the researchers have provided a thorough account of the opportunities and risks of foundation models, their capabilities, applications, and societal impact.
Foundation models are intermediary assets: they are not deployed directly but instead serve as a foundation that is further adapted. This intermediary role complicates traditional approaches to reasoning about the societal impact of technology. In the report, the researchers focus on two attributes specific to foundation models: emergence and homogenisation.
These two terms summarise the significance of foundation models. According to the researchers, emergence relates to behaviour within a system that is “implicitly induced”, whereas homogenisation refers to the consolidation of methodologies for building machine learning systems across a wide range of applications: it provides strong leverage for many tasks but also creates single points of failure.
Compared to most other machine learning models, foundation models are characterised by a vast increase in training data and complexity and the emergence of unforeseen capabilities: foundation models can perform unforeseen tasks and do these tasks in unforeseen ways. The increasing adoption of foundation models creates growing desires, demands, and unprecedented challenges for understanding their behaviour. In trying to understand the workings of GPT-3, the researchers have also examined its behaviour on something as simple as the mathematical operation of addition.
“Delving into the model, we may envision a deeper understanding of the mechanisms that GPT-3 uses to add a specific pair of numbers and the mechanism that it uses to add other arbitrary pairs of numbers. We may also envision a deeper understanding of whether these mechanisms are similar to the mathematical notion of ‘addition’ or merely correlated with this notion,” wrote the researchers.
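The kind of behavioural probing the researchers describe can be illustrated with a minimal sketch: build few-shot addition prompts, query a model, and measure how often its answers match true sums. The `toy_model` below is a hypothetical stand-in (it deliberately fails on larger numbers, mimicking behaviour that is merely correlated with addition); it is not GPT-3 or any real API.

```python
# Hedged sketch: probing whether a language model's "addition" behaviour
# matches the mathematical notion of addition or is merely correlated with it.
# toy_model is a hypothetical stand-in, not a real GPT-3 call.

def build_prompt(a: int, b: int) -> str:
    """Few-shot prompt in the style commonly used with GPT-3."""
    return (
        "2 + 3 = 5\n"
        "10 + 14 = 24\n"
        f"{a} + {b} = "
    )

def toy_model(prompt: str) -> str:
    # Stand-in model: adds single-digit numbers correctly but drops 10
    # otherwise, i.e. behaviour correlated with addition, not addition itself.
    a, b = (int(x) for x in prompt.splitlines()[-1].rstrip("= ").split("+"))
    return str(a + b if a < 10 and b < 10 else a + b - 10)

def probe(model, pairs):
    """Fraction of (a, b) pairs for which the model's completion equals a + b."""
    correct = sum(model(build_prompt(a, b)) == str(a + b) for a, b in pairs)
    return correct / len(pairs)
```

Running `probe` over disjoint ranges of operands is one simple way to separate “has learned addition” from “has memorised small sums”, which is the distinction the researchers raise.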
Talking to Analytics India Magazine, Sahar Mor, one of the earliest makers of GPT-3 based apps, said the performance of GPT-3 is general and human-like. Moreover, OpenAI has even acknowledged the destructive potential of its technology. This makes the regulation of foundation models even more challenging. Most foundation models are products of big tech. This raises the question of whether one can trust a commercial company to self-regulate when it is forced to choose between ethics and revenues.
According to Mor, the main challenge is to understand how, and if at all, regulation is an effective tool in ensuring safe AI adoption. Human intelligence, says Mor, works in a multi-modal manner, where we utilise all of our senses when making decisions such as “What is in this picture?” or “Is this a toxic comment?”. Making such decisions requires incorporating other elements such as past experiences, which are de facto the equivalent of transfer learning in ML. “Not incorporating the two is confining whatever ML model you’re building to its (missing) data, and if the saying ‘your model is only as good as the data it was trained on’ is a popular one, then how about ‘your model is only as good as the completeness of the data it was trained on’?” asks Mor.
The Stanford researchers posit that foundation models’ intrinsic properties and biases can be passed down to downstream applications. These biases, wrote the researchers, can originate from training data, modelling and adaptation decisions, modeller diversity, and community values. Thus, identifying the sources is critical to developing interventions.
Although data and foundation models are diverse in their applications, data subjects should be able to indicate how they do not want their data used. An opt-out consent model favours developers, as it does not require them to obtain consent for each new, unexpected use case. Important, then, is the right to revoke consent that was given vacuously for applications that are now being pursued but did not exist when consent was originally given. For instance, researchers from Columbia University recently called for establishing an equivalent of the Hippocratic oath within the field of AI and neurotechnology: a “technocratic oath”.
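The opt-out and revocation mechanics described above can be sketched as a simple consent registry: data is usable by default unless a subject has opted out entirely, and consent for a specific, newly pursued use case can be revoked after the fact. The class and method names below are illustrative, not drawn from the report.

```python
# Hedged sketch of an opt-out consent registry for training data.
# All names here are illustrative; the report proposes the model, not this code.

class ConsentRegistry:
    def __init__(self):
        self._opted_out = set()   # subjects who withdrew their data entirely
        self._revoked = {}        # subject -> set of revoked use cases

    def opt_out(self, subject: str) -> None:
        """Opt-out model: data is usable unless the subject objects."""
        self._opted_out.add(subject)

    def revoke(self, subject: str, use_case: str) -> None:
        """Revoke vacuously given consent for a newly pursued use case."""
        self._revoked.setdefault(subject, set()).add(use_case)

    def may_use(self, subject: str, use_case: str) -> bool:
        """Check whether the subject's data may be used for this use case."""
        if subject in self._opted_out:
            return False
        return use_case not in self._revoked.get(subject, set())
```

The asymmetry the article notes is visible here: developers never ask permission up front; the burden of objecting, per use case, falls on the data subject.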
Edward Abbey once famously said that growth for the sake of growth is the ideology of a cancer cell. He was referring to rapid urbanisation back then. One might sometimes wonder what really led to deepfakes and who stands to benefit from such an advancement. Such a line of thought can be an antithesis to the whole paradigm of scientific research. But, given the scale and significance of modern-day internet inventions, we might have to rethink how we apply knowledge.
The Stanford report, too, suggests that technologists should exercise their choice of when not to build, design, or deploy foundation models. “Developers and researchers should be cognizant of which problems they seek to address, e.g., how to scale up a foundation model versus how to make it more computationally accessible; how those problems are formulated; and who their solutions ultimately empower,” concluded the researchers.