How to Train a Culturally Inclusive LLM, Explains Amazon CTO

In the generative AI space, there is a cultural shift happening, Werner Vogels rightly believes.

Published on December 6, 2023

by Tasmia Ansari

In 1998, when Werner Vogels joined Amazon, the company had a single US-based website selling only books. He undertook the responsibility to change that. “I want you to realise that first and foremost Amazon is a technology company,” the company’s CTO declared during an interview in 2006.

The man wasn’t off the mark.

Amazon has come a long way from an online bookstore to a behemoth of cloud infrastructure, boasting a clientele of over 1.45 million businesses today. Vogels has played an important role in steering the platform from a run-of-the-mill online shop to a service-oriented architecture.

As the technology chief of one of the biggest tech goliaths in the world, Vogels has had a ringside view of how technology has evolved over the years. Every year, he makes a set of predictions. At re:Invent, we asked Vogels how the scene is set to change culturally with large language models, the latest tech cacophony in Silicon Valley.

While these language models managed to woo the internet, many have questioned their ability to chit-chat in local languages. Developing these AI chatterboxes in non-English languages comes at a high cost because of the high token counts involved. “I am not necessarily convinced that 20 billion tokens are much better than 5 billion,” Vogels opined.
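One rough way to see where the extra cost can come from: many LLM tokenisers start from UTF-8 bytes, and scripts such as Malayalam need more bytes per character than English does. The sketch below only illustrates that byte-level gap. It does not measure any particular model's tokeniser, and the function name and sample strings are our own assumptions.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Return the average number of UTF-8 bytes per character in `text`."""
    return len(text.encode("utf-8")) / len(text)


# Hypothetical sample strings chosen for illustration only.
english = "hello"
malayalam = "നമസ്കാരം"  # a greeting written in Malayalam script

for label, sample in (("English", english), ("Malayalam", malayalam)):
    print(f"{label}: {utf8_bytes_per_char(sample):.1f} bytes per character")
```

Actual token counts depend on the vocabulary a tokeniser has learned, so the byte ratio is only a starting point for comparing languages.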

Drilling down on the rationale behind downsizing language models, he cited recent research, including a nugget from Stanford, suggesting that smaller models might match the prowess of their massive counterparts in text generation.

Translation is Not the Issue

A cultural shift is happening in the generative AI space, Vogels believes. Since GPT-4 was released earlier this year, researchers have raised a lot of concerns about the model’s lack of inclusivity and diversity. Now that an array of models with different parameter counts and sizes is established, cultural inclusivity should ideally be the next item to crack on the LLM to-do list.

Along similar lines, the first prediction for 2024 that he made during his traditional keynote was that generative AI will become culturally aware, meaning that models will gain a better understanding of different cultures and traditions. While addressing the audience, he also stated that if companies want to deploy these genAI tools across the world, they have to start thinking about how to make their models more culturally aware.

“If we don’t solve it, it will be a massive hindrance for deploying this technology worldwide because it’s not just about language, it’s about all the cultural aspects, which are meaningful to us as humans,” he said.

Speaking about the problems in developing culturally inclusive models, he explained, “I can talk in the language of Kerala and someone across the phone can listen to it in any other language. Real time translation is not the issue, it is the additional cultural embedment that sits in that language”.

How To Train Your Model

The 65-year-old seasoned Amazon insider also had some thoughts on streamlining the model-building process. “There’s lots of efficiencies that we can introduce in building models and incrementally improving them,” he shared.

“Even if you have five different language models, from different cultures, the history before 1950 is probably going to remain all the same. You don’t need 150 massive language models with the same historical data,” he added.

Keeping it practical, he concluded by suggesting, “We can have a few bigger ones which can collaborate as they have the same base. Then one can monitor what’s the best answer to give as per a specific context”.

Tasmia Ansari

Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.