In 2006, the popular streaming service Netflix launched a $1 million competition, inviting researchers, students and mathematicians to take a shot at improving the Netflix recommendation algorithm. The goal was to build an algorithm that beat Cinematch, Netflix’s baseline algorithm, by 10 percent. Today, many people question the usefulness of this competition, and of open algorithmic contests as a whole.
The Netflix Prize
Xavier Amatriain, a former Engineering Director at Netflix, points out that when the Netflix Prize was introduced in 2006, there was no Kaggle, little open-source tooling, and artificial intelligence (AI) was nowhere near as prominent as it is today. Against that backdrop, the Netflix Prize must have seemed a brilliant opportunity for many programmers.
The training set consisted of approximately 100 million data points, each comprising a user, a movie, a date and a rating from one to five stars. Participants also had access to a smaller validation set of around 1.4 million ratings, known as the ‘‘probe’’ set, for testing their models locally. Finally, there was a qualifying set whose ratings were hidden from participants. To test an algorithm, a team would submit its predictions on the ‘‘quiz’’ portion of this set and get back the accuracy measured as Root Mean Squared Error (RMSE). However, the yearly progress prizes and the grand prize were judged against a separate, held-out ‘‘test’’ portion.
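For readers unfamiliar with the metric, RMSE is simply the square root of the average squared difference between predicted and actual ratings; lower is better. A minimal sketch (the ratings below are invented for illustration):

```python
import math

def rmse(predictions, ratings):
    """Root Mean Squared Error between predicted and actual ratings."""
    assert len(predictions) == len(ratings)
    squared_error = sum((p - r) ** 2 for p, r in zip(predictions, ratings))
    return math.sqrt(squared_error / len(ratings))

# Hypothetical five-star ratings and a model's predictions for them.
actual = [4, 3, 5, 2, 4]
predicted = [3.8, 3.4, 4.5, 2.6, 4.1]
print(rmse(predicted, actual))
```

A 10 percent improvement in the competition meant driving this number down from Cinematch’s baseline RMSE on the hidden test set.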
After around three years of intense collaboration, BellKor’s Pragmatic Chaos won the $1 million grand prize. It was a hybrid team, consisting of KorBell (a group of researchers from the telecommunications company AT&T, who had also won the first Progress Prize in 2007, a milestone in the competition), the Austrian team BigChaos, and the Quebecois team Pragmatic Theory. The three groups joined together to improve upon their individual scores and ultimately pass the required 10 percent mark.
KorBell’s Progress Prize solution used an ensemble combining a variant of singular value decomposition (SVD) with restricted Boltzmann machines (RBMs). The SVD alone had an RMSE of 0.8914, the RBM 0.8990, and a linear blend of the two brought the RMSE down to 0.88 (0.8572 was required on the test set to win the grand prize). The grand prize entry, which the hybrid team completed three years after the competition began, achieved this with an ensemble of 104 individual predictor sets created by the member groups and combined by a single-layer neural network. The winning algorithm bested Netflix’s benchmark by a little over 10 percent.
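The intuition behind blending is that two models with independent errors can be combined into a predictor better than either alone. A hedged sketch of a linear blend, with synthetic stand-ins for the SVD and RBM predictions (the data and noise levels here are invented, not Netflix’s):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true ratings and two imperfect predictors, standing in
# (very loosely) for the SVD and RBM models with independent errors.
true = rng.uniform(1, 5, size=1000)
svd_pred = true + rng.normal(0, 0.9, size=1000)
rbm_pred = true + rng.normal(0, 0.9, size=1000)

def rmse(pred, actual):
    return float(np.sqrt(np.mean((pred - actual) ** 2)))

# Fit blend weights (plus an intercept) by least squares on held-out data,
# analogous in spirit to tuning a blend on the probe set.
X = np.column_stack([svd_pred, rbm_pred, np.ones_like(true)])
w, *_ = np.linalg.lstsq(X, true, rcond=None)
blend = X @ w

print(rmse(svd_pred, true), rmse(rbm_pred, true), rmse(blend, true))
```

Because the two predictors’ errors are independent, the blend’s RMSE lands well below either individual model’s, mirroring how the 0.8914 SVD and 0.8990 RBM blended to 0.88.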
Was this useful?
Yes, and no.
According to Amatriain, the 2007 Progress Prize solution was already a substantial improvement over Netflix’s existing Cinematch algorithm. Netflix therefore put together a team to productionise it, which involved rewriting the code, making it scalable, and retraining the models incrementally as new ratings came in. However, the grand prize solution (with its 104 algorithms) was ultimately deemed too large an engineering effort to productionise for what Amatriain claims would be “a small gain in accuracy that was most likely not worth it.” A key reason was the shift to streaming at the expense of DVD-by-mail, which made predicting consumption more important than predicting ratings.
Despite this, Amatriain says it would be incorrect to conclude that Netflix’s investment in the competition was not worth the million dollars. For one, he says that hiring a single Silicon Valley engineer for three years would have cost Netflix much more than $1 million. Furthermore, Netflix got many engineers and researchers conversing and thinking about its problem, jump-starting its innovation at a time before Kaggle and today’s open AI ecosystem existed.
Despite this viewpoint, some individuals find ML innovation powered by external teams competing for cash a little exploitative. Some people in the field have gone on to describe it as using “unpaid labour to make models you can’t productionise.”
Remember when companies thought their ML innovation would come from external teams competing for cash to build the most accurate model on a test dataset? I'm glad we all learned that using unpaid labor to make models you can't productionize is bad.— Dr. Jacqueline Nolis (@skyetetra) June 16, 2021
WE ALL LEARNED THAT RIGHT?
Despite this, AI competitions do have the potential to produce better solutions. For instance, Facebook recently launched the NetHack Challenge, asking participants to build and train AI systems that can either reliably beat the game or achieve as high a score as possible. NetHack uses simple ASCII graphics and is written primarily in C, yet it is considered one of the world’s most complicated games. Such games, and the challenges built around them, let developers generate realistic data sets for training their algorithms in domains where high-quality, real-world data is hard to find. Additionally, challenges like NetHack exercise reinforcement learning (RL), which is vital for traffic control systems and the development of autonomous vehicles, among other things. Thus, such challenges can sometimes bring in new solutions whilst substantially lowering costs for developers.
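At its core, the trial-and-error loop that NetHack agents run at enormous scale is the same one in any RL toy problem: act, observe a reward, and update value estimates. A minimal tabular Q-learning sketch on an invented five-cell corridor (the environment and hyperparameters here are illustrative only, and bear no relation to actual NetHack Challenge entries):

```python
import random

random.seed(0)

# Toy environment: states 0..4 in a corridor, start at 0,
# reward 1.0 for reaching the goal state 4. Actions: 0 = left, 1 = right.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration

Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]

def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

for episode in range(200):
    state, done = 0, False
    while not done:
        # Epsilon-greedy action selection: mostly exploit, sometimes explore.
        if random.random() < EPSILON:
            action = random.randrange(2)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        next_state, reward, done = step(state, action)
        # Q-learning update: nudge Q toward reward + discounted best next value.
        target = reward + (0.0 if done else GAMMA * max(Q[next_state]))
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = next_state

# The learned greedy policy should move right toward the goal in every state.
policy = [0 if q[0] > q[1] else 1 for q in Q[:GOAL]]
print(policy)
```

Real challenge entries replace this lookup table with deep networks and the corridor with NetHack’s vast state space, but the update rule is the same idea.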
Another helpful competition could be Andrew Ng’s data-centric AI competition, which he hopes will change the decades-old model-centric approach held by machine learning developers. Data is fundamental to machine learning, but smaller datasets suffer more from noisy labels, while larger ones are difficult to label consistently. Ng’s competition might thus help developers learn how to employ their data optimally to build more efficient systems. Still, it is essential to highlight that such competitions should not become a way for companies to outsource their problems for cheaper solutions.
Returning to Netflix’s case, Amatriain claims that he, along with many others, might never have begun working at Netflix were it not for the Netflix Prize, and that Netflix would hardly have innovated at the pace at which it currently operates. Still, Amatriain also argues that algorithmic contests cannot be a company’s sole priority. Additionally, it is vital to have organisations in charge of such competitions to ensure participants have the right incentives and benefit from fair rules.
I am an economics undergrad who loves drinking coffee and writing about technology and finance. I like to play the ukulele and watch old movies when I'm free.