ACM, the Association for Computing Machinery, recently announced the recipient of its 2019 ACM Prize in Computing. This year’s prize goes to David Silver for breakthrough advances in computer game-playing. Silver is a professor at University College London and a Principal Research Scientist at DeepMind, and is recognised as a central figure in the area of deep reinforcement learning.
David Silver is best known as the leader of the team that developed AlphaGo, a computer program that defeated the world champion of the game Go. He developed the AlphaGo algorithm by deftly combining ideas from deep learning, reinforcement learning, traditional tree search and large-scale computing.
AlphaGo was initialised by training on expert human games, followed by reinforcement learning to improve its performance. Subsequently, David Silver sought even more principled methods for achieving higher performance and generality. He developed the AlphaZero algorithm, which learned entirely by playing games against itself, starting without any human data or prior knowledge except the game rules. AlphaZero achieved superhuman performance in the games of chess, Shogi, and Go, demonstrating unprecedented generality of the game-playing methods.
The ACM Prize in Computing recognises early-to-mid-career computer scientists whose research contributions have fundamental impact and broad implications. The award carries a prize of $250,000, from an endowment provided by Infosys Ltd. Silver will formally receive the ACM Prize at ACM’s annual awards banquet on June 20, 2020, in San Francisco.
Computer Game-Playing and AI
Teaching computer programs to play games has long been a core practice in AI research. In a game, an agent must make a series of decisions in order to win. Game-playing also affords researchers results that are easily quantifiable: did the computer follow the rules, score points, and win the game?
Early programs were developed to compete with humans at checkers, and over the decades, increasingly sophisticated chess programs were introduced. A turning point came in 1997, when ACM sponsored a tournament in which IBM’s Deep Blue became the first computer to defeat a world chess champion, Garry Kasparov. However, the researchers’ aim was not simply to build programs that win games, but to use game-playing as a touchstone for creating machines with capacities that simulate human intelligence.
According to ACM President Cherri M. Pancake, “Few other researchers have generated as much excitement in the AI field as David Silver.”
She further stated that human vs machine contests have long been a yardstick for AI. Millions of people around the world watched on television as AlphaGo defeated the Go world champion, Lee Sedol, in March 2016. But that was just the beginning of Silver’s impact. His insights into deep reinforcement learning are already being applied in areas such as improving the efficiency of the UK’s power grid, reducing power consumption at Google’s data centres, and planning the trajectories of space probes for the European Space Agency.
Silver is credited with being one of the foremost proponents of a new machine learning tool called deep reinforcement learning. In deep reinforcement learning, the algorithm learns by trial-and-error in an interactive environment and continually adjusts its actions based on the information it accumulates while it is running.
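The trial-and-error loop described above can be illustrated with a minimal sketch. It uses plain tabular Q-learning rather than a deep neural network, and the corridor environment, the function name train_corridor and all parameter values are assumptions made for illustration; none of them come from Silver's work.

```python
import random

def train_corridor(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                   epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    The agent starts in cell 0 and receives a reward of 1 when it reaches
    the last cell. Actions: 0 = step left, 1 = step right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Trial and error: usually take the action that currently looks
            # best, but sometimes (or on a tie) try a random one.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # The environment responds with a new state and a reward.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Adjust the estimate using the information just observed.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

if __name__ == "__main__":
    for state, (left, right) in enumerate(train_corridor()):
        print(state, round(left, 3), round(right, 3))
```

In deep reinforcement learning the table q is replaced by a neural network that estimates the same values from raw observations, but the loop of acting, observing and adjusting has the same shape.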
Learning Atari from Scratch
In 2013, at the Neural Information Processing Systems Conference, David Silver and his team working at DeepMind presented a computer program that could play 50 Atari games to human-level ability. It was designed to learn to play games based solely on observing the pixels and scores while playing. Earlier reinforcement learning approaches had not achieved anything close to this level of ability.
The team published their method of combining reinforcement learning with artificial neural networks in a seminal 2015 paper. The team went on to refine these deep reinforcement learning algorithms with novel techniques, and these algorithms remain among the most widely used tools in machine learning.
David Silver first began exploring the possibility of developing a computer program that could master Go during his PhD at the University of Alberta. His critical insight in developing AlphaGo was to combine deep neural networks with an algorithm used in computer game-playing called Monte Carlo Tree Search. One strength of Monte Carlo Tree Search is that, while pursuing the perceived best strategy in a game, the algorithm is also continually investigating other alternatives. AlphaGo’s defeat of world Go champion Lee Sedol in March 2016 was hailed as a milestone moment in AI. The team again published the foundational technology underpinning AlphaGo, in a 2016 paper.
David Silver and his team at DeepMind continued their work on developing new algorithms to advance computer game-playing and achieved results many in the field thought were not yet possible for AI systems. In developing the AlphaGo Zero algorithm, Silver and his collaborators demonstrated that a program could master Go without any access to human expert games. The algorithm learns entirely by playing against itself, without any human data or prior knowledge except the rules of the game and, in a further iteration, without even knowing the rules.
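The balance between pursuing the best-looking move and investigating alternatives can be sketched with the classic UCB1 selection rule, which is commonly used inside Monte Carlo Tree Search. This is a generic textbook rule, not the exact selection formula used in AlphaGo, and the function name ucb1_select, the stats layout and the constant c are assumptions made for illustration.

```python
import math

def ucb1_select(stats, c=1.4):
    """Pick the next move to try from one node of a search tree.

    stats maps each move to (visits, total_value), where total_value is the
    sum of outcomes seen after playing that move. c controls how strongly
    rarely tried moves are favoured.
    """
    total_visits = sum(visits for visits, _ in stats.values())
    best_move, best_score = None, -math.inf
    for move, (visits, total_value) in stats.items():
        if visits == 0:
            return move  # always try an untested move first
        average = total_value / visits                             # exploit
        bonus = c * math.sqrt(math.log(total_visits) / visits)     # explore
        if average + bonus > best_score:
            best_move, best_score = move, average + bonus
    return best_move

if __name__ == "__main__":
    # Move "a" looks better on average, but "b" has barely been tried.
    stats = {"a": (10, 7.0), "b": (1, 0.5)}
    print(ucb1_select(stats))         # exploration bonus favours "b"
    print(ucb1_select(stats, c=0.0))  # pure exploitation favours "a"
```

The bonus term shrinks as a move is visited more often, so the search gradually concentrates on the strongest line while still revisiting neglected options from time to time.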
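Learning purely from self-play, given only the rules, can be illustrated on a far smaller game. The sketch below is a toy and not the AlphaGo Zero method, which relies on deep neural networks and tree search. The game (a Nim variant in which players alternately take one to three stones and whoever takes the last stone wins), the function names and the parameter values are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

def legal_moves(stones):
    """The only knowledge given to the learner: the rules of the game."""
    return [take for take in (1, 2, 3) if take <= stones]

def best_move(q, stones):
    """Greedy move according to the learned values (ties pick the smallest)."""
    return max(legal_moves(stones), key=lambda take: q[(stones, take)])

def self_play_train(start=10, games=2000, alpha=0.1, epsilon=0.2, seed=0):
    """Learn move values by playing both sides of every game."""
    rng = random.Random(seed)
    q = defaultdict(float)  # (stones, take) -> value for the player moving
    for _ in range(games):
        stones, history = start, []
        while stones > 0:
            if rng.random() < epsilon:
                take = rng.choice(legal_moves(stones))  # explore
            else:
                take = best_move(q, stones)             # exploit
            history.append((stones, take))
            stones -= take
        # The player who made the last move won. Walk backwards through the
        # game, flipping the outcome each ply because the players alternate.
        outcome = 1.0
        for state_action in reversed(history):
            q[state_action] += alpha * (outcome - q[state_action])
            outcome = -outcome
    return q

if __name__ == "__main__":
    q = self_play_train()
    for stones in range(1, 11):
        print(stones, "stones -> take", best_move(q, stones))
```

No human games are supplied at any point: the same value table plays both sides, and every finished game becomes training data for the next one.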
The DeepMind team continues to advance these technologies and find applications for them. Among other initiatives, Google is exploring how to use deep reinforcement learning approaches to manage robotic machinery at factories.