Yoshua Bengio is recognised as one of the world’s leading experts in artificial intelligence and a pioneer in deep learning. Following his studies in Montreal, culminating in a PhD in computer science from McGill University in 1991, Professor Bengio did postdoctoral studies at the Massachusetts Institute of Technology (MIT) in Cambridge, Massachusetts.
In 2019, he was awarded the Killam Prize as well as the 2018 Turing Award, often described as the Nobel Prize of computing. These honours reflect the profound influence of his work on the evolution of our society.
Yoshua Bengio is also known for collecting the largest number of new citations in the world in the year 2018. Here are a few of his works, which have pushed the boundaries of AI:
Learning Long-Term Dependencies With Gradient Descent Is Difficult
Cited by: 3896 | Published in 1994
This work by Bengio and his colleagues helps explain the accolades he has garnered over the years. The paper is a rigorous examination of the practical shortcomings of recurrent neural networks (RNNs). RNNs were barely popular in the early 90s, yet Bengio had already discussed in detail why gradient-based algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases.
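The difficulty can be illustrated with a toy calculation. The sketch below is not taken from the paper: it assumes a scalar recurrence h_t = tanh(w * h_{t-1}) with no input term, and the function name `gradient_through_time` and the values w = 0.9 and h0 = 0.5 are chosen purely for illustration. It tracks how strongly the state at step t still depends on the initial state.

```python
import math

def gradient_through_time(w, steps, h0=0.5):
    """Return |d h_t / d h_0| for t = 1..steps in a scalar tanh RNN.

    Assumed recurrence (illustration only): h_t = tanh(w * h_{t-1}).
    """
    h = h0
    grad = 1.0
    magnitudes = []
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
        magnitudes.append(abs(grad))
    return magnitudes

mags = gradient_through_time(w=0.9, steps=50)
print(f"after 1 step:   {mags[0]:.3e}")
print(f"after 50 steps: {mags[-1]:.3e}")
```

With |w| below 1, every step multiplies the gradient by a factor smaller than 1, so the signal from early time steps shrinks roughly exponentially with the length of the dependency.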
Today, RNNs are popular in the form of LSTMs. From speech assistants to handwriting recognition to music composition, their presence is hard to ignore.
Read the original paper here.
Convolutional Networks For Images, Speech, And Time Series
Cited by: 2433 | Published in 1995
In this seminal paper, Bengio collaborated with LeCun to explore the reach of CNNs. Today, many machine vision tasks are dominated by CNNs. They are the workhorses of autonomous vehicles and even screen locks on mobile phones.
This work discusses the variants of CNNs, addressing the innovations of Geoff Hinton and Yann LeCun, while also indicating how easy it is to implement CNNs on hardware devices dedicated to image processing tasks.
Read the original paper here.
Gradient-Based Learning Applied To Document Recognition
Cited by: 20630 | Published in 1998
The main message of this paper is that better pattern recognition systems can be built by relying more on automatic learning and less on hand-designed heuristics.
Yoshua Bengio, along with fellow Turing Award winner Yann LeCun, shows that the traditional way of building recognition systems by manually integrating individually designed modules can be replaced by a well-principled design paradigm called Graph Transformer Networks, which allows training all the modules to optimise a global performance criterion.
Read the original paper here.
Learning Deep Architectures For AI
Cited by: 7070 | Published in 2009
This paper discusses the motivations and principles regarding learning algorithms for deep architectures, in particular those exploiting as building blocks unsupervised learning of single-layer models such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.
This work is a detailed report on the then state-of-the-art architectures. It raises open questions about the shortcomings of a few architectures while also suggesting new avenues for optimising deep architectures, either by tracking solutions along a regularisation path, or by presenting the system with a sequence of selected examples illustrating gradually more complicated concepts, in a way analogous to the way students or animals are trained.
Read the original paper here.
Neural Machine Translation by Jointly Learning To Align And Translate
Cited by: 8231 | Published in 2014
In this new approach, the authors achieved a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation.
This work addresses the drawbacks of the traditional encoder-decoder approach and allows the model to focus only on information relevant to the generation of the next target word instead of having to encode a whole source sentence into a fixed-length vector.
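The idea of focusing on relevant source positions can be sketched with an additive scoring function of the form v · tanh(W s + U h_j), followed by a softmax over source positions. The code below is a minimal illustration, not the authors' implementation: the helper names, the tiny two-dimensional vectors and the identity weight matrices are all assumptions made for the example, whereas a real model would learn W, U and v.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def additive_attention(decoder_state, encoder_states, W, U, v):
    """Return (weights, context) for a single decoding step."""
    projected_state = matvec(W, decoder_state)
    scores = []
    for h in encoder_states:
        projected_h = matvec(U, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of encoder states.
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Toy values, chosen only for illustration.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, -0.5]
W = [[1.0, 0.0], [0.0, 1.0]]
U = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]

weights, context = additive_attention(decoder_state, encoder_states, W, U, v)
print("attention weights:", weights)
print("context vector:   ", context)
```

Because the context vector is recomputed at every decoding step, the decoder no longer depends on a single fixed-length summary of the source sentence.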
This paper led to better machine translation models and a better understanding of natural languages in general.
Read the original paper here.
Along with the above works, Bengio has also collaborated with other industry giants like Ian Goodfellow and has produced exemplary works that are among the most frequently referenced sources for deep learning.
Check all the works of Bengio here.