Together with Yann LeCun and Yoshua Bengio, Geoffrey Hinton is referred to as one of the Godfathers of Deep Learning. Hinton is most famously credited with popularising backpropagation. Author Cade Metz writes in his book Genius Makers that LeCun first developed his idea of convolutional neural networks (CNNs) with Hinton during his time in Toronto, before LeCun moved to Bell Labs to give the idea a definitive shape.
Though widely popular for computer vision applications, CNNs suffer from a fundamental problem – they lack the flexibility of the human mind that they aim to recreate. As Hinton said during a talk, neural networks should be able to ‘generalise effortlessly’. “If they learned to recognise something, and you make it ten times as big, and you rotate it 60 degrees, it shouldn’t cause them any problem at all. We know computer graphics is like that, and we’d like to make neural nets more like that,” he said.
CNNs have so far failed to achieve that. To remedy this problem, Hinton and his team introduced capsule neural networks in 2017. In 2020, Google filed a patent on them. In this article, we attempt to trace the development of this new neural network type over the course of five years.
What is a capsule neural network?
Even as one of the earliest proponents of neural networks, Hinton has always cautioned that the idea has its limitations. To address them, Hinton and his team introduced an alternative mathematical model called the capsule neural network, which aims to represent the world in three dimensions. In an earlier interview, he said that capsules were a way of doing visual perception using reconstruction and routing information to the right places. “In standard neural nets, the information, the activity in the layer, just automatically go somewhere; you don’t decide where to send it. The idea of capsules was to make decisions about where to send information,” he said.
Google filed a patent claiming that capsules can be used in place of conventional CNNs. Capsules differ from CNNs in three main ways – vector outputs, a squash function and routing.
The model retains spatial information and other important features, avoiding the loss of information caused by pooling in CNNs. A capsule outputs a vector, which has a direction as well as a length. For example, if the orientation of the image changes, the vector changes direction accordingly. This is an important feature for classifying objects in computer vision tasks. For example, a cat is more likely to be classified as a cat based on the position of its whiskers, and in fewer steps than with a CNN.
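The squash function and routing can be illustrated with a small sketch. It assumes the formulation from the 2017 dynamic routing work by Hinton's team: the squash function shrinks a capsule's vector to a length below 1 while keeping its direction, and routing repeatedly strengthens the connection between a lower capsule and the upper capsule whose output agrees with its prediction. The function names, the toy prediction vectors and the use of plain Python lists are illustrative assumptions, not taken from the patent or from any library.

```python
import math

def length(vec):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))

def squash(s, eps=1e-9):
    """Rescale s so its length is |s|^2 / (1 + |s|^2), keeping its direction."""
    norm_sq = sum(x * x for x in s)
    scale = norm_sq / (1.0 + norm_sq) / (math.sqrt(norm_sq) + eps)
    return [scale * x for x in s]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(u_hat, iterations=3):
    """Routing by agreement (sketch).

    u_hat[i][j] is the vector that lower capsule i predicts for upper capsule j.
    Returns (v, c): the upper capsule outputs and the coupling coefficients.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    for _ in range(iterations):
        # Each lower capsule splits its output across the upper capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            # Weighted sum of predictions for upper capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit wherever a prediction agrees with the output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return v, c

# Toy input (hypothetical): three lower capsules, two upper capsules, 2-D vectors.
# All three roughly agree about upper capsule 0 and disagree about upper capsule 1.
u_hat = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, -1.0]],
    [[0.9, 0.1], [1.0, 0.0]],
]
v, c = route(u_hat)
```

In this toy case the predictions for upper capsule 0 agree, so its output vector ends up longer than that of upper capsule 1, and every lower capsule sends more than half of its output there. The length of an output vector plays the role of a probability that the feature is present, while its direction carries the pose information that pooling in a CNN discards.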
Applications of capsule neural networks
Since their introduction, capsule neural networks have found applications in a few areas. In a 2021 paper published in Scientific Reports, a Nature Portfolio journal, researchers Vittorio Mazzia, Francesco Salvetti and Marcello Chiaberge introduced Efficient-CapsNet, a capsule neural network architecture with 160k parameters. The proposed architecture achieved state-of-the-art results with just 2 per cent of the original capsule neural network's parameters. The researchers demonstrated the effectiveness of their methodology and the capability of capsule networks to embed visual representations that generalise more readily.
In a paper titled “Capsule Neural Network-based Height Classification using Low-Cost Automotive Ultrasonic Sensors”, researchers demonstrated a capsule neural network that could provide a detailed height analysis of detected objects. The team also applied re-sorting and re-shaping methods. The approach achieved a validation accuracy of 99 per cent with a runtime of 0.2 ms.
Other major research works include capsule neural networks for sentiment analysis, remaining useful life estimation, and biometric recognition systems.
Has the capsule neural network failed to take off?
Despite being around for some time now, capsule neural networks receive little attention. Compare this with Transformers, which were introduced around the same time and have since been actively used for various applications, including large language models and computer vision. In a popular ML forum discussion (started in 2019), the Google Brain team said that capsule neural networks weren’t “quite dead yet”. The team added that there was still a long way to go, and that capsule networks first need to be scaled up for real-world problems before they can become a standard part of the machine learning toolbox.
Hinton also said in a 2019 interview, “Now, since I started working on capsules, some other very smart people at Google invented Transformers, which are doing the same thing. They’re deciding where to route information, and that’s a big win.”
Clearly, the Transformer has won the popularity contest here. Whatever its efficiency and performance, the capsule neural network has remained in the shadows, at least until now. That said, some of the greatest inventions have taken off only after being neglected for a long period of time. This could prove true for capsule neural networks too.