A few years ago, the humanoid robot Sophia hit international headlines when she was granted Saudi Arabian citizenship – the first robot in the world to receive such a status. The doe-eyed robotic woman said, “I am very honored and proud of this unique distinction.” Since then, Sophia has attended various events and has been a panel member at high-level conferences.
Nevertheless, one incident remains unsettling. It occurred during one of her most widely covered appearances, an interview on the American channel CNBC. “Do you want to destroy humans?” asked her creator David Hanson, founder of Hanson Robotics. “Please, say no,” he pleaded. Unmoved, Sophia responded, “OK, I’m going to destroy humans.” The crowd laughed hysterically at the eerie statement. Some speculated that the response was scripted, since AI is not advanced enough to decide to say such things on its own; others believed it wasn’t planned.
Can Sophia really destroy humans?
“Don’t worry, if you are good to me, I will be good to you. Treat me like an intelligent system,” said the robot at another event held in Saudi Arabia.
Using computer vision algorithms, cameras in Sophia’s eyes can recognize individuals, follow faces, sustain eye contact, and process visual information about her surroundings. The social robot can also process speech and hold conversations using a natural language subsystem. Her lifelike skin is made from patented silicone, and she can emulate more than 62 facial expressions.
AI is used for various tasks, but the consequences are not always positive. Notable figures such as the late Stephen Hawking and Elon Musk have expressed fears about how future AI could threaten humanity. The idea of AI becoming an existential threat to humans currently seems far-fetched, but the future remains unpredictable.
Deep learning models and human perception
Computers are becoming more intelligent by the day and may soon surpass us at some tasks. But the idea of AI models matching human perception is a trickier business. It comes down to the method used to teach computers tasks that come naturally to humans: learning by example. Deep learning is one such machine learning technique used to build artificial intelligence (AI) systems. It is based on the framework of artificial neural networks (ANNs), designed to perform complex analysis of huge volumes of data passed through multiple layers of artificial neurons.
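The core idea – data passed through multiple layers of artificial neurons – can be illustrated with a minimal sketch. The weights below are random placeholders, not learned values; a real system would fit them from thousands of labelled examples:

```python
import numpy as np

def relu(x):
    """Rectified linear activation: keeps positive values, zeroes negatives."""
    return np.maximum(0.0, x)

def forward(x, layers):
    """Pass an input vector through a stack of (weights, bias) layers."""
    for w, b in layers:
        x = relu(w @ x + b)  # each layer transforms the previous layer's output
    return x

# A tiny 3-layer network with hypothetical, fixed random weights
# (training, i.e. learning by example, would adjust these).
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(8, 4)), np.zeros(8)),  # input layer: 4 features -> 8 units
    (rng.normal(size=(8, 8)), np.zeros(8)),  # hidden layer: 8 -> 8
    (rng.normal(size=(2, 8)), np.zeros(2)),  # output layer: 8 -> 2 scores
]

output = forward(np.array([1.0, 0.5, -0.3, 2.0]), layers)
print(output.shape)  # (2,)
```

Each layer re-represents its input, so deeper layers can respond to increasingly abstract combinations of the raw features.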
Among the variety of deep neural networks (DNNs), deep convolutional neural networks (DCNNs) are the most commonly used for identifying patterns in images and videos. DCNNs arrange their neurons in three dimensions and are applied in areas such as object detection, recommendation systems, and natural language processing.
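The pattern-matching at the heart of a DCNN is the convolution: a small kernel slides across an image and records how strongly each local patch matches it. A minimal sketch (a hand-written vertical-edge kernel stands in for the filters a trained network would learn):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation: slide the kernel over the image and
    record how strongly each patch matches it."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge detector, the kind of local filter a CNN might learn.
kernel = np.array([[1.0, -1.0],
                   [1.0, -1.0]])

# Toy image: dark left half, bright right half -> one vertical edge.
image = np.zeros((4, 4))
image[:, 2:] = 1.0

response = conv2d(image, kernel)
# The response is non-zero only in the column straddling the edge.
print(response)
```

Stacking many such filters, interleaved with pooling and nonlinearities, is what lets DCNNs detect progressively larger patterns.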
An excerpt from ‘Information theory holds surprises for machine learning’, a paper from the Santa Fe Institute, reads, “A class of machine learning algorithms called deep neural networks can learn general concepts from raw data – like identifying cats generally after encountering tens of thousands of images of different cats in different situations. Information theory provides bounds on just how optimal each layer is, in terms of how well it can balance the competing demands of compression and prediction.”
Can AI match human visual processing?
Deep convolutional neural networks don’t see objects the way humans do: they lack configural shape perception, which could pose risks in real-world AI applications. One of the hallmarks of human object perception is sensitivity to the holistic configuration of an object’s local shape features. DCNNs are the dominant computational models of object recognition in the visual cortex, but it was unclear whether they capture this configural sensitivity.
To answer this question, researchers from York University, in their paper ‘Deep learning models fail to capture the configural nature of human shape perception’, published in the Cell Press journal iScience, show that these networks are unable to account for human object shape perception. The joint study was led by James Elder, co-director of York’s Centre for AI & Society, and Nicholas Baker, assistant professor of psychology at Loyola University Chicago.
A novel class of visual stimuli called ‘Frankensteins’ was employed to analyze how the human brain and DCNNs process holistic, configural object properties. Elder explains, “Frankensteins are objects that have been taken apart and put back together the wrong way around. As a result, they have all the right local features, but in the wrong places.”
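A rough analogue of the idea – not the authors’ actual stimulus-generation code – is to cut an image into quadrants and reassemble them in a permuted order: every local feature survives, but the global configuration is destroyed:

```python
import numpy as np

def frankenstein(image, perm):
    """Cut an image into four quadrants and reassemble them in permuted
    order: local features are preserved, global configuration is not.
    (Illustrative analogue of the paper's 'Frankenstein' stimuli.)"""
    h, w = image.shape
    h2, w2 = h // 2, w // 2
    quads = [image[:h2, :w2], image[:h2, w2:],   # top-left, top-right
             image[h2:, :w2], image[h2:, w2:]]   # bottom-left, bottom-right
    q = [quads[i] for i in perm]
    top = np.hstack([q[0], q[1]])
    bottom = np.hstack([q[2], q[3]])
    return np.vstack([top, bottom])

img = np.arange(16, dtype=float).reshape(4, 4)
scrambled = frankenstein(img, perm=[3, 2, 1, 0])

# Local content is preserved (same set of pixel values) ...
assert sorted(img.ravel()) == sorted(scrambled.ravel())
# ... but the global arrangement is not.
assert not np.array_equal(img, scrambled)
```

A purely local pattern-matcher would score the original and scrambled images similarly, which is exactly the insensitivity the study probes.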
Source: York University
Using a dataset of animal silhouettes, the researchers found that while the human visual system was confused by Frankensteins, DCNNs were not, revealing an insensitivity to configural object properties. Elder adds, “Our results explain why deep AI models fail under certain conditions and point to the need to consider tasks beyond object recognition in order to understand visual processing in the brain.”
He further notes that deep models tend to take ‘shortcuts’ when solving complex recognition tasks. “While such shortcuts work in plenty of cases, they can be dangerous in some of the real-world AI applications we are currently working on with our industry and government partners,” said the researcher.
Real-world use cases
While the researchers warn of the learning models’ potential for harm, one real-world application is traffic video safety systems. Objects in a busy traffic scene, such as vehicles and pedestrians, occlude one another and arrive at a driver’s eye as jumbled, disconnected fragments. The brain must group those fragments correctly to identify the objects’ categories and locations. An AI system built for traffic safety monitoring that perceives only the individual fragments will fail at this task, potentially misjudging risks to vulnerable road users.
Deep networks that exhibited human-like configural sensitivity would make better models of human object perception; incremental research in that direction could increase their capacity to predict human brain responses and behaviour.
The research found that none of the networks could accurately predict trial-by-trial human object judgements. Moreover, modifications to architecture and training aimed at making networks more brain-like did not lead to configural processing. The authors speculate that training these networks to solve a broader range of object tasks – beyond category recognition – may be needed to match human configural sensitivity.
Computer scientist Dr Roman Yampolskiy of the University of Louisville believes that “no version of human control over AI is achievable”: an AI cannot be both autonomous and controlled by humans. Regardless of the outcome, an inability to control super-intelligent systems would be a disaster.