Core Technology Behind The Voice Tech Of Virtual Assistants

Image credit: Anna Godeassi

Virtual assistants are now part of our day-to-day lives. They can entertain, assist and act as a personal manager, handling tasks from shopping and booking appointments to playing games. According to Activate, there will be an estimated 21.4 million smart speakers in the US alone by 2020.

Beyond tech giants such as Google, Amazon and Apple, small startups are also using AI to build their own virtual voice assistants as demand rises each day.

Technology Behind The Voice

Deep Neural Network (DNN):


When a user speaks to a voice assistant, the speech is first converted to text, the text is analysed to come up with a reply in text, and the reply is then converted back to speech. As soon as the speech is transcribed to text, Natural Language Processing (NLP) is used to analyse it against the dataset the assistant has. Depending on the questions, the virtual assistant is

As applications such as Apple Siri, Google Now, Microsoft Cortana, and Amazon Echo gain traction, web service companies are turning to large deep neural networks (DNNs) to tackle these huge, complex jobs. Researchers have proposed DjiNN, an open infrastructure for DNN as a service in warehouse-scale computers, and Tonic Suite, a suite of 7 end-to-end applications that span image, speech, and language processing.
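The voice-to-text-to-voice loop described in this section can be sketched in a few lines of Python. This is only an illustration: the function names, the lookup table and the use of plain bytes in place of audio are all assumptions made for the example, and a real assistant would call trained speech and language models at each stage.

```python
# Hypothetical three-stage loop: speech -> text -> reply text -> speech.
# Every function here is a stand-in for a trained model.

def speech_to_text(audio: bytes) -> str:
    """Stand-in for a speech recogniser: the 'audio' is just UTF-8 bytes."""
    return audio.decode("utf-8").strip().lower()

# Toy "dataset" the assistant looks replies up in (illustrative only).
REPLIES = {
    "what time is it": "It is noon.",
    "play some music": "Playing your playlist.",
}

def analyse(text: str) -> str:
    """Pick a reply for the transcribed text, with a fallback."""
    return REPLIES.get(text.rstrip("?"), "Sorry, I did not understand that.")

def text_to_speech(text: str) -> bytes:
    """Stand-in for a speech synthesiser: returns the reply as bytes."""
    return text.encode("utf-8")

def handle_utterance(audio: bytes) -> bytes:
    """Run the three stages in order."""
    return text_to_speech(analyse(speech_to_text(audio)))

print(handle_utterance(b"What time is it?"))  # b'It is noon.'
```

Keeping the three stages as separate functions mirrors the description above: each stage could be swapped for a DNN-backed service without changing the others.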

Hybrid Emotion Inference Model (HEIM):

Humans can readily pick up on emotions from the tone of another person's voice. A machine learning model called the Hybrid Emotion Inference Model (HEIM) attempts the same: it uses Latent Dirichlet Allocation (LDA) to extract text features and a Long Short-Term Memory (LSTM) network to model acoustic features, and is deployed to reveal the kind of emotions behind our voice.
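To make the "hybrid" idea concrete, here is a minimal sketch of combining text features with acoustic features before classifying an emotion. It does not implement LDA or an LSTM: the two input vectors are stand-ins for what those components might produce, and the label set, the weights and the concatenate-then-classify design are assumptions for illustration, not a description of how HEIM itself fuses its features.

```python
import math

EMOTIONS = ["happy", "angry", "neutral"]  # illustrative label set

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_and_classify(text_features, acoustic_features, weights, bias):
    """Concatenate both feature vectors, then apply a linear layer and softmax."""
    features = list(text_features) + list(acoustic_features)
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs

# Stand-in inputs: a topic distribution such as LDA might produce for the
# transcript, and a summary vector such as an LSTM might produce for the audio.
text_features = [0.7, 0.2, 0.1]
acoustic_features = [0.9, 0.1]

# Hypothetical hand-picked weights (one row per emotion); a real model learns these.
weights = [
    [1.0, 0.0, 0.0, 0.2, 0.0],
    [0.0, 1.0, 0.0, 1.5, 0.0],
    [0.0, 0.0, 1.0, 0.0, 1.0],
]
bias = [0.0, 0.0, 0.0]

label, probs = fuse_and_classify(text_features, acoustic_features, weights, bias)
print(label)  # angry
```

In this toy input the words alone lean towards "happy", but the strong first acoustic feature tips the result to "angry", which is the kind of tone-of-voice signal the text features would miss on their own.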


NLP, the ability of machines to understand and learn from the languages that humans speak and write, is widely deployed in these assistants. Apart from NLP, a lesser-known AI technology called Natural Language Generation (NLG), which generates text and speech from predefined data, is also put to use. At its most advanced, it powers the responses that AI assistants such as Google Home and Amazon’s Alexa give when asked a question.
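The simplest form of generating text from predefined data is filling a template. The sketch below assumes a hypothetical weather record and sentence template; production NLG systems are far more elaborate, but the direction of travel (structured data in, a sentence out) is the same.

```python
# Predefined data the assistant can talk about (illustrative values).
WEATHER = {"city": "Bangalore", "condition": "cloudy", "high_c": 27, "low_c": 19}

# Hypothetical sentence template with one slot per field in the record.
TEMPLATE = (
    "Today in {city} it will be {condition}, "
    "with a high of {high_c} and a low of {low_c} degrees Celsius."
)

def generate_reply(record: dict) -> str:
    """Fill the template's slots from the structured record."""
    return TEMPLATE.format(**record)

print(generate_reply(WEATHER))
# Today in Bangalore it will be cloudy, with a high of 27 and a low of 19 degrees Celsius.
```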

Privacy Invasion?

Many people are enthusiastic about the improvements in these voice assistants and the range of real experiences that voice technology can deliver. It is remarkable how far AI has advanced and how human-like it becomes with each passing day, but this should also be a matter of concern. Some people worry about how much power AI is being given to invade our privacy, since it could make life-changing decisions for us without our explicit consent, and we might never even realise it.

Voice tech usage in the US alone. Source: Kantar and Sonar, via Mindshare. Image source: Digiday UK.

There have been several incidents of such privacy invasion. In 2018, a family in Portland reported that their Amazon Echo had recorded a private conversation and sent it to a random person in their contact list, in Seattle. In 2017, a Google spokesperson confirmed to CNN Tech that one of its devices had been spying on users because of a hardware flaw.

It is entirely possible that these companies hear and record more than they claim to, in order to improve their machine learning. You may well be impressed by the AI in these assistants, but you should be careful about how much intrusion by third parties you are willing to accept for comfort and convenience. For artificial intelligence, and especially for assistants like these, the more personal data you give, the better the service you get.

Although experts say that these virtual assistants must do a better job at partitioning different types of information with different layers of security to prevent private information from being so easily shared, we cannot be sure of how much of this information is kept private.
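One way to picture "partitioning information with different layers of security" is a store that tags every item with a sensitivity tier and refuses to release it to a caller with lower clearance. The tiers, class name and example keys below are hypothetical, and the sketch says nothing about how any real assistant stores data.

```python
from enum import IntEnum

class Sensitivity(IntEnum):
    PUBLIC = 0     # e.g. preferred language
    PERSONAL = 1   # e.g. contact list
    PRIVATE = 2    # e.g. raw audio recordings

class PartitionedStore:
    """Toy store that checks clearance before releasing a value."""

    def __init__(self):
        self._items = {}  # key -> (sensitivity, value)

    def put(self, key, value, sensitivity):
        self._items[key] = (sensitivity, value)

    def get(self, key, clearance):
        sensitivity, value = self._items[key]
        if clearance < sensitivity:
            raise PermissionError(f"{key!r} requires {sensitivity.name} clearance")
        return value

store = PartitionedStore()
store.put("language", "en-US", Sensitivity.PUBLIC)
store.put("last_recording", b"...", Sensitivity.PRIVATE)

print(store.get("language", Sensitivity.PUBLIC))  # en-US
try:
    store.get("last_recording", Sensitivity.PERSONAL)
except PermissionError as err:
    print("blocked:", err)  # blocked: 'last_recording' requires PRIVATE clearance
```

Under this design, a feature that only needs contact-level access can never read a private recording by accident, because the check happens inside the store rather than in each caller.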

Disha Misal
Found a way to Data Science and AI through her fascination for Technology. Likes to read, watch football and has an enormous amount of affection for Astrophysics.
