Core Technology Behind The Voice Tech Of Virtual Assistants

Image credit: Anna Godeassi

Virtual assistants are now part of our day-to-day lives. They can entertain, assist and act as a personal manager, handling tasks from shopping and booking appointments to playing games. According to Activate, there will be an estimated 21.4 million smart speakers in the US alone by the year 2020.

Apart from tech giants such as Google, Amazon and Apple, small startups are also using AI to build their own virtual voice assistants, as demand for them rises each day.

Technology Behind The Voice

Deep Neural Network (DNN):

When a voice assistant receives spoken input, it first converts the voice to text, analyses the text to come up with a reply, and then converts the reply back to voice. Once the voice has been transcribed to text, Natural Language Processing (NLP) is used to analyse it against the data the assistant has. Depending on the questions, the virtual assistant is

As applications such as Apple Siri, Google Now, Microsoft Cortana and Amazon Echo gain traction, web service companies are using large deep neural networks (DNNs) to tackle huge, complex jobs. Researchers have proposed DjiNN, an open infrastructure for DNN as a service in warehouse-scale computers, and Tonic Suite, a suite of 7 end-to-end applications that span image, speech and language processing.
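
The voice-to-text, analysis and text-to-voice loop described in this section can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function names, the `KNOWN_ANSWERS` table and the trick of treating audio as UTF-8 bytes are all hypothetical stand-ins, and a real assistant would call speech and language models (often DNNs served remotely) at each stage.

```python
# Hypothetical lookup table standing in for the assistant's knowledge.
KNOWN_ANSWERS = {
    "what time is it": "It is noon.",
    "play some music": "Playing your favourites.",
}


def speech_to_text(audio: bytes) -> str:
    # Stand-in for speech recognition: we pretend the audio bytes
    # already contain the spoken words as UTF-8 text.
    return audio.decode("utf-8").strip().lower()


def analyse(text: str) -> str:
    # Stand-in for language understanding: an exact-match lookup.
    question = text.rstrip("?")
    return KNOWN_ANSWERS.get(question, "Sorry, I did not understand that.")


def text_to_speech(text: str) -> bytes:
    # Stand-in for speech synthesis: encode the reply back to bytes.
    return text.encode("utf-8")


def handle_utterance(audio: bytes) -> bytes:
    # The three stages run in order: voice -> text -> reply text -> voice.
    transcript = speech_to_text(audio)
    reply_text = analyse(transcript)
    return text_to_speech(reply_text)


print(handle_utterance(b"What time is it?"))
```

Keeping each stage behind its own function mirrors the "as a service" idea: any stage could be swapped for a call to a remote model without changing the others.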

Hybrid Emotion Inference Model (HEIM):

Humans can readily pick up emotions from the tone of other people's voices. A machine learning model called the Hybrid Emotion Inference Model (HEIM), which uses Latent Dirichlet Allocation (LDA) to extract text features and a Long Short-Term Memory (LSTM) network to model acoustic features, is deployed to reveal the emotions behind our voice.
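
To make the hybrid idea concrete, here is a toy sketch in plain Python. It is not the published HEIM implementation: the weights are random and untrained, the LSTM has no bias terms, the topic proportions are simply assumed to come from an LDA step that is not shown, and the emotion labels and all names are hypothetical. It only shows the shape of the computation, in which text-side topic features are joined with an LSTM summary of the acoustic frames before classification.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyLSTM:
    """Minimal LSTM forward pass (no biases, untrained weights)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix per gate, applied to [input ; previous hidden].
        self.w = {gate: rand_matrix(hidden_size, input_size + hidden_size, rng)
                  for gate in "ifog"}

    def run(self, frames):
        h = [0.0] * self.hidden_size  # hidden state
        c = [0.0] * self.hidden_size  # cell state
        for x in frames:
            z = list(x) + h
            i = [sigmoid(a) for a in matvec(self.w["i"], z)]    # input gate
            f = [sigmoid(a) for a in matvec(self.w["f"], z)]    # forget gate
            o = [sigmoid(a) for a in matvec(self.w["o"], z)]    # output gate
            g = [math.tanh(a) for a in matvec(self.w["g"], z)]  # candidate
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h  # final hidden state summarises the acoustic sequence


EMOTIONS = ["neutral", "happy", "angry"]  # hypothetical label set


def infer_emotion(topic_mix, acoustic_frames, lstm, classifier_weights):
    # Hybrid step: concatenate text features with the acoustic summary.
    features = list(topic_mix) + lstm.run(acoustic_frames)
    probs = softmax(matvec(classifier_weights, features))
    return dict(zip(EMOTIONS, probs))


rng = random.Random(0)
lstm = TinyLSTM(input_size=2, hidden_size=4, rng=rng)
classifier = rand_matrix(len(EMOTIONS), 3 + 4, rng)
topic_mix = [0.7, 0.2, 0.1]                    # assumed LDA topic proportions
frames = [[0.1, 0.3], [0.4, 0.2], [0.9, 0.8]]  # made-up acoustic features
result = infer_emotion(topic_mix, frames, lstm, classifier)
print(result)
```

Because the weights are untrained, the printed probabilities carry no meaning; in practice both the LSTM and the classifier would be fitted on labelled voice data.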


NLP, the ability of machines to understand and learn from the languages that humans speak and write, is naturally deployed in these assistants. But apart from NLP, a lesser-known AI technology called Natural Language Generation (NLG), which generates text and speech from predefined data, is also put to use. At its most advanced, it powers the responses that AI assistants such as Google Home and Amazon’s Alexa give when asked a question.
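
In its simplest form, generating text from predefined data can be done with templates. The sketch below assumes a hypothetical `TEMPLATES` table and `generate_reply` helper; commercial assistants use far richer generation than this, so treat it as an illustration of the idea only.

```python
# Hypothetical reply templates keyed by the intent the assistant detected.
TEMPLATES = {
    "weather": "It is {temp_c} degrees Celsius and {condition} in {city}.",
    "timer": "Your {minutes}-minute timer has been set.",
}


def generate_reply(intent, data):
    # Pick the template for this intent and fill it with structured data.
    template = TEMPLATES.get(intent)
    if template is None:
        return "Sorry, I don't know how to answer that yet."
    return template.format(**data)


reply = generate_reply(
    "weather", {"city": "Portland", "temp_c": 18, "condition": "cloudy"}
)
print(reply)  # It is 18 degrees Celsius and cloudy in Portland.
```

The generated sentence would then be handed to a text-to-speech stage to produce the spoken answer.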

Privacy Invasion?

Many people admire these voice assistants and are excited about each upgrade and about the range of real-world experiences the technology can deliver. It is indeed astonishing how far AI has advanced and how much more human-like it becomes with each passing day, but this should also be a matter of concern. Some people worry about how much power AI is being given to invade our privacy, when it could well make life-changing decisions for us without our explicit consent and without us ever realising.

Voice tech usage in the US alone. Source: Kantar and Sonar, via Mindshare. Image source: Digiday UK.

There have been several incidents of such privacy invasion. In 2018, a family in Portland reported that their Amazon Echo had recorded a private conversation and sent it to a random person in Seattle from their contact list. In 2017, a Google spokesperson confirmed to CNN Tech that one of its devices had been spying on users because of a hardware flaw.

It is entirely possible that these companies hear and record more than they claim to, in order to improve their machine learning. You may well be impressed by the AI in these assistants, but what you must be careful about is how much intrusion by third parties you are willing to accept in exchange for comfort and convenience. For artificial intelligence, and especially for assistants like these, the more personal data you give, the better the service they provide.

Although experts say that these virtual assistants must do a better job at partitioning different types of information with different layers of security to prevent private information from being so easily shared, we cannot be sure of how much of this information is kept private.
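
One way to picture the partitioning that experts call for is to tag each kind of data with a sensitivity level and check that level before anything is shared. The sketch below is purely illustrative: the levels, the `DATA_SENSITIVITY` table and the `may_share` helper are assumptions made for this example and do not describe how any real assistant works.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    PERSONAL = 1
    PRIVATE = 2


# Hypothetical mapping from data type to its security layer.
DATA_SENSITIVITY = {
    "weather_query": Sensitivity.PUBLIC,
    "contact_list": Sensitivity.PERSONAL,
    "voice_recording": Sensitivity.PRIVATE,
}


def may_share(data_type, recipient_clearance, user_confirmed=False):
    # Unknown data types are treated as the most sensitive by default.
    level = DATA_SENSITIVITY.get(data_type, Sensitivity.PRIVATE)
    # Private data additionally needs explicit confirmation from the user.
    if level is Sensitivity.PRIVATE and not user_confirmed:
        return False
    return recipient_clearance >= level


print(may_share("weather_query", Sensitivity.PUBLIC))
print(may_share("voice_recording", Sensitivity.PRIVATE))
print(may_share("voice_recording", Sensitivity.PRIVATE, user_confirmed=True))
```

Under such a scheme, a recorded conversation could not be forwarded to a contact unless the user had explicitly confirmed it.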

Disha Misal
Found a way to Data Science and AI through her fascination for Technology. Likes to read, watch football and has an enormous amount of affection for Astrophysics.
