Last updated December 1, 2023
Apple’s Scary New Innovation Gives Voice to the Voiceless

Apple's latest innovation, Personal Voice, unveiled just before the International Day of Persons with Disabilities, marks a significant step in voice technology.

Illustration by Nikhil Kumar

Published on December 1, 2023

by K L Krithika

Yesterday, Apple released a short film and an e-book showcasing Personal Voice, a feature for its devices that was unveiled in May this year. The release came just ahead of the International Day of Persons with Disabilities on December 3. Apple has long been at the forefront of making its devices accessible.

The company's devices have consistently received higher ratings for ease of use from people with visual, hearing, and motor impairments, as well as from the elderly. Apple is now taking this a step further with AI, building features such as VoiceOver, Guided Access, Door Detection, Live Listen, and Point and Speak for Magnifier.

For those at risk of speech loss, we’ve made it possible to preserve your voice on your devices so even if you can no longer speak, you can still sound like you. It’s remarkable to see the experiences this technology helps preserve, while also protecting your privacy. pic.twitter.com/Vir3VQbhOA
— Tim Cook (@tim_cook) November 30, 2023

Personal Voice was announced earlier this year, when Sarah Herrlinger, Apple's senior director of Global Accessibility Policy and Initiatives, said, “These groundbreaking features were designed with feedback from members of disability communities, to support a diverse set of users.” Although Apple is not vocal about AI, it is quickly updating its features to integrate newer technology.

Cloning voices for healthcare has been a work in progress. Previously, patients who lost their voice to illness had to use an electrolarynx. This device is held against the patient's throat, and its vibrations produce a robotic-sounding voice.

Companies that clone videos and images also clone voices, which are used not only across entertainment but also in healthcare. ElevenLabs, Murf.ai, Resemble AI, and Respeecher, among others, create voice and video clones.

AI for Good

Building on existing features, the Personal Voice add-on enhances the user experience further. The video released by the tech giant features Tristram Ingham, a physician, academic researcher, and disability community leader who lives with facioscapulohumeral muscular dystrophy (FSHD). This disorder can eventually lead to an inability to speak.

Speaking from experience, he said, “Historically, providers have spoken for disabled people, families have spoken for disabled people. If technology can allow a voice to be preserved and maintained, that’s autonomy, that’s self-determination.” Personal Voice makes this possible by combining text-to-speech (TTS) synthesis and machine learning (ML) to create a synthetic voice that sounds like the user’s own.

The user reads aloud a series of randomly chosen text prompts, providing a sample of their voice. Acoustic analysis of the sample extracts features such as pitch, timbre, and intonation. A text-to-speech model is then trained on the user’s voice data and a large dataset of text and speech pairs. The model learns to associate acoustic features with the corresponding text and to generate synthetic speech that mimics the user’s voice.
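To make the idea of extracting pitch from a voice sample a little more concrete, here is a minimal sketch in Python. Apple has not published the details of its pipeline, so this is not its method: it uses classic autocorrelation pitch estimation as a stand-in, and a synthetic sine tone as a stand-in for a recorded prompt. The function names `synth_tone` and `estimate_pitch` and all parameter values are hypothetical, chosen only for illustration.

```python
import math


def synth_tone(freq_hz, sample_rate, n_samples):
    """Generate a pure sine tone (a stand-in for a recorded voice sample)."""
    return [
        math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n_samples)
    ]


def estimate_pitch(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) using autocorrelation.

    Assumption: the signal is roughly periodic and its pitch lies between
    fmin and fmax, a range that loosely covers typical speaking voices.
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    n = len(samples)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag`.
        score = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # The best-matching lag is the period in samples; convert it to Hz.
    return sample_rate / best_lag


# Illustrative usage: a 200 Hz tone sampled at 8 kHz for 0.1 seconds.
tone = synth_tone(200.0, 8000, 800)
print(estimate_pitch(tone, 8000))
```

For the clean tone above, the estimate should land on or very near 200 Hz. Real speech is noisier and its pitch changes over time, so a production system would analyse short overlapping frames and capture many more features than pitch alone.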

All of this is done on the user’s device, avoiding the risk of privacy invasion, a priority Apple is well known for. The created voice can be used for calls, FaceTime, and other apps. The feature works with Live Speech, which was announced around the same time. You type what you want to say, and your Personal Voice says it out loud for you.

Can be used maliciously

This feature, which will empower so many by giving them a voice, has simultaneously raised security and privacy concerns, given the increasing threat posed by deepfake technologies. The internet is full of stories of unsuspecting people and companies being scammed by voice clones and having their bank accounts cleaned out. Is it really wise to voluntarily give your voice recording to Apple?

In its announcement, the company assures users that all data processing occurs locally on the device, which reduces the risk of data breaches. Access to Personal Voice generation and management is secured through biometric locks such as Face ID or Touch ID, and its use requires unlocking the device, preventing unauthorized access. Personal Voices can be shared across devices linked to the same iCloud account and with third-party apps, but there seems to be no way to transfer the voice to any other device.

Further safeguards, such as tracking synthetic voices so they can be detected, could be considered to enhance security. “Although I suspect, given the company’s privacy and security focus, it may already include this feature. It would be good if the means of detection were made publicly available,” writes Matt Smallman, an author and security expert, on the topic.

Vinod Iyengar, an AI expert and Head of Product at Third AI, is not so optimistic. “This could very quickly become a swamp of deep fakes everywhere,” he said. Voice cloning can be used to create fabricated audio that appears genuine, making it harder to tell real recordings from fake ones.

This could open up yet another gray area and bring legal troubles in the future.

Meanwhile, speculation is rising on social media about Apple’s future direction, with suggestions that these features hint at more advanced AI integrations in upcoming products. Users are also discussing the possibility of Apple surprising everyone with new local AI tools, which would indicate a shift from cloud-based to local data processing in AI. In the future, AI could answer phone calls in your own voice without callers being able to tell the difference!

More interesting is the fact that you can use this voice through Apple’s text to speech api.

In combination with an LLM/ChatGPT and cloud based phone number this could be used to allow an AI to make and receive calls on your behalf ( in your voice ).
— Kristoph (@ikristoph) May 17, 2023

K L Krithika

K L Krithika is a tech journalist at AIM. Apart from writing tech news, she enjoys reading sci-fi and pondering impossible technologies, while trying not to confuse them with reality.