OpenAI has announced a new text-to-speech feature that lets ChatGPT read its responses aloud. A user can select a response and choose the ‘read aloud’ option to hear ChatGPT speak it. This comes in handy when ChatGPT gives lengthy responses. The feature is available on iOS, Android and the web version.
Source: X
Voice Modality Gains Speed
Akin to the text-to-speech feature in Google Assistant, which is said to soon be powered by generative AI, OpenAI’s latest voice feature builds on the voice functionality announced last year. That earlier feature lets users hold a hands-free conversation with the AI bot and choose the type of voice.
This update comes at a time when voice features are increasingly being integrated into multimodal AI models. Apart from ChatGPT, Google’s latest model, Gemini, is also multimodal. AI startup Pika Labs is experimenting with audio features for AI-generated videos.
Pika is not the only one. Alibaba recently released EMO (Emote Portrait Alive), an AI generator that produces expressive portrait videos using audio2video diffusion models. The company shared videos in which still images were made to talk or sing with expressive facial gestures.
Not a Great Reception
OpenAI’s latest voice announcement did not excite users much, especially since Anthropic had just launched its latest model, Claude 3, which is said to surpass GPT-4 on a number of parameters.
Source: X
Mocking OpenAI’s new product announcement, users are asking for GPT-5. It is likely that OpenAI is gearing up for something bigger.