
Using ‘Cocktail Party Problem’ to Talk with Animals

Understanding the language of animals and communicating with them is one of the longest-running fields of study in technology and biological sciences alike.


Animal communication might seem simplistic at first glance. Unlike humans, animals do not appear to use any particular language, only noises, to communicate with each other. Many of the noises animals make are less a conversation in the present and more a signal about natural changes such as rain, or about water or food some distance away.

When it comes to artificial intelligence, plenty of progress has been made in the development of AGI by applying machine learning and neural networks to animals and by studying animal behaviour. However, understanding the language of animals and communicating with them remains one of the longest-running fields of study in technology and biological sciences alike.

Recently, the California-based Earth Species Project (ESP) introduced the Bioacoustic Cocktail Party Problem Network (BioCPPNet), which uses machine learning to decode non-human communication. The architecture is a modular, U-Net-based network that optimises bioacoustic source separation across diverse biological taxa.

ESP has published the code for the model.

What is bioacoustic source separation?

Informally referred to as the “cocktail party problem” (CPP), bioacoustic source separation is the problem of detecting, recognising, and extracting information from specific signals in noisy environments. While separating human speech with deep neural networks (DNNs) is a well-studied subject, the bioacoustic CPP remains difficult because noises from different, unidentifiable sources overlap.
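To make the problem concrete, here is a minimal toy sketch in Python. It is not BioCPPNet's method: it assumes two hypothetical callers that each produce a pure tone at a known frequency, mixes them, and pulls them apart with a hand-picked frequency mask. All names (`caller_a`, `separate_by_mask`, `snr_db`) are illustrative. Real animal calls overlap in both time and frequency and the sources are not known in advance, which is why a learned model is needed.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (fine for short toy signals)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def tone(freq_bin, n, amplitude=1.0):
    """A sine wave that completes `freq_bin` cycles over n samples."""
    return [amplitude * math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]


def separate_by_mask(mixture, keep_bins):
    """Keep only the chosen frequency bins (and their mirror bins), zero the rest."""
    n = len(mixture)
    spectrum = dft(mixture)
    masked = [
        spectrum[k] if (k in keep_bins or (n - k) in keep_bins) else 0
        for k in range(n)
    ]
    return idft(masked)


def snr_db(reference, estimate):
    """Signal-to-noise ratio of an estimate against the true source, in dB."""
    signal = sum(r * r for r in reference)
    noise = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10 * math.log10(signal / max(noise, 1e-12))


N = 128
caller_a = tone(5, N)                  # hypothetical animal A
caller_b = tone(12, N, amplitude=0.8)  # hypothetical animal B
mixture = [a + b for a, b in zip(caller_a, caller_b)]  # what one microphone records

estimate_a = separate_by_mask(mixture, {5})
estimate_b = separate_by_mask(mixture, {12})

print("SNR of raw mixture vs caller A:", round(snr_db(caller_a, mixture), 1), "dB")
print("SNR after masking vs caller A: ", round(snr_db(caller_a, estimate_a), 1), "dB")
```

The mask works here only because the two tones sit in separate, known frequency bins. Once the sources share frequencies, no fixed mask recovers them cleanly.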

BioCPPNet is a lightweight neural network that acts as an end-to-end source separation system: it extracts information from raw waveforms obtained directly from recordings to identify and reconstruct the individual sources. Extracting information from a herd of animals is a difficult task. For example, 58% of vocal recordings of an African elephant consisted of two concurrent signals that were hard to separate.

Aza Raskin, founder of ESP, said that the idea for BioCPPNet originates from recent advances in machine learning models that have made it possible to translate between distant human languages in real time without requiring any prior knowledge.

How does it work?

Recently, Elodie Briefer, associate professor at the University of Copenhagen, developed an algorithm that analyses pig grunts to assess positive or negative emotions in pigs. Though a great development, the algorithm worked only on pigs and could not analyse the communication of other animals such as dolphins, primates, or bees.

Raskin says that their network aims to understand communication across the entire biodiverse ecosystem. The model was tested on macaques, Egyptian fruit bats, and bottlenose dolphins. The supervised model performed two tasks: grouping sequences of input signals from different sources, and integrating simultaneous harmonic or quasi-harmonic sounds from a given signaller.

ESP used a CNN-based classifier model in BioCPPNet to label the individual identity of vocalised signals. Using data already available from previous studies, the team aimed to apply a self-supervised machine learning algorithm that relates animals' physical behaviour and actions to the audio data, to verify whether the two could be tied together.

The algorithm worked best in a closed speaker regime, where test subjects are drawn from the same distribution as the training subset. For bottlenose dolphins and macaques, the system struggled in an open speaker regime, where the large test data differed from the smaller training data. For bats, however, the model yielded comparable results in both open and closed regimes, suggesting the need for larger datasets.
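The difference between the two regimes comes down to how recordings are split into training and test sets. The following Python sketch illustrates one way to build each split. It assumes a hypothetical list of `(individual_id, clip_name)` pairs and made-up bat identifiers, and it is not ESP's actual data pipeline.

```python
from collections import defaultdict


def split_closed(recordings, test_fraction=0.25):
    """Closed regime: every individual in the test set also appears in training.

    A fraction of each individual's clips is held out for testing.
    """
    by_individual = defaultdict(list)
    for individual, clip in recordings:
        by_individual[individual].append((individual, clip))

    train, test = [], []
    for clips in by_individual.values():
        cut = max(1, int(len(clips) * (1 - test_fraction)))
        train.extend(clips[:cut])
        test.extend(clips[cut:])
    return train, test


def split_open(recordings, held_out_ids):
    """Open regime: test individuals are never seen during training."""
    train = [r for r in recordings if r[0] not in held_out_ids]
    test = [r for r in recordings if r[0] in held_out_ids]
    return train, test


def individuals(rows):
    """Set of individual IDs present in a split."""
    return {individual for individual, _ in rows}


# Hypothetical dataset: 4 bats with 8 clips each.
recordings = [(f"bat_{i}", f"clip_{i}_{j}") for i in range(4) for j in range(8)]

closed_train, closed_test = split_closed(recordings)
open_train, open_test = split_open(recordings, held_out_ids={"bat_3"})

print("closed: test individuals seen in training?",
      individuals(closed_test) <= individuals(closed_train))
print("open:   test individuals seen in training?",
      not individuals(open_test).isdisjoint(individuals(open_train)))
```

A model evaluated on the open split has to generalise to voices it has never heard, which is the harder and more realistic setting for field recordings.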

Can we talk to animals?

Since the model has to be applied to larger datasets from animal environments, Raskin says that the method could benefit from reducing its reliance on supervised training. This is a limitation, since the models worked best with larger training data.

Raskin points to ongoing studies that apply CNNs and develop self-supervised machine learning algorithms that do not require human experts to label and input data. Christian Rutz, professor of biology at the University of St Andrews, said that Hawaiian crows, a species that makes and uses tools for foraging, are believed to have a more complex set of vocalisations than other crow species. Another study, by Ari Friedlaender of the University of California, uses data from sound recorders placed in the ocean to observe the behavioural patterns of marine animals.

Robert Seyfarth, professor of psychology at the University of Pennsylvania, points out the problem of inferring meaning from animal sounds. He argues that, for animals, the same sound can have different meanings in different contexts. “Applying AI analyses to human language, with which we are so intimately familiar, is one thing, but it can be different and difficult doing it to other species,” said Seyfarth.

Raskin acknowledges the concern and says that AI alone cannot unlock communication with other species, but that research has shown animal languages to be more complex than mere noises and actions. This research opens the gate to large datasets of overlapping signals that were previously unusable, and enables researchers to use ML-based models to design management and conservation strategies for animal species.


Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.