World Mental Health Day on 10th October cast a long-overdue spotlight on one of the most neglected areas of public health. Nearly a billion people have a mental disorder, and suicide occurs every 40 seconds. In developing countries, under 25% of people with mental, substance use or neurological disorders receive treatment. COVID-19 has worsened the crisis; with healthcare services disrupted, the hidden pandemic of mental ill-health remains largely unaddressed.
In this article, we share some perspectives on the role ML can play and an example of a real-life AI solution we built at Tiger Analytics to address a specific mental-health-related problem.
ML Is Already A Part of Physical Healthcare
Algorithms process Magnetic Resonance Imaging (MRI) scans. Clinical notes are parsed to pinpoint the onset of illnesses earlier than physicians can discern them. Cardiovascular disease and diabetes — two of the leading causes of death worldwide — are diagnosed using neural networks, decision trees and support vector machines. Clinical trials are monitored and assessed remotely to maintain physical distancing protocols.
These are ‘invasive’ approaches that automate what can be, and usually is, done by humans, but at greater speed and scale. In the field of mental health, ML can be applied in non-invasive, more humanistic ways that nudge physicians towards better treatment strategies.
Clinical Trials Of Mental Health Drugs
In clinical trials of mental health drugs, physicians and patients engage in detailed discussions of the patients’ mental state at each treatment stage. The efficacy of these drugs is determined using a combination of certain biomarkers, body vitals, and mental state as determined by the patient’s interaction with the physician.
The problem with this approach is that a key input for determining drug efficacy is the set of responses given by a person who is going through mental health issues. To reduce errors, these interviews are recorded, and multiple experts listen to the long recordings to evaluate the quality of each interview and the conclusions drawn from it.
Two concerns arise. First, time and budget allow only a sample of interviews to be evaluated, which increases the risk of incorrect conclusions about drug efficacy. Second, patients may not express everything they are feeling in words, so a range of emotions may be missed or misinterpreted, producing incorrect evaluation scores.
Tiger’s ML Models ‘Hear’ What’s Left Unsaid
Working with a pharmaceutical company, Tiger Analytics used speech analytics to identify ‘good’ interviews, i.e., ones that meet quality standards for inclusion in clinical trials, minimising the number of interviews that were excluded after evaluation, and saving time and expense.
As a data scientist, you face several typical challenges when working on a problem like this. What types of signal processing can you use to extract audio features? What non-audio features would be useful? How do you remove background noise from the interviews? How do you look for patterns in language? How do you account for reviewers’ biases, which are inevitable in subjective events like interviews?
Below we walk you through the process the Tiger Analytics team used to develop the solution.
Step 1: Pre-processing
We removed background noise from the digital audio files and split them into alternating sections of speech and silence. We then grouped the speech sections into clusters, with each cluster representing one speaker. Finally, we created a full transcript of the interview to enable language processing.
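The speech/silence split described above can be illustrated with a simple energy threshold. This is a minimal sketch, not the pipeline Tiger Analytics used: the function names, the 400-sample frame length and the 0.02 threshold are assumptions chosen for illustration, and noise removal and speaker clustering are not shown.

```python
import math


def frame_rms(samples, frame_len):
    """Return the root-mean-square energy of each non-overlapping frame."""
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    return [
        math.sqrt(sum(x * x for x in samples[s:s + frame_len]) / frame_len)
        for s in starts
    ]


def split_speech_silence(samples, frame_len=400, threshold=0.02):
    """Label each frame as speech or silence, then merge consecutive frames
    with the same label into (label, start_frame, end_frame) segments.
    end_frame is exclusive."""
    segments = []
    for idx, rms in enumerate(frame_rms(samples, frame_len)):
        label = "speech" if rms >= threshold else "silence"
        if segments and segments[-1][0] == label:
            # Extend the current segment by one frame.
            segments[-1] = (label, segments[-1][1], idx + 1)
        else:
            segments.append((label, idx, idx + 1))
    return segments


# Synthetic example: silence, a 200 Hz tone sampled at 8 kHz, then silence.
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
signal = silence + tone + silence
segments = split_speech_silence(signal)
```

On real recordings a fixed threshold is fragile, so the threshold would normally be tuned per recording or replaced by a trained voice activity detector.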
Step 2: Feature Extraction
We extracted several hundred features of the audio, from direct aspects like interview duration and voice amplitude to the more abstract speech rates, frequency-wise energy content and Mel-frequency cepstral coefficients (MFCCs). We used NLP to extract several features from the interview transcript. These captured the unique personal characteristics of individual speakers.
Beyond this, we captured contextual features such as interview length, the tone of the interviewer, any gender-related patterns, the interview load on the physician, and the time of day, among many others.
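To make the simpler audio features concrete, the sketch below computes duration, peak amplitude, RMS amplitude, zero-crossing rate and speech rate from raw samples and a transcript. The function name and the choice of features are assumptions for illustration only. MFCCs and frequency-wise energy are not shown, since they would normally come from a dedicated audio library.

```python
import math


def basic_audio_features(samples, sample_rate, transcript):
    """Compute a handful of simple per-interview features.

    samples: list of floats in [-1, 1]; sample_rate: samples per second;
    transcript: plain-text transcript of the same recording.
    """
    duration_s = len(samples) / sample_rate
    peak_amplitude = max(abs(x) for x in samples)
    rms_amplitude = math.sqrt(sum(x * x for x in samples) / len(samples))
    # Fraction of adjacent sample pairs whose sign differs.
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    zero_crossing_rate = crossings / (len(samples) - 1)
    # Speech rate in words per minute, using whitespace tokenisation.
    words_per_minute = len(transcript.split()) / (duration_s / 60)
    return {
        "duration_s": duration_s,
        "peak_amplitude": peak_amplitude,
        "rms_amplitude": rms_amplitude,
        "zero_crossing_rate": zero_crossing_rate,
        "words_per_minute": words_per_minute,
    }


# One second of an alternating signal at 8 kHz with a seven-word transcript.
example_samples = [0.5, -0.5] * 4000
features = basic_audio_features(
    example_samples, 8000, "I have been feeling better this week"
)
```

In practice these values would be computed per speaker segment rather than over the whole recording, so that the patient and the interviewer can be described separately.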
Step 3: Prediction
We constructed an Interview Quality Score (IQS) representing the combination of several qualitative and quantitative aspects of each interview. We ensembled boosted trees, support vector machines, and random forests to segregate high-quality interviews from those with issues.
This model effectively pre-screened about 75% of the interviews as good or bad and flagged the remainder as uncertain. Reviewers could now work faster and more productively, focusing only on the interviews where the model was not confident. Overall prediction accuracy improved 2.5x, with some segments returning over 90% accuracy.
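The pre-screening idea can be sketched as averaging the class probabilities from several models and only auto-labelling interviews when the average is far from 0.5. This is an illustrative sketch under stated assumptions: the averaging rule, the 0.2 and 0.8 cut-offs, the function names and the example probabilities are all hypothetical, and the article does not describe how the actual ensemble or the IQS were combined.

```python
def ensemble_probability(model_probs):
    """Average the 'good interview' probabilities from several models
    (for example boosted trees, an SVM and a random forest)."""
    return sum(model_probs) / len(model_probs)


def triage(prob_good, low=0.2, high=0.8):
    """Auto-label confident cases and route the rest to human reviewers.
    The low/high thresholds are hypothetical and would need tuning."""
    if prob_good >= high:
        return "good"
    if prob_good <= low:
        return "bad"
    return "needs_review"


# Hypothetical per-model probabilities for three interviews.
model_outputs = {
    "interview_a": [0.90, 0.95, 0.85],
    "interview_b": [0.10, 0.05, 0.15],
    "interview_c": [0.60, 0.40, 0.50],
}
decisions = {
    name: triage(ensemble_probability(probs))
    for name, probs in model_outputs.items()
}
```

Widening the gap between the two thresholds sends more interviews to reviewers but lowers the chance of an incorrect automatic label, so the cut-offs trade reviewer time against error risk.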
Beyond Clinical Trials
The analyses provided independent insights regarding pauses, paralinguistics (tone of voice, loudness, inflexion, pitch), speech disfluency (fillers like ‘er’, ‘um’), and physician performance during such interviews.
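As a small illustration of one of these signals, speech disfluency can be approximated by counting filler words in a transcript. This is a minimal sketch, assuming a plain-text transcript and a filler list containing only the two examples named above. The function name and tokenisation rule are illustrative, not the method used in the project.

```python
import re

# Filler words taken from the examples in the text; extend as needed.
FILLERS = ("er", "um")


def disfluency_stats(transcript):
    """Return (filler_count, filler_rate) for a transcript, where
    filler_rate is the share of all words that are fillers."""
    words = re.findall(r"[a-z']+", transcript.lower())
    filler_count = sum(1 for w in words if w in FILLERS)
    filler_rate = filler_count / len(words) if words else 0.0
    return filler_count, filler_rate


filler_count, filler_rate = disfluency_stats(
    "Um, I think, er, it has been, um, better."
)
```

Matching whole tokens rather than substrings avoids counting the "er" inside words such as "better".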
These models have wider applicability beyond clinical trials. Physicians can use model insights to guide treatment and therapy, leading to better mental health outcomes for their patients, whether in clinical trials or in practice, and helping to address one of the critical public health challenges of our time.