How NLP Is Being Used To Identify Impact Of Pandemic On People’s Mental Health

A recent study published by researchers at MIT and Harvard University used Natural Language Processing (NLP) to monitor the impact of the COVID-19 pandemic on people’s mental health.

Collating Reddit posts from more than 800,000 users between 2018 and 2020, the researchers used NLP techniques such as trend analysis, supervised learning and unsupervised learning to characterise changes in the language used by mental health support groups.

The study performed classification among mental health subreddits (forums dedicated to a specific topic on Reddit) as well as non-mental health subreddits and identified important features that help understand how each mental health problem may manifest in language.
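The article does not name the classifier or the features the researchers used, so the following is only a rough sketch of the general idea of finding words that separate one subreddit from another. It scores each word by a smoothed log-ratio of its frequency in two groups of posts, which is the per-word quantity a multinomial naive Bayes classifier relies on. The function names and example posts are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_log_ratio(posts_a, posts_b, alpha=1.0):
    """Smoothed log-ratio of each word's frequency in group A versus group B.

    Positive scores mark words more typical of A, negative scores of B.
    `alpha` is an add-one style smoothing constant (an assumption).
    """
    counts_a = Counter(t for p in posts_a for t in tokenize(p))
    counts_b = Counter(t for p in posts_b for t in tokenize(p))
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    return {
        w: math.log((counts_a[w] + alpha) / total_a)
        - math.log((counts_b[w] + alpha) / total_b)
        for w in vocab
    }

# Made-up posts standing in for two subreddits.
adhd_posts = ["I cannot focus at school", "focus is hard at work"]
finance_posts = ["how to pay off debt", "rent is due and debt is growing"]

scores = word_log_ratio(adhd_posts, finance_posts)
for word in sorted(scores, key=scores.get, reverse=True)[:3]:
    print(word, round(scores[word], 2))
```

Words with large positive scores are those that would pull a post towards the first group, which is one simple way to read off "important features" from a bag-of-words model.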

Trend Analysis

The trend analysis tracked COVID-19-related tokens (words, characters, or subwords) and other language features across subreddits from January to April to observe patterns. To gauge “how much the posts were about COVID-19”, the researchers compared the number of COVID-19-related tokens with the total number of words.
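As a minimal sketch of that measure, the snippet below computes the share of COVID-19-related tokens per subreddit and month. The keyword list, the tokenizer and the sample posts are assumptions made for illustration; the study's actual lexicon is not given in this article.

```python
import re
from collections import defaultdict

# Hypothetical keyword list, not the study's actual COVID-19 lexicon.
COVID_TOKENS = {"covid", "coronavirus", "pandemic", "quarantine", "lockdown"}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def covid_share(posts):
    """Fraction of tokens that are COVID-related, per (subreddit, month).

    `posts` is an iterable of (subreddit, month, text) tuples.
    """
    covid = defaultdict(int)
    total = defaultdict(int)
    for subreddit, month, text in posts:
        tokens = tokenize(text)
        key = (subreddit, month)
        total[key] += len(tokens)
        covid[key] += sum(1 for t in tokens if t in COVID_TOKENS)
    return {k: covid[k] / total[k] for k in total if total[k] > 0}

# Made-up posts for illustration only.
posts = [
    ("healthanxiety", "2020-01", "Worried about the coronavirus news"),
    ("healthanxiety", "2020-01", "Is this pandemic going to reach us"),
    ("depression", "2020-01", "I feel tired all the time"),
]
shares = covid_share(posts)
for key, share in sorted(shares.items()):
    print(key, round(share, 3))
```

Plotting these shares month by month for each subreddit would give the kind of trend line in which an early spike, such as the one reported for ‘healthanxiety’, becomes visible.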

The result showed an early spike in the ‘healthanxiety’ subreddit as pandemic-related posts showed up in January, even before any other subreddits were posting about a possible pandemic. This evidence supports concerns regarding the prolonged stress that people with health anxiety may be experiencing.

As the pandemic progressed, a significant decrease in the number of posts was observed, which could mean that users were avoiding social media because of a perceived increase in news and discussion around the pandemic. This is concerning because it suggests individuals were not using online means to seek support during difficult times, the study said.

Across subreddits, the study found a significant increase in tokens related to isolation, such as ‘lonely’, ‘can’t see anyone’ and ‘quarantine’, and to economic stress, such as ‘debt’ and ‘rent’. At the same time, it observed a decrease in words related to motion, such as ‘walk’, ‘visit’ and ‘travel’.

Another sign of concern was the increasing use of negative language across subreddits. The increase was most pronounced in the attention-deficit/hyperactivity disorder (ADHD), eating disorder, anxiety and depression subreddits, as well as in non-mental health subreddits such as the personal finance, relationship and finance groups.

Unsupervised Analysis

For unsupervised learning, the study used clustering. It grouped posts into clusters such as ‘loneliness’ or ‘substance use’ and then tracked how those groups changed as the pandemic progressed. 

Natural clusters emerged related to ‘suicidality’ and ‘loneliness’ over the period. The number of posts in these clusters more than doubled during the pandemic as compared to the same months of the preceding year. The researchers also found the introduction of new topics specifically seeking ‘mental health help’ or ‘social interaction’.
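The year-over-year comparison described above can be sketched as a simple ratio of post counts per cluster, restricted to the same calendar months in both years. The cluster labels, dates and counts below are invented for illustration, and the function name is hypothetical; how the study assigned posts to clusters is not covered here.

```python
from collections import Counter

def cluster_growth(labelled_posts, baseline_year, target_year, months):
    """Ratio of posts per cluster in `target_year` versus `baseline_year`,
    counting only posts from the given calendar months.

    `labelled_posts` is an iterable of (cluster, year, month) tuples.
    Clusters with no baseline posts are omitted (the ratio is undefined).
    """
    baseline, target = Counter(), Counter()
    for cluster, year, month in labelled_posts:
        if month not in months:
            continue
        if year == baseline_year:
            baseline[cluster] += 1
        elif year == target_year:
            target[cluster] += 1
    return {c: target[c] / n for c, n in baseline.items()}

# Made-up cluster assignments: (cluster, year, month).
labelled = (
    [("loneliness", 2019, 3), ("loneliness", 2019, 4)]
    + [("loneliness", 2020, 3)] * 2
    + [("loneliness", 2020, 4)] * 3
    + [("loneliness", 2020, 7)]  # outside the compared months, ignored
    + [("substance use", 2019, 3), ("substance use", 2019, 4)]
    + [("substance use", 2020, 3), ("substance use", 2020, 4)]
)

growth = cluster_growth(labelled, 2019, 2020, months={3, 4})
print(growth)
```

A ratio above 2 corresponds to a cluster that "more than doubled" relative to the same months of the preceding year.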

The analysis also examined cluster-subreddit pairings to determine which clusters enriched which subreddits. While the language within clusters mostly remained the same pre- and mid-pandemic, there was variation in how closely the discussions matched their respective subreddits. For instance, the ‘addiction’ subreddit was enriched with the “Substance Use Alcohol” and “Substance Use Marijuana” clusters.

Topic Modelling

This analysis used Latent Dirichlet Allocation (LDA), an unsupervised topic-modelling method, to determine tokens that frequently appeared together in posts for a topic across mental health subreddits before the pandemic. Multiple models were created to assess topic stability across and within subreddits, based on the tokens of these topics.

While the topics that emerged from the pre-pandemic LDA largely matched the topics discussed during the pandemic, some of the tokens associated with those topics shifted once the pandemic began. For instance, the pre-pandemic model combined autism and ADHD into the same topic, with tokens mostly related to school and work, whereas the mid-pandemic LDA model split the tokens related to ADHD and autism.

The analysis also used LDA to measure the similarity between pairs of mental health subreddits. Overall, it found that the more a subreddit discussed the COVID-19 pandemic, the more similar it became to the ‘healthanxiety’ subreddit, which suggests that symptoms of health anxiety might have increased in psychiatric populations.
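The article does not say which similarity measure was used. As one plausible sketch, each subreddit can be represented by its average topic distribution from the LDA model and compared with cosine similarity. The topic mixtures below are invented numbers and the variable names are hypothetical.

```python
import math

def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

# Made-up average topic mixtures over three topics.
healthanxiety = [0.7, 0.2, 0.1]
low_covid_subreddit = [0.1, 0.3, 0.6]   # rarely discusses the pandemic
high_covid_subreddit = [0.5, 0.3, 0.2]  # often discusses the pandemic

sim_low = cosine_similarity(healthanxiety, low_covid_subreddit)
sim_high = cosine_similarity(healthanxiety, high_covid_subreddit)
print(round(sim_low, 3), round(sim_high, 3))
```

In this toy data the subreddit with more pandemic discussion scores closer to ‘healthanxiety’, mirroring the pattern the study reports. Other measures for comparing probability distributions, such as Jensen-Shannon divergence, would be equally reasonable choices.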

Wrapping Up

As the COVID-19 pandemic impacts mental health, this type of analysis can help identify how people with different mental health problems have been affected differently. Based on the study, one can infer the kinds of changes that appear in the language of different groups of mental health patients when situations like a pandemic arise.

Going forward, this kind of analysis of Reddit posts could help identify the right paths of treatment and make resources available accordingly.

Kashyap Raibagi

Kashyap currently works as a Tech Journalist at Analytics India Magazine (AIM). Reach out at kashyap.raibagi@analyticsindiamag.com