How NLP Is Being Used To Identify Impact Of Pandemic On People’s Mental Health


A recent study by researchers at MIT and Harvard University used Natural Language Processing (NLP) to monitor the impact of the COVID-19 pandemic on people’s mental health.

Collating Reddit posts from more than 800,000 users between 2018 and 2020, the researchers used NLP techniques such as trend analysis, supervised learning and unsupervised learning to characterise changes in the language used in mental health support groups.

The study performed classification among mental health subreddits (forums dedicated to a specific topic on Reddit) as well as non-mental health subreddits and identified important features that help understand how each mental health problem may manifest in language.

Trend Analysis

The trend analysis tracked COVID-19-related tokens (words, characters or subwords) and language features across subreddits from January to April 2020 to observe patterns. To gauge “how much the posts were about COVID-19”, the researchers measured the number of COVID-19-related tokens relative to the total number of words.
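The token-share measure described above can be sketched in a few lines of Python. This is only an illustration, not the study's code: the keyword list `COVID_TOKENS`, the simple regex tokeniser and the sample posts are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical keyword list; the study's actual token list is not given in this article.
COVID_TOKENS = {"covid", "coronavirus", "pandemic", "quarantine", "lockdown"}

def tokenize(text):
    """Lowercase a post and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def covid_share_by_month(posts):
    """posts: iterable of (subreddit, 'YYYY-MM', text) tuples.

    Returns {(subreddit, month): fraction of all tokens that are COVID-related}.
    """
    covid = defaultdict(int)
    total = defaultdict(int)
    for subreddit, month, text in posts:
        tokens = tokenize(text)
        key = (subreddit, month)
        total[key] += len(tokens)
        covid[key] += sum(1 for t in tokens if t in COVID_TOKENS)
    return {key: covid[key] / total[key] for key in total if total[key] > 0}

# Made-up posts, for illustration only.
posts = [
    ("healthanxiety", "2020-01", "I keep reading about the coronavirus and cannot sleep"),
    ("healthanxiety", "2020-01", "Is this cough serious"),
    ("fitness", "2020-01", "Best routine for beginners"),
    ("fitness", "2020-04", "Home workouts during lockdown and quarantine"),
]
shares = covid_share_by_month(posts)
for key in sorted(shares):
    print(key, round(shares[key], 3))
```

Plotting these shares month by month for each subreddit would show which communities started discussing the pandemic first.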

The result showed an early spike in the ‘healthanxiety’ subreddit as pandemic-related posts showed up in January, even before any other subreddits were posting about a possible pandemic. This evidence supports concerns regarding the prolonged stress that people with health anxiety may be experiencing.

As the pandemic progressed, the number of posts fell significantly, which could mean that users were avoiding social media because of a perceived increase in news and discussion about the pandemic. This is concerning, the study said, because it suggests individuals were not using online channels to seek support during a difficult time.

Across subreddits, the study found a significant increase in tokens related to isolation, such as ‘lonely’, ‘can’t see anyone’ and ‘quarantine’, and to economic stress, such as ‘debt’ and ‘rent’. At the same time, it observed a decrease in words related to motion, such as ‘walk’, ‘visit’ and ‘travel’.
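A lexicon comparison of this kind can be approximated by counting how often words from each category occur before and during the pandemic. The sketch below is a simplified illustration: the `LEXICONS` dictionary holds only the single-word examples quoted above (multi-word phrases such as ‘can’t see anyone’ are not handled), and the sample posts are made up.

```python
import re

# Small illustrative lexicons built from the examples quoted in the article;
# the study's full word lists are not reproduced here.
LEXICONS = {
    "isolation": {"lonely", "quarantine"},
    "economic_stress": {"debt", "rent"},
    "motion": {"walk", "visit", "travel"},
}

def lexicon_rates(texts):
    """Return, for each lexicon, the fraction of all tokens that belong to it."""
    tokens = [t for text in texts for t in re.findall(r"[a-z']+", text.lower())]
    if not tokens:
        return {name: 0.0 for name in LEXICONS}
    return {
        name: sum(1 for t in tokens if t in words) / len(tokens)
        for name, words in LEXICONS.items()
    }

# Made-up posts standing in for pre-pandemic and mid-pandemic samples.
pre = ["We travel to visit family every month", "I walk to work"]
mid = ["I feel lonely in quarantine", "Worried about rent and debt"]

pre_rates = lexicon_rates(pre)
mid_rates = lexicon_rates(mid)

# Positive values mean the category became more frequent during the pandemic.
change = {name: mid_rates[name] - pre_rates[name] for name in LEXICONS}
print(change)
```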

Another sign of concern was the increasing use of negative language across subreddits. The increase was most pronounced in the attention-deficit/hyperactivity disorder (ADHD), eating disorder, anxiety and depression subreddits, as well as in non-mental health subreddits such as the personal finance and relationship groups.

Unsupervised Analysis

For unsupervised learning, the study used clustering. It grouped posts into clusters such as ‘loneliness’ or ‘substance use’ and then tracked how those groups changed as the pandemic progressed. 

Natural clusters emerged related to ‘suicidality’ and ‘loneliness’ over the period. The number of posts in these clusters more than doubled during the pandemic as compared to the same months of the preceding year. The researchers also found the introduction of new topics specifically seeking ‘mental health help’ or ‘social interaction’.
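Once each post carries a cluster label, the year-over-year comparison reduces to counting posts per cluster in the same months of two years. The following sketch assumes the posts have already been labelled by a clustering step that is not shown; the function name `yoy_ratio` and the sample counts are hypothetical.

```python
from collections import Counter

def yoy_ratio(labelled_posts, months, year, prev_year):
    """labelled_posts: iterable of (cluster_label, 'YYYY-MM') tuples.

    Counts posts per cluster in the given months of each year and returns
    {cluster: count in `year` / count in `prev_year`}.
    Clusters with no posts in `prev_year` are skipped to avoid dividing by zero.
    """
    counts = Counter()
    for cluster, date in labelled_posts:
        y, m = date.split("-")
        if m in months:
            counts[(cluster, int(y))] += 1
    clusters = {cluster for cluster, _ in counts}
    return {
        c: counts[(c, year)] / counts[(c, prev_year)]
        for c in clusters
        if counts[(c, prev_year)] > 0
    }

# Made-up labelled posts, for illustration only.
labelled = (
    [("loneliness", "2019-03")] * 2
    + [("loneliness", "2020-03")] * 5
    + [("other", "2019-04")] * 4
    + [("other", "2020-04")] * 4
    + [("loneliness", "2019-08")]  # outside the compared months, ignored
)
ratios = yoy_ratio(labelled, months={"03", "04"}, year=2020, prev_year=2019)
print(ratios)
```

A ratio above 2 would correspond to a cluster whose post count more than doubled compared with the same months of the preceding year.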

The analysis also examined cluster-subreddit pairings to determine which clusters enriched which subreddits. While the language within clusters mostly remained the same before and during the pandemic, what varied was how closely the discussions in a cluster aligned with the respective subreddit. For instance, the ‘addiction’ subreddit was enriched with the “Substance Use Alcohol” and “Substance Use Marijuana” clusters.
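One simple way to think about enrichment is as a lift ratio: how much more common a cluster is within a subreddit than across all posts. The article does not say which measure the study used, so the lift below is an assumption chosen for illustration, and the (subreddit, cluster) pairs are made up.

```python
from collections import Counter

def enrichment(pairs):
    """pairs: iterable of (subreddit, cluster) tuples, one per post.

    Returns {(subreddit, cluster): lift}, where lift is the cluster's share
    of posts inside the subreddit divided by its share of all posts.
    Values above 1 mean the cluster is over-represented in that subreddit.
    """
    pairs = list(pairs)
    n = len(pairs)
    pair_counts = Counter(pairs)
    subreddit_counts = Counter(s for s, _ in pairs)
    cluster_counts = Counter(c for _, c in pairs)
    return {
        (s, c): (k / subreddit_counts[s]) / (cluster_counts[c] / n)
        for (s, c), k in pair_counts.items()
    }

# Made-up assignments, for illustration only.
pairs = (
    [("addiction", "substance_use_alcohol")] * 3
    + [("addiction", "loneliness")]
    + [("lonely", "substance_use_alcohol")]
    + [("lonely", "loneliness")] * 3
)
lift = enrichment(pairs)
print(lift[("addiction", "substance_use_alcohol")])
```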

Topic Modelling

This analysis used Latent Dirichlet Allocation (LDA), an unsupervised topic-modelling technique, to find tokens that frequently appeared together in posts across mental health subreddits before the pandemic, with each group of co-occurring tokens forming a topic. Multiple models were built to assess how stable these topics were across and within subreddits, based on the tokens that defined them.

While the topics that emerged from the mid-pandemic LDA largely matched those discussed before the pandemic, some of the tokens associated with the topics shifted during the pandemic. For instance, autism and ADHD had previously been combined into a single topic, with tokens mostly related to school and work. The mid-pandemic LDA model split the tokens related to ADHD and autism into separate topics.

The analysis also used LDA to measure the similarity between pairs of mental health subreddits. Overall, it found that the more a subreddit discussed the COVID-19 pandemic, the more similar it became to the ‘healthanxiety’ subreddit, which may suggest that symptoms of health anxiety increased in psychiatric populations.
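If each subreddit is summarised by its average topic mixture from a topic model, the similarity between two subreddits can be expressed as a single number. The sketch below uses cosine similarity, which is an assumption; the article does not state which similarity measure the study applied. The topic mixtures are invented for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical average topic mixtures over three topics, e.g.
# [health worries, school and work, relationships]. Not taken from the study.
healthanxiety = [0.7, 0.2, 0.1]
anxiety_pre = [0.2, 0.5, 0.3]   # before the pandemic
anxiety_mid = [0.5, 0.3, 0.2]   # during the pandemic

sim_pre = cosine(anxiety_pre, healthanxiety)
sim_mid = cosine(anxiety_mid, healthanxiety)
print(round(sim_pre, 3), round(sim_mid, 3))
```

In this toy example the mid-pandemic mixture scores higher than the pre-pandemic one, which is the kind of drift towards ‘healthanxiety’ that the study describes.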

Wrapping Up

As the COVID-19 pandemic affects mental health, this type of analysis can help identify how people with different mental health problems have been affected differently. The study also gives a sense of how the language used by people with different mental health conditions changes when a situation like a pandemic arises.

Going forward, this kind of analysis, applied to Reddit, could help identify suitable paths of treatment and make resources available accordingly.

Kashyap Raibagi
Kashyap currently works as a Tech Journalist at Analytics India Magazine (AIM).
