Data science has been playing a vital role in our fight against the outbreak of COVID-19. Data scientists have been assisting governments in analyzing trends and extracting relevant insights from them. However, we have witnessed a few challenges in this field, and those need to be addressed. As a result, we might see a few changes in the data science landscape after the crisis is over, and we must be prepared for that.
Advancement In Streaming Analytics
Advanced visualization has helped governments and researchers closely watch day-to-day developments of COVID-19 and make decisions effectively. Beyond descriptive statistics, analyzing correlations among various factors is allowing decision-makers to compare and understand the impact of the pandemic. Multiple organizations are processing a colossal amount of data and providing visualizations to demonstrate how the virus spread.
However, such visualizations emerged only after months, and by then, it was too late for the world to take action to effectively contain the virus. Nevertheless, those visualizations were helpful in the decision-making process to further reduce the impact of COVID-19. Had such information been available earlier, it could have helped international institutions, such as the WHO, declare an emergency in the very early stages.
Undoubtedly, today we can get COVID-19 data daily, but the information flow has only started in the last few weeks. Besides, real-time data is still missing because of the lack of integration across electronic health record systems. Therefore, in the future, organizations and governments should come together to build an ecosystem that will help the data science community bring more value with real-time insights.
Democratization Of Medical Health Records
AI quickly took center stage in various sectors, such as finance and media, but was slow in penetrating the healthcare sector due to concerns about misuse of patient health record data. Another reason for keeping AI at bay in healthcare is that an inaccurate prediction can lead a doctor to suggest an incorrect treatment that can directly harm the patient.
While these reasons are not unfounded, the contribution of data science in the current pandemic has demonstrated how beneficial it could be to leverage patient data as well. This may help experts and decision-makers build policy frameworks that allow data science to contribute to developing medicines and other healthcare offerings. AI has made a few breakthroughs with disease prediction and drug discovery, but researchers are highly critical of its use due to the shortage of diverse data.
Cloud has been driving businesses to scale quickly while cutting operational costs. But some companies have been critical of migrating every business operation to the cloud due to privacy reasons. Most importantly, data science departments usually stick to on-premise infrastructure for their mission-critical projects. Such projects have now taken a hit due to the lockdown of cities. This has prompted companies to move all active projects to the cloud to bring flexibility in working collaboratively from remote locations.
Robust Natural Language Processing Solutions
Fake news and conspiracy theories about COVID-19 have caused a lot of confusion among people and hindered governments' efforts to ensure that people comply with their lockdown initiatives. This is not new, as fake news already abounded on social media and popular chat applications.
Various measures have been taken by companies like WhatsApp to limit users’ ability to share on the go, and YouTube has reduced the recommendation of conspiracy theories as well. But these have not eliminated fake news from social media platforms because natural language processing remains a hard problem. Researchers may now focus more intensively on building solutions that precisely identify fake news without human help.
Data scientists are predicting the spread of COVID-19 along with information on the number of lives it is likely to affect. However, forecasting with incomplete data can further confuse people during these challenging times. “Existing datasets (on COVID-19) are incredibly biased. For instance, while calculating the mortality rate, normally developers look at the deaths per confirmed case. But, the assumption is that we have captured all of the confirmed cases, which is not valid.
We are bottlenecked by the number of tests, and only the sickest are diagnosed,” according to Neil Cheng, Senior Data Scientist at Akamai. The community will have to think beyond just prediction and refrain from using whatever data they get without critically examining the bias in the information that is being collected.
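The bias in the mortality-rate calculation can be made concrete with a small sketch. The figures below are purely hypothetical and are not real COVID-19 data; the `ascertainment` parameter (the assumed fraction of all infections that testing actually confirms) and both function names are illustrative assumptions, and the sketch further assumes that all deaths are counted, which may not hold in practice either.

```python
def naive_fatality_rate(deaths: int, confirmed: int) -> float:
    """Deaths per confirmed case, assuming every case has been captured."""
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    return deaths / confirmed


def adjusted_fatality_rate(deaths: int, confirmed: int, ascertainment: float) -> float:
    """Deaths per estimated infection.

    `ascertainment` is the assumed fraction of all infections that were
    confirmed by testing (hypothetical; in practice it is unknown and
    must itself be estimated).
    """
    if not 0 < ascertainment <= 1:
        raise ValueError("ascertainment must be in (0, 1]")
    estimated_infections = confirmed / ascertainment
    return deaths / estimated_infections


# Hypothetical counts, chosen only to illustrate the arithmetic.
deaths, confirmed = 50, 1000

print(f"naive rate: {naive_fatality_rate(deaths, confirmed):.2%}")
for ascertainment in (1.0, 0.5, 0.1):
    rate = adjusted_fatality_rate(deaths, confirmed, ascertainment)
    print(f"ascertainment {ascertainment:.0%}: adjusted rate {rate:.2%}")
```

With the same death and case counts, the estimated rate shrinks in proportion to the assumed ascertainment, which is why a figure computed only from confirmed cases can overstate mortality when only the sickest are tested.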