Is Your Data Lake Smart Enough To Handle Building Artificial Intelligence Models?

Understanding data well enough to make better decisions is a crucial task for any organisation. This is where a data lake comes into the picture. A data lake is a storage repository that holds a vast amount of raw data in its native format, using a flat architecture, until it is needed. When a business question arises, the enterprise queries the data lake for the relevant data, and that smaller set can then be analysed to help answer the question.

Advantages Of Using A Data Lake

A data lake offers several advantages, such as:

  • Flexible ways to gain insights
  • Extraction of data in any format
  • Greater business agility
  • No data silos

In addition, different types of analytics, such as big data analytics, real-time analytics and SQL queries, can be run on the same data to gain insights.


How To Build An AI-Friendly Data Lake

In one of the talks at the recently concluded Cypher 2019, hosted by Analytics India Magazine, Nallan Sriraman, Global Head of Technology, Data and Analytics at Unilever, talked about the importance of data lakes and how to hire the right data scientists for an organisation.

He suggested the following steps in building an AI-friendly data lake:

  • Protect the Data Lake From Becoming a Swamp: While collecting data, an organisation runs the risk of creating a data swamp. A data swamp is an unorganised data lake, which makes it difficult for the organisation to extract insights from its data. This can happen for various reasons, such as an abundance of irrelevant data or a lack of metadata.
  • Establish Transparency: When building a model or algorithm, it is important to build it in a way that makes it debuggable. It must be clear and transparent enough that even someone who is not an expert can understand it. Along with building transparent algorithms, it is equally important to establish transparency in the data used to build the machine learning algorithms.
  • Hire Data Scientists Without Tunnel Vision: While hiring a data scientist, it is important to take a broad view rather than have tunnel vision. Instead of limiting the search to a narrow profile, one should look at other talent pools too.
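The first step above names a lack of metadata as one cause of a data swamp. As a minimal sketch of how that could be checked, the Python below scans an in-memory catalog and reports datasets with missing metadata. The `DatasetEntry` class, the `REQUIRED_FIELDS` list and the sample paths are hypothetical and chosen only for illustration; they were not part of the talk and do not correspond to any particular catalog product.

```python
from dataclasses import dataclass, field

# Hypothetical metadata fields; a real catalog would define its own.
REQUIRED_FIELDS = ("owner", "description", "source", "schema")


@dataclass
class DatasetEntry:
    """One dataset registered in an illustrative, in-memory lake catalog."""
    path: str
    metadata: dict = field(default_factory=dict)


def missing_metadata(entry, required=REQUIRED_FIELDS):
    """Return the required metadata fields that are absent or empty."""
    return [name for name in required if not entry.metadata.get(name)]


def swamp_report(entries):
    """Map each under-documented dataset path to its missing fields."""
    report = {}
    for entry in entries:
        missing = missing_metadata(entry)
        if missing:
            report[entry.path] = missing
    return report


# Made-up sample entries, for illustration only.
catalog = [
    DatasetEntry(
        "sales/2019/orders.parquet",
        {"owner": "sales-team", "description": "Daily orders",
         "source": "erp", "schema": "orders_v2"},
    ),
    DatasetEntry(
        "tmp/export_final_v3.csv",
        {"owner": "unknown", "description": ""},
    ),
]

print(swamp_report(catalog))
```

A check like this only detects missing fields; whether the metadata that is present is accurate still needs human review.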

Why Do AI Predictions Go Wrong?

While AI techniques have delivered some groundbreaking results, they also pose several risks. Even tech giants like Microsoft have seen the dark side of AI.

One of the important factors responsible for bias and failures in AI models is a lack of data coverage, which is a major problem when trying to build a robust machine learning model. The behaviour of an algorithm depends on the data it is fed. Because suitable data is often unavailable, researchers frequently fall back on whatever data is freely available. Such data can help in building an AI model, but the model will not be robust and may carry biases.
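One rough way to spot a coverage gap before training is to compare how often each expected category appears in the data. The sketch below does this for a single field. The `coverage_report` function, the `min_share` threshold of 10% and the sample `region` values are assumptions made for illustration, not a method described in the session.

```python
from collections import Counter


def coverage_report(records, key, expected_values, min_share=0.1):
    """Return each expected value's share of the records and flag low ones.

    min_share is an arbitrary illustrative threshold, not a recommended value.
    """
    counts = Counter(record.get(key) for record in records)
    total = sum(counts.values())
    report = {}
    for value in expected_values:
        share = counts.get(value, 0) / total if total else 0.0
        report[value] = {"share": share, "under_covered": share < min_share}
    return report


# Made-up sample: "west" is expected but never appears in the data.
records = (
    [{"region": "north"}] * 70
    + [{"region": "south"}] * 25
    + [{"region": "east"}] * 5
)

report = coverage_report(records, "region", ["north", "south", "east", "west"])
for value, stats in report.items():
    print(value, round(stats["share"], 2), stats["under_covered"])
```

A simple count like this cannot prove that a dataset is unbiased; it only surfaces categories that are thin or missing so that they can be investigated.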

Ambika Choudhury
A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.
