Hadoop has completed a decade already!

Today, the big data universe revolves around Hadoop, an open-source distributed platform for storing and processing data of high volume, velocity and variety. What started as an open-source innovation has changed the face of database technology and business in a short span of 10 years!

The explosion of big data in 2014 has led companies across industries such as advertising, retail, healthcare, social media, manufacturing, telecommunications and government to invest in and adopt Hadoop.

Inspired by research papers published by Google, Hadoop was first seeded in October 2002 as ‘Nutch’ by Doug Cutting and Mike Cafarella. Named after a toy elephant owned by Cutting’s son, Hadoop has evolved from its infancy as ‘Nutch’ into the phenomenal data analysis software that it is today.


We celebrate Hadoop’s 10th birthday today because Doug Cutting officially started ‘Hadoop’ as a sub-project of ‘Nutch’ on 28th January 2006, after joining Yahoo! that same month.




A year later, the Apache Software Foundation took it from Yahoo! and it developed into a vast framework. Cloudera was founded in August 2008 to commercialize Hadoop, and in 2009 Doug Cutting joined Cloudera as chief architect.

Over these years, the Apache Hadoop framework has developed the Hadoop Distributed File System (HDFS), MapReduce, Hadoop YARN and Hadoop Common as its core modules. Besides these, Hadoop has a vast ecosystem of technologies, such as Apache HBase, Apache Pig and Apache Spark, and the number and scope of such projects continue to expand.

Companies use Hadoop to analyze customer behaviour, process call center activity and process data from social media, and then make decisions in real time to understand customer needs, mitigate problems and gain a competitive advantage. It helps control big data processing costs by distributing computing across commodity servers instead of using expensive, specialized servers. Typically, a cluster has several master nodes and hundreds or thousands of worker nodes, which limits the impact of any single failure.
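To make the idea of distributing computation a little more concrete, here is a minimal sketch of the map, shuffle and reduce steps behind the MapReduce model, written in plain Python. It runs in a single process and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are illustrative assumptions, and each inner list simply stands in for a block of data that a worker node might process.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(splits):
    """Run map, shuffle and reduce over a list of input splits.

    Each split stands in for a block of data handled by one worker;
    here everything runs sequentially in one process.
    """
    mapped = (
        pair
        for split in splits
        for record in split
        for pair in map_phase(record)
    )
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two splits, each holding one line of text.
splits = [["big data big"], ["data on hadoop"]]
counts = word_count(splits)
print(counts)
```

In a real cluster the map and reduce steps would run in parallel on different worker nodes, with the framework handling the shuffle and rerunning failed tasks; the sketch only shows the shape of the computation.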

Moreover, the commercial ecosystem that has grown around Hadoop has seen even more astounding growth. In its early days, the open source community pushed it towards commercialization; as a result, it gained so much momentum that the open source community now has to pull it back.

With the growth of self-service data and the Internet of Things, new Hadoop business models are emerging as competition continues to catalyze the transformation of Hadoop. Self-service data exploration gives companies the ability to explore the breadth and depth of any data that comes in from machine logs, data centers and social media, and to harness it for new insights and business processes. The ease of continuous access to data and of processing it in real time has made Hadoop an indispensable part of businesses today.

Apoorva Verma
As the Content Strategist for Analytics India Magazine, Apoorva takes care of editing and writing articles, covering analytics news, conducting interviews and managing social media marketing.
