Has the rise of containers and microservices challenged the dominance of Hadoop and other big data platform players? The shift to containers has clearly turned the virtualisation landscape on its head. Now, leading cloud majors are providing a range of big data services with or without Hadoop. By embracing containers, organisations are taking a crucial step toward reimagining themselves as agile digital enterprises capable of accelerating the delivery of innovative products, services, and customer experiences.
Effect On Enterprises
Now that container and microservices technologies have become part of the enterprise stack, the usage of Spark and Hadoop has definitely been affected. Developers are keen to embrace the simplicity and agility of containers, and microservices are a foundation of the DevOps model. A containerised architecture allows programmers and DevOps teams to substantially change workflows on the development side of data centres. Enterprises have made containers part of their architecture strategy, and the container revolution has now been extended to big data applications as well.
The shift to microservices and containerisation has boosted the decoupling of services into smaller entities. Experts note that in a monolithic architecture a failure in the code can impact a slew of functions, whereas in microservices this effect is minimised. Containers also abstract away major language constraints and library dependencies, so developers are not tied to JVM-specific tools and can package any library they need inside a container.
Over the years, Hadoop drew a considerable amount of attention for being slow, and in a way this helped push the discussion towards more efficient big data and analytics platforms. Leading big data vendors Hortonworks and Cloudera (which have now merged), as well as MapR, offered enterprise Hadoop distributions. Hadoop also paved the way for Apache Spark, which made a significant impact on analytics and data processing and, in a way, also helped fuel innovation in deep learning.
This led to the rise of Pachyderm, dubbed the modern Hadoop. It is a storage and analytics engine built on top of modern tools, and one of its biggest advantages is that it allows IT teams to leverage advances in open source infrastructure such as Docker and Kubernetes.
The shift to microservices, Kubernetes, containers and the cloud-native movement has amply demonstrated that factors like agility, security, manageability and portability will drive decision-making in IT. According to Jeff Meyerson, Kubernetes has now become the de facto standard way of deploying new distributed applications.
The Shift To Containers Realigned Hadoop Platform Players
- Organisations are now actively investing in Kubernetes-as-a-service
- Cloud vendors are now rallying behind containers by building out Kubernetes product lines and businesses, because there’s money to be made in the microservices and developer tools space even if the enterprise is not a cloud-native thought leader
- Some of the key companies driving container platforms, and earning a chunk of revenue from them, are Google, Microsoft, Red Hat and even IBM
- In an earlier article, we emphasised how a major rise in cloud computing, spanning storage, managed services and open source activity, upended the Hadoop market
Is Kubernetes The Future Of Big Data?
Kubernetes is pegged as one of the fastest-growing projects in open source and is believed to have surpassed the Hadoop ecosystem. It has also grown in terms of the number of vendors, projects and adoption by companies. Another great advantage of containers is that they allow for the development of an open, common infrastructure layer. This makes it possible to avoid vendor lock-in and allows enterprise IT teams to follow a lift-and-shift approach from one cloud provider to another.