Last updated September 20, 2020

Qubole – Next Generation Cloud Data Platform

Published on July 22, 2012
by Bhasker Gupta

With the proliferation of applications and end devices (web, mobile, sensors etc.), the volume of data collected by organizations has grown explosively over the last few years, and that trend looks set to continue for the foreseeable future. At the same time, the variety of data keeps growing: it ranges from structured data originating in application databases to semi-structured content originating in social media, web properties such as Wikipedia, and internal applications such as email systems and company support boards. Additionally, the systems and software stacks (most notably Apache Hadoop) that are able to keep up with this growth, in both variety and volume, are still complex to operate and far from perfect. They are still in the early stages of development and market adoption. As a result, it comes as no surprise that many organizations struggle to operate and optimize their data infrastructure and to make it serve their data processing needs.

Qubole Vision

The complete data infrastructure solution has many components. The main ones are as follows:

Data Collection Service for both real time and bulk upload of data from different data sources such as applications, databases, web crawls etc.
Batch Computation Service such as Hadoop/Hive to process this data and transform it from data to information.
Real Time Computation Service to generate real time results on data streams and data captures for time sensitive and actionable reporting and monitoring.
Ad Hoc Query Service to answer one-off queries, sometimes exactly and other times approximately, in a short amount of time.
Tools and Frameworks for job dependencies, data and query discovery, SLA and monitoring etc.

Qubole (www.qubole.com) aims to provide all of the above components (and some more) in the cloud. We want to provide fast, easy and reliable access to all the services mentioned above so that our clients can focus on their data and their algorithms while we take care of optimizing, operating and evolving the data infrastructure for them. We want to enable data engineers, data scientists and data analysts to work with their data and build data-driven applications, whether these are simple reporting applications or more complex targeting or recommendation applications.

In pursuit of this vision, our first offering is an Ad Hoc Query and Batch Computation Service in the cloud. This service provides Apache Hive and Apache Hadoop as a service, with close integration with Apache Oozie. It is ideal for data stored in S3 on which you want to run ad hoc analysis and build data pipelines. This service is currently available as part of an early access program. We are working with a select set of companies in this program, and we will be making the service available to everyone by Q4 2012. The details of this program and the service are in the subsequent sections of this white paper.

Qubole Team Background

Qubole was started by data infrastructure veterans from Facebook who conceived, built, managed and operated the infrastructure on which almost all of Facebook's backend data processing runs. The co-founders of the company (Ashish and Joydeep) are the co-creators of the Apache Hive project, a very prominent platform built on top of Apache Hadoop. Under their guidance, the Hadoop and Hive clusters at Facebook grew from managing 80TB of data in late 2007 to 20PB of compressed data in late 2011. The Qubole team comprises talented engineers who have worked on and delivered strong products at companies like Oracle, NetApp and Yahoo. Qubole raised money from Lightspeed Ventures and Charles River Ventures, two well-known VC firms in the Valley. We are seeking organizations to try out an early beta version of our service.

Bhasker Gupta

Bhasker is a techie turned media entrepreneur. He started AIM in 2012 out of a desire to speak about emerging technologies and their commercial, social and cultural impact. Earlier, Bhasker worked as Vice President at Goldman Sachs. He holds a B.Tech from the Indian Institute of Technology, Varanasi and an MBA from the Indian Institute of Management, Lucknow.
