
Will HarperDB Replace Hadoop In The Near Future?

There are plenty of options when you want to move to Big Data from traditional data-handling software such as the Relational Database Management Systems (RDBMS) offered by IBM and Oracle, among others, which lack the capability to process huge and complex datasets. The options are so numerous that settling on just one piece of software can be daunting.

Companies with specific database requirements, whether in the underlying technology or in analytical complexity, tend to go big when it comes to data management systems. Tech companies these days prefer NoSQL for transaction-related data and a SQL database to run business apps, then pick Hadoop or Spark for analytics, plus middleware to make all of these areas perform as a single entity. This means a lot of investment and a lot of skills on board.

SQL And NoSQL: The Distinction

At a time when Big Data is making waves, SQL and NoSQL databases are slowly falling behind. Still, they should not be ignored, as each has its own benefits. SQL offers high-speed data retrieval, little or no coding, and well-defined standards. NoSQL, on the other hand, offers enhanced scalability, support for object-oriented programming, and the ability to handle large amounts of raw, unstructured data, among other advantages.

Big Data relies on both of these separately. It all depends on what the company wishes to implement. If the company prefers a structured, standardised database, it can go ahead with SQL. If it prefers scalability and flexibility, it may choose NoSQL. Ultimately, it is the company’s approach to Big Data that matters the most.

HarperDB Is Here!

Enter HarperDB, a big data product from a company founded in 2017, which integrates the support and functionality of both SQL and NoSQL on a single platform, according to the company’s CEO, Stephen Goldberg. This dual approach provides NoSQL capabilities without sacrificing SQL features such as advanced math functions, SQL joins and multiple operators. Users can therefore focus mainly on their large datasets without worrying too much about programming languages.

HarperDB’s founders, Goldberg and Kyle Bernhardy, worked to build a single model that satisfies both SQL and NoSQL criteria, calling it an “exploded data model”. In this model, a JavaScript Object Notation (JSON) entity or a SQL query is integrated into an index table. This eliminates the need to assign foreign keys, which increase disk footprint. The result is a single table search that can be used to create SQL joins, among other functions.
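To make the idea concrete, here is a minimal Python sketch of how a JSON record could be "exploded" into one index entry per attribute and then searched or joined through that single index. It is an illustration under stated assumptions, not HarperDB's actual storage engine: the function names (`explode`, `search`, `dogs_of`), the sample tables and the in-memory dictionary are all hypothetical.

```python
from collections import defaultdict

def explode(table, records, key="id"):
    """Break each JSON-like record into one index entry per attribute.

    Hypothetical layout: (table, attribute, value) -> set of record ids.
    """
    index = defaultdict(set)
    for record in records:
        for attribute, value in record.items():
            index[(table, attribute, value)].add(record[key])
    return index

def search(index, table, attribute, value):
    """Look up matching record ids with a single index probe."""
    return sorted(index.get((table, attribute, value), set()))

def dogs_of(index, owner_name):
    """Join-like query: find owners by name, then dogs by owner_id."""
    result = []
    for owner_id in search(index, "owner", "name", owner_name):
        result.extend(search(index, "dog", "owner_id", owner_id))
    return sorted(result)

# Made-up sample data for illustration only.
dogs = [
    {"id": 1, "name": "Penny", "owner_id": 10},
    {"id": 2, "name": "Harper", "owner_id": 11},
    {"id": 3, "name": "Kato", "owner_id": 10},
]
owners = [{"id": 10, "name": "Asha"}, {"id": 11, "name": "Ben"}]

index = explode("dog", dogs)
index.update(explode("owner", owners))

print(search(index, "dog", "name", "Harper"))
print(dogs_of(index, "Asha"))
```

Because every attribute is already indexed, the join in `dogs_of` is just two lookups against the same structure. No foreign key is declared; `owner_id` is an ordinary attribute like any other.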

The highlights of the software product are listed below:

  • The software is written entirely in Node.js to keep its memory footprint small. This lets the product run on most devices, including the Raspberry Pi. Moreover, the product is designed to use minimal resources such as CPU and battery life. The company emphasises that users can focus on coding rather than on configuring storage or managing the database.
  • HarperDB exposes a representational state transfer (REST) application programming interface (API) for both SQL and NoSQL operations, and sample code can be obtained from the company’s website without any fuss. This shows that HarperDB stresses coding over fiddling with the database system.
  • Although the product is available for free, there is also an “Enterprise Edition” that supports additional features such as clustering, replication, and ODBC and JDBC driver support, among others, for a fee. It is aimed at organisations with advanced needs.
  • The product uses “Native Indexing”, where each attribute or entity is stored as a separate record, allowing all of them to be fully indexed without any additional overhead. This means entities can be accessed according to the categories defined by the user.
  • Another useful feature is vertical scaling (adding more power, such as CPU and RAM, to the existing machine): HarperDB automatically adjusts to the machine’s capability. The “Enterprise Edition” offers this along with clustering for data replication and data distribution in the form of a table.
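As a rough illustration of the REST API point above, the Python sketch below builds the JSON bodies a client might send to a single endpoint, one for a SQL statement and one for a NoSQL-style insert. The field names (`operation`, `sql`, `schema`, `table`, `records`) and the helper functions are assumptions made for this example; the vendor's documentation is the authority on the actual request format. No network call is made.

```python
import json

def sql_request(statement):
    """Body for a SQL-style operation (field names are assumed)."""
    return {"operation": "sql", "sql": statement}

def insert_request(schema, table, records):
    """Body for a NoSQL-style insert of JSON records (field names are assumed)."""
    return {
        "operation": "insert",
        "schema": schema,
        "table": table,
        "records": records,
    }

def to_http_body(payload):
    """Serialise a payload as it would travel in an HTTP POST body."""
    return json.dumps(payload, sort_keys=True)

# The same endpoint would receive either body; only the payload differs.
nosql_body = to_http_body(
    insert_request("dev", "dog", [{"id": 1, "name": "Penny"}])
)
sql_body = to_http_body(sql_request("SELECT name FROM dev.dog WHERE id = 1"))

print(nosql_body)
print(sql_body)
```

The point of the sketch is that SQL and NoSQL access differ only in the payload, so application code can mix both styles without switching drivers or connection types.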

The company also has an aggressive focus on the Internet of Things (IoT) and hybrid transactional/analytical processing (HTAP) to tap into more avenues of big data. The HarperDB team has strong expertise in database management. Although the company has only nine employees as of now, it plans to hire and expand. It has received $1.2 million in funding and aims to raise even more.

Conclusion:

HarperDB is a new entrant in the Big Data and analytics market, so it might take a while to establish a strong foothold in this area. With dominant Big Data platforms such as Apache’s Hadoop and Spark in play, it will need to plan diligently to make inroads into the analytics industry.


Abhishek Sharma

I research and cover latest happenings in data science. My fervent interests are in latest technology and humor/comedy (an odd combination!). When I'm not busy reading on these subjects, you'll find me watching movies or playing badminton.