Are Databricks and Snowflake Ferraris in a Toyota World?

Companies looking to cut costs shouldn’t rely blindly on open source solutions either. It’s a balancing act

When choosing data architecture solutions, companies frequently fall into the trap of paying exorbitant prices for services they don’t need. A recent blog post by Kieran Healey points out that companies like Databricks and Snowflake are offering Ferraris when many companies could do their work with a Toyota.

Databricks and Snowflake are undoubtedly robust platforms that offer impressive capabilities. Snowflake’s partnership with NVIDIA and Databricks’ integration with the Spark Human API showcased their technical prowess and added to their appeal. Yet such features often serve as marketing tactics rather than essential solutions, and companies end up paying for them instead of using open source alternatives.

For example, instead of paying for an LLM-based chatbot, most companies could meet their needs with a simpler, more cost-efficient solution such as a “press 1 to choose this option” menu. To address data-related challenges without overspending, companies should adopt an anti-hype mindset.

A Databricks employee suggested on Hacker News that although companies might be able to create their own Spark deployment, it will run much slower than it does on Databricks and its proprietary runtime. He added that a lot of businesses have other problems to solve, and that focusing on building DIY platforms is a horrible approach.

Interestingly, none of this matters for a company that has only gigabytes of data, since it can use pretty much any tool cheaply and easily. The question is relevant only to companies that have terabytes or hundreds of terabytes of data.

Open source vs commercial solutions

On the other hand, it is just as easy to jump to open source solutions, given how cost-effective they are made out to be. One side of the debate emphasises their financial advantage: supporters point out that open source software is often free to use and suggest that the cost savings alone make it a compelling choice.

However, it is essential to point out that while the software itself may be free, deploying, maintaining, and expertly managing open source solutions can incur significant costs. Paying skilled professionals to ensure proper deployment and upkeep can strain both time and resources.

“Open source it may be. Free it is not. Paying an expert to correctly deploy an open source solution takes time and money,” said another user. This argument underscores the idea that simply adopting open source software isn’t a guaranteed money-saving solution without proper expertise and management.

On the opposite side, commercial solutions such as Databricks and Snowflake might come with upfront costs, but offer comprehensive support, integration, and scalability that can be invaluable. These solutions often package features, support, and maintenance into a single offering, reducing the need for extensive in-house expertise. Furthermore, commercial solutions can provide a level of assurance and accountability that can be lacking in open source alternatives.

Paying for a commercial platform, though, changes the parameters of the problem. Treating the choice as a question of price alone is a fundamental misunderstanding of how to get things done in a constrained environment. This viewpoint highlights the notion that the trade-off between open source and commercial solutions is about more than just cost: it’s about shifting the focus from technical challenges to non-technical ones.

It’s a bit like cloud computing: no company strictly needs a cloud provider, but having one definitely helps it focus on better things than building its own data centre.

The Anti-Hype Approach

In the debate over data platform choices, context and expertise play pivotal roles. While open source solutions can be powerful tools when implemented correctly, they require a skilled team to navigate potential challenges. Conversely, commercial solutions can mitigate many technical complexities, enabling organisations to concentrate on their core business goals. However, this often involves a trade-off between flexibility and vendor lock-in.

Ultimately, there is no one-size-fits-all answer to the open source vs commercial debate in the context of data platforms. The decision depends on the unique circumstances of each organisation—its budget, existing expertise, scalability requirements, and risk tolerance.

At a time when CEOs are being pushed from all sides to talk about generative AI, it is easy to fall into the trap of overspending on over-engineered solutions. It’s essential to scrutinise whether a technology is actually applicable. Instead of chasing novel technologies, companies should adhere to the age-old principle of delivering tangible returns on investment. CEOs are always looking for solutions that not only enhance the business but also generate profits.

Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.