The story of DagsHub: Interview with founders, Dean & Guy

DagsHub integrates best-of-breed tools for machine learning in a modular way, providing an end-to-end experience for collaborating on data, code, models, experiments, and pipelines.

Tel Aviv-based DagsHub was launched in 2019 by Dean Pleban (CEO) and Guy Smoilovsky (CTO). Dean has always been a builder at heart, and before building DagsHub, he worked on quantum optics and communication, computer vision, software development, and design. He loves taking a multi-disciplinary approach and applying it to build products for data scientists and machine learning engineers. Guy has always been a tech geek and has worked in software, information security, data engineering, DevOps, and machine learning.

The origin story

Dean and Guy have been friends since kindergarten. Both had experience working on machine learning projects, and Guy also had a lot of experience on the DevOps and data engineering side of things and a deep understanding of developer tools and workflows.

“We noticed something interesting: If you are just working with code, your life is pretty good – you have tools that help you build your project, work as a team, and deploy your work to production relatively easily and at scale. But when you throw data into the mix, things start to fall apart. You regress to sending chunks of your project over email or telling team members ‘don’t touch anything, and I’ll let you know when I’m done, and you can start working’ – similar to developer workflows before the era of Git. It seemed strange to us that that was the case. Since we both have an engineering mentality, we assumed that smarter people than us had this problem, and they had already solved it. So we went to find what solutions existed and realised that there were no solutions, at least not the way we imagined them. There is no standard workflow, and there’s no central hub like a GitHub for machine learning. We felt like we had enough to contribute from our experience and backgrounds to start building something, and that’s how the idea for DagsHub was born,” Dean said.

What’s DagsHub?

DagsHub integrates best-of-breed tools for machine learning in a modular way, providing an end-to-end experience for collaborating on data, code, models, experiments, and pipelines. Similar to how GitHub connects with Git and provides a collaborative experience and a central source of truth for code repositories, DagsHub integrates the popular open-source tools that data scientists use to manage different components of a data science project (such as DVC, MLflow, and Label Studio). It builds out the project context so that code, data, models, experiments, and pipelines are managed and visible from a single point.

“When we say collaboration, we mean community, but also production-oriented collaboration: Two people living on opposite sides of the globe should be able to work together via DagsHub – this is important to enabling open-source machine learning. But also, if you think about GitHub as a community, the projects that the community works on are useful in real production settings, so what we’re promoting with DagsHub is production-oriented and not just for example projects.”

Engage

As an entrepreneur, Guy said it is important to be surrounded by good, smart people – whether they’re your co-founders, employees, advisors, investors, or even customers. 

“One thing that comes to mind is that people are more accessible than you might imagine. Today with social media and the internet, you have an actual path to speak to really awesome people. One of the things we learned is that you can send people messages, and you explain to them why their time or advice can help you, and why it could be worth their time – I don’t mean this from a compensation perspective, but more like explaining that we are trying to do something amazing and your expertise or knowledge can make a meaningful dent here – people will be happy to engage with you,” Dean said.

Networking

“It is never too early to start building out your network. Building relationships is probably the most important thing you can do. You never know what the future holds and when a relationship will need your help or when they will be able to help you. So connecting with people who are either like-minded or have a completely different background and are working on interesting things is almost always a great idea,” Dean said.

“This might sound obvious, but I can’t count the number of times we reached out to amazing people that we were sure didn’t have time for us, and they ended up responding and spending a lot of their very precious time to help us. If you’re working on something you believe in, and you need concrete help from someone, don’t be afraid to reach out and ask. You’d be surprised how often they’ll respond and be helpful,” he added.

Role of data scientist

Data scientist is a catchall term. “If your job is to analyse business data to generate reports, you need a strong understanding of the business, statistics, data crunching methods like SQL, good visual and communication skills, etc. But it’s not so bad if you don’t know how neural networks work or the finer points of cloud infrastructure. If your job is to train and deploy deep learning models in self-driving cars, you need strong math skills, a strong understanding of how the computing and sensors in the car work, physics, efficient production-grade programming, etc. If you’re good at that, you can get by with the minimum level of business and communication skills necessary to be a functioning employee. You might even get by without a strong working knowledge of statistics,” Guy said.

Advice for entrepreneurs

  • At least one of the founders should have a ground-level understanding of the field. ML is not an easy subject to learn on the fly, and it will be extremely hard to outsource or hire for.
  • Find people who have done similar things and ask for advice. Grab threads and keep pulling on them to build a network. A good way to do this is to be accepted into a good accelerator. (How do you know which accelerator is good? Ask people!) Also, don’t latch on to every piece of advice from the first person who gives it. You have to use your intuition and keep talking to more people until you get a sense of who to listen to.
  • The most important thing is the people you surround yourself with, especially your connection to your co-founders – the top reason for startup failure is a fight between founders. If you or your partners are willing to sacrifice your relationship for a few (or many!) millions of dollars, you will most likely fight and break up before the company succeeds. Companies can survive these fights, especially if they’re experiencing amazing growth, but it’s rare and, in my opinion, not worth it.
  • Do your thinking and homework before asking people for advice. Many people will be happy to help, but not if you try to use them as an outsourced brain. The preparation will show through when you ask for advice, and this will help in getting the best people to want to help you.
  • Make sure any people you’re personally responsible for are safe and on board with what you’re doing. It would help if you had sufficient personal savings so that you don’t go into it hungry for cash since that will lead to suboptimal decision making and, paradoxically, make you less appealing to investors. Otherwise, have an extreme hustle ability and just become profitable fast, and then you both have money to survive and have an easier time raising money from investors. Without some minimal financial base, it’s very hard to have the ability to do this, though, of course, inspiring stories do exist.

Sreejani Bhattacharyya

I am a technology journalist at AIM. What gets me excited is deep-diving into new-age technologies and analysing how they impact us for the greater good. Reach me at sreejani.bhattacharyya@analyticsindiamag.com