Janio Martinez Bachmann’s life can be summed up in one phrase: “Mama ho haw ho, wow wow!” The Kaggle Grandmaster, who loves to play Mario on Nintendo in his spare time, works as a data analyst at Voodoo.io. Janio is from the Dominican Republic and has a postgraduate degree in Financial Planning from Humber College, Canada.
In an exclusive interview with Analytics India Magazine, the financial analyst turned data analyst shared his story of becoming a Kaggle Grandmaster.
AIM: How did your fascination with algorithms begin?
Janio Martinez Bachmann: Most of my experience comes from the financial industry. I used to work at a credit bureau in the Dominican Republic. I was highly dependent on tools such as Excel for my day-to-day tasks. Though I did enjoy my job, I always asked myself if there was a more efficient way of doing these repetitive tasks. So, I started digging into programming languages such as Python and bought a book, Hands-On Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron, that changed the way I think about algorithms and data science.
The book taught me how different algorithms work, such as linear regression, decision trees, unsupervised models (clustering) and more. When I started reading this book, data science was not so hyped. I was not sure what I was getting myself into. However, I loved the mechanics of how different models worked and how you could solve business problems by using them. This was something that fascinated me.
AIM: What were the initial challenges, and how did you address them?
Janio Martinez Bachmann: I must be honest. I was not a maths guru, neither in high school nor in college. One of the toughest challenges I had was understanding how models work. It felt like getting into a black room without a light bulb.
However, my curiosity about different algorithms pushed me to understand how these black-box algorithms functioned. So, I started following many YouTube channels; Joshua Starmer’s is one of my favourites.
I remember doing an exercise about how a DNN (Deep Neural Network) came up with a specific output. I had to do both forward and backward propagation on paper, applying calculus concepts I had learned on the internet.
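For readers who want to try a similar pen-and-paper exercise, here is a minimal sketch of forward and backward propagation in plain Python. It is not the exercise Janio describes; the network (one input, one sigmoid hidden unit, one linear output), the squared-error loss, and all starting values are assumptions chosen only to keep the arithmetic small enough to check by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, params):
    """Forward pass: input -> sigmoid hidden unit -> linear output."""
    w1, b1, w2, b2 = params
    z1 = w1 * x + b1
    a1 = sigmoid(z1)
    y_hat = w2 * a1 + b2
    return z1, a1, y_hat

def loss(x, y, params):
    """Squared-error loss for a single training example."""
    _, _, y_hat = forward(x, params)
    return 0.5 * (y_hat - y) ** 2

def backward(x, y, params):
    """Backward pass: apply the chain rule from the loss to each parameter."""
    w1, b1, w2, b2 = params
    _, a1, y_hat = forward(x, params)
    d_yhat = y_hat - y               # dL/dy_hat
    d_w2 = d_yhat * a1               # dL/dw2
    d_b2 = d_yhat                    # dL/db2
    d_a1 = d_yhat * w2               # dL/da1
    d_z1 = d_a1 * a1 * (1.0 - a1)    # sigmoid'(z1) = a1 * (1 - a1)
    d_w1 = d_z1 * x                  # dL/dw1
    d_b1 = d_z1                      # dL/db1
    return [d_w1, d_b1, d_w2, d_b2]

def numeric_grad(x, y, params, eps=1e-6):
    """Central finite differences, used only to double-check backward()."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        grads.append((loss(x, y, up) - loss(x, y, down)) / (2 * eps))
    return grads

def train(x, y, params, lr=0.1, steps=200):
    """Plain gradient descent on one example."""
    params = list(params)
    for _ in range(steps):
        grads = backward(x, y, params)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

if __name__ == "__main__":
    start = [0.5, -0.1, 0.8, 0.2]    # made-up values for w1, b1, w2, b2
    print("analytic gradients:", backward(1.5, 1.0, start))
    print("numeric gradients: ", numeric_grad(1.5, 1.0, start))
    print("loss before:", loss(1.5, 1.0, start))
    print("loss after: ", loss(1.5, 1.0, train(1.5, 1.0, start)))
```

Comparing the analytic gradients with the finite-difference ones is a quick way to confirm that the chain-rule steps worked out on paper are correct.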
You don’t need to have a PhD to know all these things. Curiosity is good enough. My advice to beginners would be to enjoy the ride and not get intimidated by all the terminology; all these concepts can be learned from the internet.
AIM: What about coding excites you the most?
Janio Martinez Bachmann: What I most enjoy about coding is that you have endless possibilities for getting your work done. As a data analyst, I am constantly challenged to find insights that will allow my employer to leverage opportunities in the market. But how could I provide insights when working with massive amounts of data? The beauty of coding and open source packages comes into play here. The ability to code is like having superpowers! The possibilities are endless as to how to tackle a problem when you know how to code using different tools! This is what I most enjoy about coding: the creativity it brings and the efficiency in solving day-to-day problems.
AIM: How do you get into the zone?
Janio Martinez Bachmann: Believe it or not, I initially struggled a lot to get into the zone. Nowadays, distractions come from all angles, and it’s hard not to get distracted. However, when you need to pay attention to details (common when coding), it is critical to be in a focused state of mind. So, what does my routine look like?
First, I hide my phone far away from my desk to get into the zone. Why do I do that? My mobile is my main source of distraction since I get constant notifications on it, and the closer I have my phone, the more tempted I will be to see what each notification is. So, to avoid that temptation, I usually put my phone somewhere hard to reach from my desk.
I am an early bird. The first thing I do is to prepare my daily task list. This gives you a better perspective on what things you should accomplish during the day, giving you a better sense of direction. There is nothing worse for me than starting the day without knowing what I will do. I will feel completely lost. Once I have my task list prepared, I feel like I have a sense of purpose during the day. My daily task list would be the first step before getting into focus mode.
AIM: What does your ML toolkit look like?
Janio Martinez Bachmann: The tools I use most often include:
- SQL (Structured Query Language): I mainly use this to extract all the necessary data directly from the database. I also perform the transformations needed so that the data can be analysed afterwards or displayed through a BI tool.
- Tableau: Talking of BI tools, this is the dashboarding tool I currently use to display all the necessary insights to stakeholders. There are other platforms such as Power BI, Looker, QlikView etc.
- R & RStudio: I mainly use R for performing statistical analysis and A/B testing, but it has other functionalities such as data transformation, visualisation and many more.
- Python: I mostly use Python to automate repetitive processes.
- Shiny Web Apps: I use them as a sort of dashboard. The only difference is that they offer more flexibility to integrate machine learning models into the web application.
- dbt (data build tool): It’s the latest tool I’m learning, but this will be a game-changer, and I would say it will be a must to learn in the foreseeable future. It’s a tool that uses software engineering principles to transform, test and document all your tables. I currently use this tool together with Redshift.
- Git: This is a tool that anyone will eventually need to learn since, in most organisations, you will need to work collaboratively on your code. By knowing Git commands, you will be able to work with GitHub, GitLab, Bitbucket and many more collaborative tools.
AIM: How should one prepare for a first hackathon?
Janio Martinez Bachmann: In the hackathons that I participated in, I have mainly used Python to solve problems. So, my suggestion would be to start from there since it’s the most common language I have seen being used in hackathons. In terms of libraries for machine learning, I recommend learning the basics of Pandas, Matplotlib and Scikit-Learn, as well as concepts such as loops, to have more flexibility when manipulating data.
AIM: What’s your biggest pet peeve about hackathons?
Janio Martinez Bachmann: I’ll be honest, when I did my first hackathon, one of my main challenges was collaborating with others. It’s not like I don’t enjoy collaborating with others. I tend to get nervous when I must code next to a person. Have you ever blacked out when you must show your code or work on a screen? Well, something like this happened to me in my first hackathon.
However, we should keep in mind that none of us are born coders. So, my advice would be, don’t be afraid to participate in hackathons. See this as an opportunity to learn from more experienced folks in the field.
AIM: What’s the worst experience you’ve had as a coder?
Janio Martinez Bachmann: As an analyst, I constantly interact with stakeholders to visualise what they want. One of the worst experiences of working as an analyst is dealing with a stakeholder who asks you for something but does not know at all what they want. In a work environment, this can be demotivating since you feel like you must somehow guess what that person wants. Fortunately, there are techniques to deal with these situations, and the one I would suggest is to ask questions constantly. By asking questions, you will be able to define the problem, which will allow you to work out how to tackle a specific problem or request.
Another not so nice experience I’ve had was when I saw a project through to the end for a large number of stakeholders, and only a few of them used it. It has been demotivating and frustrating because some stakeholders ask for things with a sense of urgency, making you feel that they really need them. However, only a few find the end project useful when it is complete. This has happened to me a few times, especially when building dashboards. To counter this, I will go back and ask questions! And most importantly, ask whether this project is necessary and how it will impact the organisation.
AIM: What drew you to Kaggle? How has your journey been so far?
Janio Martinez Bachmann: I heard about Kaggle when I started reading “Hands-On Machine Learning with Scikit-Learn and TensorFlow” by Aurélien Géron. Kaggle was mentioned in the first few pages. I was curious to see what this website was and when I saw it for the first time, I was fascinated with it! Why? Being a beginner in coding, this platform was perfect for applying the theory I was learning from reading books. There is nothing better than learning to code while exploring some datasets and getting the story from a specific table.
The data-storytelling part was one of the things that drove me to Kaggle and, most importantly, the amazing community that is out there to help you. Learning from the notebooks of talented individuals allowed me to improve my coding skills and learn different machine learning concepts. As for my journey, I have to say it has been tough, but worthwhile. I have been off Kaggle lately, mainly due to my current job. But I plan to contribute to Kaggle to help the community.
AIM: What was your first Kaggle competition like?
Janio Martinez Bachmann: As far as I can remember, the first competition I participated in was predicting housing prices. It was an interesting competition because it was the first time I heard about feature engineering (a concept in which we extract insightful features to enhance the predictive capabilities of our models). Also, this competition allowed me to learn interesting advanced linear regression concepts that I had never heard of before. As you can guess, I did poorly in this competition as it was my first one. But I learned a lot, and that’s what matters! So don’t be afraid to participate in competitions; they can be fun!
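To make the idea of feature engineering concrete, here is a small illustrative sketch in plain Python. It is not taken from Janio’s competition entry; the column names (year_built, year_remodeled, year_sold and the three area fields) and the sample record are hypothetical, chosen to resemble the kind of fields a housing dataset might contain.

```python
def add_features(house):
    """Return a copy of a house record with a few derived features added.

    All field names are hypothetical and only illustrate the idea.
    """
    enriched = dict(house)  # copy so the original record stays untouched
    # Age of the house at the time of sale.
    enriched["house_age"] = house["year_sold"] - house["year_built"]
    # One combined size feature instead of three separate area columns.
    enriched["total_area"] = (
        house["first_floor_sqft"]
        + house["second_floor_sqft"]
        + house["basement_sqft"]
    )
    # Binary flag: 1 if the house was remodelled after it was built, else 0.
    enriched["was_remodeled"] = int(house["year_remodeled"] > house["year_built"])
    return enriched

if __name__ == "__main__":
    # Made-up record, for illustration only.
    sample = {
        "year_built": 1990,
        "year_remodeled": 2005,
        "year_sold": 2010,
        "first_floor_sqft": 900,
        "second_floor_sqft": 600,
        "basement_sqft": 500,
    }
    print(add_features(sample))
```

Derived columns like these can then be fed to a regression model alongside the raw ones; whether they actually improve predictions has to be checked on the data at hand.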
AIM: What was it like to become a Kaggle Grandmaster?
Janio Martinez Bachmann: I was in a state of shock. I remember I was on vacation in the Dominican Republic in March 2021. I was lying on the beach, and I received the notification from Kaggle that I had become a Kaggle Grandmaster. I couldn’t believe it, but I was happy about it at the same time! After four years of dedication, I became a Kaggle Grandmaster. This does not mean you need to wait four years to become one. I’ve seen other Kagglers become Grandmasters in as little as two years. Nevertheless, I was full of joy when receiving the news from Kaggle!
AIM: Tips to ace Kaggle competitions.
Janio Martinez Bachmann: Here are my tips for moving to the top in Kaggle:
- Creating content: When I say creating content, I mean exploring datasets that only a few have explored and that you think would be attractive to the community. One example is a topic I explored back in the day: dealing with imbalanced classification. Back then, this topic was not “happening” in Kaggle, so I decided to take this opportunity and create a notebook revolving around “Credit Fraud || Dealing with Imbalanced Datasets”. It took me three months to create this notebook, but it was worthwhile, and currently, it has almost 4k likes.
- Participating in discussions: If you want to promote your brand in the Kaggle community, I would suggest participating in the discussion section mainly for two reasons. You will get to know other Kagglers through many discussion topics, and you will learn with them in all these discussions. It’s a great way to let yourself be known in the community.
- Respect the community: By this, I mean behaving ethically across the community. I have seen some unethical behaviour, such as promoting your own notebook across different notebooks so that people like yours. I would suggest not doing this even if you feel tempted to. One, other users will not like it when someone directly requests this, and two, it might seem a bit unprofessional, which will ruin your reputation. That’s why it is important to create content but, most importantly, enjoy the ride! It does not matter if you are Grandmaster or Master; what is important is that you are learning many interesting topics across an engaging community such as Kaggle! Be patient!