“Kaggle competitions are probably the most efficient way to master the field of machine learning.”
It has been a year since Analytics India Magazine kicked off the Kaggle interview series. We have interviewed top Kagglers, who have been kind enough to share deep insights, tips, and tricks from their journey to the top. This week, we got in touch with Dr Christof Henkel, aka Dieter, a Data Scientist at NVIDIA who is currently ranked 4th on the Kaggle leaderboard. He has also earned the coveted 3x Grandmaster title within a very short period of time on Kaggle. In this interview, he provides a glimpse of what it takes to be at the top of the game.
How It All Began
Dr Christof holds a PhD in mathematics from the Ludwig Maximilians University in Munich, where he also completed his master's in Business Mathematics. As part of his PhD thesis, he worked on solutions that connect statistical mechanics to the financial markets. For this, Dr Christof created a framework for modelling agent interactions and deriving price processes of financial assets.
When asked about his introduction to the field of AI, Dr Christof expressed his fascination for the idea. He said, “I have always admired scientists working in that field. When I was in the final year of my PhD, I squeezed some spare time to learn new things. I started watching deep learning tutorials and then picked Python and jumped right into my first Kaggle competition. Of course, I did quite bad, but learned a tremendous amount of knowledge and realised that Kaggle competitions are probably the most efficient way to master the field of machine learning.”
“I am not a fan of books in the field of deep learning. They get outdated too quickly.”
Dr Christof’s self-learning routine consisted of a few high-level YouTube videos on neural networks followed by the popular Andrew Ng lectures on Coursera. “At some point, I also watched Jeremy Howard’s Fast.ai course videos which cover all the topics of deep learning in an understandable way. I am not a fan of books in the field of deep learning. They get outdated too quickly. Doing a single Kaggle competition teaches you more than any book ever could. Currently, I mostly rely on research papers to get inspiration for new ideas,” he explained.
He is currently part of KGMON, which stands for the Kaggle Grandmasters of NVIDIA, a team of top Kagglers. As part of this team, he leverages his data science acumen to top the Kaggle competitions using NVIDIA’s tool stack. In a way, he and his colleagues provide feedback to NVIDIA’s software stack by using it in Kaggle.
On Kaggle Days
“I not only never used Python but also lacked software development skills in general. I also did not have much computational resources.”
Dr Christof is currently ranked 4th on the Kaggle leaderboard. His accomplishments might seem overwhelming today, but his beginnings, like those of most aspirants, were humble. First, he had to learn Python; his experience with R from his doctoral days gave him the much-needed head start.
But there is more to becoming a good data scientist. To get a comprehensive understanding of deep learning, Dr Christof jumped straight into Kaggle competitions. He struggled at first, and was blown away by the level at which top data scientists operate on Kaggle. At the same time, he was inspired by the quality of the ideas and code being shared on the Kaggle discussion forums. “Not only does that create a sense of community where people are solving a problem together, but it is also the most effective way to learn for beginners, advanced and even professional machine learning engineers,” said Dr Christof.
When asked about his process of learning to solve problems, Dr Christof spoke about his 5-step guide to tackle any Kaggle competition:
- First, conduct a very simple data exploration to get a rough idea of the data and the problem at hand; just enough to understand what good cross-validation should look like.
- In parallel, build a very simple first model and check whether the local validation score correlates well with the competition leaderboard score.
- If the correlation is not satisfactory, iterate as long as needed to understand the possible discrepancies and account for them.
- Dr Christof believes in spending the rest of the competition exploring ideas inspired by research papers, Kaggle discussions or Kernels.
- During the last week of competition, he concentrates on model ensembling and checking the robustness of his solution.
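The second and third steps can be made concrete with a small check. The following is a minimal sketch, not Dr Christof's own code: it assumes you have recorded the local cross-validation score and the public leaderboard score for a handful of submissions, and it computes their Pearson correlation. The scores and the helper name `pearson` are hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five submissions (illustrative values only).
local_cv = [0.912, 0.918, 0.925, 0.931, 0.934]
public_lb = [0.905, 0.913, 0.917, 0.926, 0.928]

r = pearson(local_cv, public_lb)
print(f"CV vs leaderboard correlation: {r:.3f}")
```

A correlation close to 1 suggests that improvements measured locally are likely to carry over to the leaderboard; a weak one is the signal to revisit the validation split before exploring further ideas.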
“It’s always helpful to use a scientific approach. Quite similar to how physicists or chemists conduct lab experiments. Clean tracking of experiments would help in understanding what you would expect from those. Why do you think the results met your expectations or find the rationale behind why it’s different,” advised Dr Christof.
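One lightweight way to follow this advice is to write down the expectation before each run and the outcome after it. The sketch below is an assumption about what such tracking could look like, not a description of Dr Christof's setup (he mentions using neptune.ai for logging). The `Experiment` fields, the file name and the example entries are all hypothetical.

```python
import json
import tempfile
from dataclasses import dataclass, asdict
from pathlib import Path
from typing import List


@dataclass
class Experiment:
    name: str
    hypothesis: str      # what you expect to happen, written before the run
    baseline_cv: float   # local validation score before the change
    cv_score: float      # local validation score after the change
    notes: str = ""      # why the result met or missed the expectation


def log_experiment(path: Path, exp: Experiment) -> None:
    """Append one experiment, plus its score change, as a JSON line."""
    record = asdict(exp)
    record["delta"] = round(exp.cv_score - exp.baseline_cv, 6)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def load_experiments(path: Path) -> List[dict]:
    """Read all logged experiments back as dictionaries."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical entries, written to a temporary directory for the demo.
log_path = Path(tempfile.mkdtemp()) / "experiments.jsonl"
log_experiment(log_path, Experiment(
    "stronger_augmentation", "should reduce overfitting", 0.950, 0.957,
    "matched expectation"))
log_experiment(log_path, Experiment(
    "larger_images", "more detail should help", 0.957, 0.956,
    "no gain; look for another explanation"))

for row in load_experiments(log_path):
    print(row["name"], row["delta"], row["notes"])
```

Keeping the hypothesis next to the measured change makes it easy to review later which expectations held and which did not.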
Tools, Tips & Tricks Of A Grandmaster
“All my work is done in Python. For small competitions, where a lot of data exploration is needed, I prefer to use Jupyter notebooks.”
When it comes to frameworks, Dr Christof started his journey with TensorFlow and switched to Keras once it was released. He now uses PyTorch to train deep learning models because of its flexibility and stability. “It’s also easier to use in a multi-GPU setup which becomes more and more relevant to Kaggle competitions. As soon as I want/need to accelerate computations when having tabular data at hand, I use RAPIDS, which provides Sklearn or pandas like interface but is running on GPU. During the years, I also tried a lot of auxiliary tools for logging, data storage, etc. Currently, I use neptune.ai for logging and AWS for data storage, but that might change quickly,” he revealed.
Dr Christof’s workstation is powered by two deep learning units. One has 3x NVIDIA RTX 2080Ti, and the other one is a DGX Station with 4x V100, which was given to him by NVIDIA. “This is more than enough for a good placement in a Kaggle competition. I got very good results with my former machine which had 2x GTX 1080Ti,” he added.
“I prefer having my own workstation to have all my code and data locally. But lately, I also started to use cloud solutions more and more to scale short-timed demands.”
For more compute-intensive competitions, Dr Christof prefers scripts, which, according to him, make it possible to automate steps like hyperparameter tuning or deployment for inference. He also underlined the significance of GPU acceleration when one needs to iterate quickly through ideas. Though he prefers to keep his code and data locally, he said that he has started to use cloud solutions more and more to handle short-term spikes in demand.
To illustrate his problem-solving approach, Dr Christof picked one of his favourite competitions, the Bengali handwritten grapheme classification, as an example. Given an image of a handwritten grapheme, the competition required participants to classify it into three components: grapheme root, vowel diacritic and consonant diacritic.
“I started with resizing the given images to a small size of 64×64 to iterate more quickly through ideas in the first half of the competition. So I built a resnet18 baseline and replicated the competition metric to check if my local validation score matches the leaderboard score, which it roughly did. Next, I moved through different augmentation methods and model architectures and settled for one that worked sufficiently well. Although I used a larger image scale of 128×128 in the next step, my score was stagnating. So I read a lot about the Bengali writing system and understood special cases of the language and subtleties. That enabled me to improve my solution and finish on a top spot as a solo competitor,” he explained.
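Replicating the competition metric locally, as described above, could look roughly like the sketch below. It rests on an assumption the interview does not state: that the metric is a weighted average of the macro-averaged recall of the three components, with the grapheme root weighted twice as much as each diacritic. Check the competition page before relying on those weights. The function names and the toy labels are hypothetical.

```python
def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    recalls = []
    for cls in sorted(set(y_true)):
        positions = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in positions if y_pred[i] == cls)
        recalls.append(hits / len(positions))
    return sum(recalls) / len(recalls)


# ASSUMPTION: 2:1:1 weighting of the three components.
ASSUMED_WEIGHTS = {"root": 2.0, "vowel": 1.0, "consonant": 1.0}


def weighted_component_recall(true_labels, pred_labels, weights=ASSUMED_WEIGHTS):
    """Weighted average of macro recall across the label components."""
    total = sum(weights.values())
    score = sum(
        w * macro_recall(true_labels[name], pred_labels[name])
        for name, w in weights.items()
    )
    return score / total


# Toy labels for four images (illustrative values only).
true_labels = {
    "root": [0, 0, 1, 1],
    "vowel": [0, 1, 2, 2],
    "consonant": [0, 0, 1, 1],
}
pred_labels = {
    "root": [0, 1, 1, 1],
    "vowel": [0, 1, 2, 2],
    "consonant": [0, 0, 0, 0],
}

print(weighted_component_recall(true_labels, pred_labels))
```

Computing the same number locally that the leaderboard reports is what makes the comparison between validation and leaderboard scores meaningful.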
Check the full solution here.
Few Words For The Beginners
“Having less resources forces you to think more.”
Talking about the future of ML, Dr Christof explained why supervised learning as a concept, to fit a function to data for predicting a target, seems quite appealing. “It’s so flexible and modular that you can solve arbitrary problems with that. Especially problems where you have structures like language, audio or vision benefit from the flexibility of deep learning. You can see already that deep learning is way ahead compared to classical statistical methods.”
That being said, he also emphasised the need for a good knowledge of statistics and linear algebra, both for machine learning and for the everyday work of a data scientist. “It’s important to understand concepts like distributions, randomness, matrix multiplication or probabilities to explore and understand data and make meaningful predictions. Advanced topics like, for example, stochastic processes or calculus are not necessary but helpful for understanding loss and metric dynamics while training your models,” he explained.
People claim that machine learning, especially deep learning, is a black box, and one cannot understand how a model reaches its conclusions. They also demand that models should have near-perfect accuracy. But, according to Dr Christof, the usefulness of a model or algorithm should be evaluated by comparing it to human-level performance.
The hype around AI can be overwhelming for newcomers, so Dr Christof suggests that they dive into Kaggle competitions for exposure. “Doing various Kaggle competitions certainly comes in handy,” he said. “Because you have problems from a variety of different domains and need to be quite flexible in using (and sometimes adjusting) various Python libraries and dig deep into the related source code. So you also get to know how more professional developers organise and set up their code, and you can learn from that.”
Dr Christof likened ML to fields like sports or music: to be good at it, one needs to be very passionate about the work, or at least passionate enough to stay motivated, work hard and keep improving. “I try to work on my deficits, and that’s how I learn the most. Hence I try to enter new competitions with domains that are outside my comfort zone, like reinforcement learning. I pick up new frameworks or tools. This keeps me motivated,” concluded Dr Christof.