You need a unique blend of technical skills, mathematical expertise, storytelling, and insight to extract meaningful commercial value from data, said Kaggle Notebooks Grandmaster Karnika Kapoor. A BTech graduate from Kurukshetra University, she believes that the scope for improvement is what keeps life exciting.
Analytics India Magazine got in touch with Karnika to understand her Kaggle Grandmaster journey and how she made the best of the pandemic-induced lockdowns to upskill herself.
AIM: Tell us how you got started with coding.
Karnika: I have a background in computer-aided design and drafting and computer-aided engineering. However, in terms of programming languages, I first picked up a bit of C syntax at university while pursuing a BTech in mechanical engineering. In my opinion, learning to code is always a good idea because it will only become more relevant. Coding is also a good mental exercise that enhances critical thinking and analytical skills. It relies on problem-solving and is quite enjoyable.
When I decided to learn machine learning around two years ago, I was torn between R and Python. I picked Python to begin with as it seemed more versatile, though I am looking forward to learning R as well. I started by exploring several online Python learning platforms. That said, my introduction to machine learning and my initial projects were in Octave.
AIM: How did your first job add value to your career?
Karnika: My first job was at an Autodesk value-added reseller as a technical resource for mechanical software. It was a great starting point since it gave me access to various CAE and CAD tools. I was responsible for mechanical CAD suite training. Providing training helped me a lot, as teaching always enhances expertise. The job allowed me to grow and learn a great deal about mechanical design engineering. Working with Autodesk provided first-hand experience with cutting-edge technologies, and I developed a great deal of confidence in my abilities.
AIM: What made you interested in Kaggle?
Karnika: Andrew Ng’s lectures on machine learning were my primary source of information. That course was taught in Octave/MATLAB and was quite useful for understanding the subject matter. I stumbled upon Kaggle and discovered that I had already created an account there a year earlier while exploring resources to learn data science.
I began my exploration of the platform with the Kaggle courses. It was a pleasure to read so many great notebooks. I made it a point to refer to a few notebooks every day, and reading them helped me considerably. The Kaggle community is amazing, and the people on the platform share their thoughts and resources with others.
AIM: What was your first Kaggle competition like?
Karnika: My first competition on Kaggle was an in-class competition. I was a bit daunted at the beginning. I took a stab at the Titanic competition and gradually increased my score by a few points but got stuck at 0.777 (the accuracy score on the leaderboard). Beyond this score, all my approaches to increase accuracy were most likely overfitting. That was a year ago. Now I would take a slightly different path for that competition. On top of what I did back then, I would try out a few more feature engineering techniques. I would also consider an ensemble model.
I hope to explore and try more competitions on Kaggle. The most encouraging aspect of simply trying is that it reminds you that there is still room for improvement.
AIM: How did it feel when you became a Kaggle Grandmaster?
Karnika: I was a little hesitant when I initially started sharing my work on Kaggle. My first project on Kaggle was on PCOS diagnosis. I never imagined I’d be a Notebooks Grandmaster in a year. I was trying to learn something new in each of my projects. For the published notebooks, I kept them simple and clean.
I thoroughly enjoyed the entire learning process, and it never seemed like I was “working”. However, in retrospect, I realise I have spent a lot of effort and time acquiring new skills. Because I had so much fun doing it, all the hard work didn’t seem like heavy lifting. During the lockdowns, I had time on my hands to learn a lot. When I published my 15th notebook, many of my fellow Kagglers congratulated me in advance.
AIM: What is your advice for beginners in Kaggle?
Karnika: My top recommendation is to go through the Kaggle courses. They are well designed and help in beginning the Kaggle journey. They also provide the knowledge to get ahead in the field. Secondly, I’d say focus on learning and mastering your skills. I’d recommend reading other people’s notebooks. I used to go through many notebooks initially, which was quite beneficial. It might be overwhelming at the start. When I started, I couldn’t fully comprehend some of the notebooks, but as I learned more, most of it made sense.
Lastly, there are many ways of doing the same thing. So, when you look at other people’s work, think about how you would have done it.
AIM: What excites you about blockchain and the crypto industry?
Karnika: I believe that blockchain technology represents the beginning of a paradigm change in how the world operates. The most exciting thing about blockchain technology is that it has the potential to secure a privacy-driven future. The technology is not limited to crypto and has wide applications in supply chain transparency, smart contracts, and so on. The use-cases range from medical records to real-estate records and art to music. Blockchain makes everything transparent and decentralised. When it comes to the internet of the future, Web 3.0 is a very promising blockchain application.
AIM: What makes a good data scientist?
Karnika: Enthusiasm to learn about new subjects: Each problem statement or project in data science might come from a completely different field. The interest in learning more about a subject is essential. If one has an innate curiosity, they have the potential to be a good data scientist.
Analytical aptitude is necessary: Critical thinking and problem-solving are imperative to understand the business problem. How someone perceives a problem is essential for finding the solution. Data analysis can provide a plethora of information, but identifying what is relevant to the problem at hand is paramount. In addition, the interpretation of data is vital. For example, the association between various features sometimes appears straightforward until you dig a bit further and discover that the earlier assumption collapses.
Good understanding of mathematics is a prerequisite: Understanding maths is essential to understanding how the model works. Linear algebra, statistics, and calculus are the core subjects that drive data science. It is feasible to implement a model without understanding the underlying maths. You can do so by importing libraries and fitting the model to get the result. Nonetheless, this approach does not accommodate a variety of problems that require an insight into the model.
Working knowledge of coding and a good grasp of ML algorithms are essential. Communication skills are just as significant: a data scientist must be able to communicate with various partners. Storytelling is the final and most crucial step of a data-driven project’s pipeline. Adaptability and upskilling are also crucial.