Recently, I was offered a job as a Data Scientist at a real estate startup. Getting a data science job is never a straightforward process. Let me tell you how I reached this milestone, and what really happens behind the scenes on the way to becoming a data scientist. I would not tell you to follow the exact path I took; I encountered a lot of good and not-so-good material along the way. In the end, I’ll outline a cleaner path that should get you to a data science job much sooner than I got there. Until then, just hear my story out. It’ll be messy, but ain’t real life a mess?
Where I Began
I was enrolled in a bachelor’s course, majoring in Computer Science, at a not-so-reputed university. I wasn’t exposed to any kind of strong tech culture, and the research facilities, where they existed at all, were in their infancy.
I want to start by making one concrete point. You might be in a better or worse position than I was, but you absolutely cannot blame your university or surroundings for the opportunities they fail to provide, or resign yourself to your fate just because you are not enrolled in a top-notch university.
You have no right to complain
We live in the age of the internet, where anyone can study, upskill and learn almost anything by themselves. There are MOOCs for everything, online mentorship is available at relatively cheap rates, and there are great, encouraging online communities.
Let’s Just Get Into What We Are Here For
In May 2017, at the end of my sophomore year, I started looking for a software development internship. I applied through various job boards and ended up getting an internship at a travel startup, where I was assigned the task of building a content analysis tool. There I was introduced to Natural Language Processing and other data analysis tools. The problems we were solving were very unconventional, and I found myself leaning more and more towards data science. By the end of the internship, I had made up my mind to study further how problems can be solved using data.
At my university, I started looking for data-related electives. I also started looking for MOOCs. I wanted to begin with a more “applied” course so that I could get a sense of what I could gain from doing data science. I enrolled in the University of Michigan’s Applied Data Science specialization on Coursera. It consisted of 5 courses, namely:
- Introduction to Data Science – where I was introduced to Python data processing libraries like Pandas and NumPy.
- Data Visualisation – where I learned to use Matplotlib and picked up some good practices to keep in mind while building visualizations.
- Applied Machine Learning – where I was introduced to various regression and classification algorithms. It didn’t go deep into the maths, but it did an excellent job of giving the intuition behind the algorithms we would be using.
- Applied Text Mining – this course had a good set of exercises; I learned about LDA for topic modelling and about vectorization techniques.
- Social Network Analysis – I didn’t take this course, as I was already overwhelmed by the ones above and wanted to dig much deeper into them. But I will definitely take such a course in the future.
This was how I gained an overview of the types of problems that can be solved using data. Ah, but I kinda do not recommend this specialization.
Fast-Forwarding To Summer 2018
A year passed doing MOOCs and college courses. When the next summer came, I made up my mind to take a core data science internship this time. I applied for internships and somehow got shortlisted for some. I started on the take-home assignments but failed to make any sense of them. So I dropped the internship plan and decided to start afresh by devoting my summer entirely to learning. I was fascinated by one fundamental subject in computer science: “Data Structures & Algorithms”. A friend and I decided that we would cover all the standard DSA concepts and participate in competitive programming. I loved solving problems on HackerRank and Codeforces. We found an excellent resource for competitive programming, which was this CommonLounge Playlist.
I might sound off track talking about competitive programming, but I just want to point out that DSA has played a very important role in my journey towards writing good code and thinking about problems “peripherally”, in very creative ways. I wouldn’t tell anyone to get into competitive programming for data science, but knowledge of data structures and algorithms will for sure help you a lot. By sheer coincidence, as I write this, here’s a message that popped up in one of my data science WhatsApp groups. (I didn’t make this up, trust me!)
It is not mandatory, but having this knowledge will sharpen your eye for analyzing complexity and finding good ways to solve problems.
We completed the playlist, understanding and solving binary search, sorting, dynamic programming, recursion and graph problems. To this day, I still participate in online competitions whenever I can.
Someone Said Deep Learning
It was a time when deep learning had gained a lot of hype. On my LinkedIn, all I could see was people completing Andrew Ng’s deeplearning.ai course. In no time I enrolled in it myself, and that’s when I went deep into the mathematical details behind logistic regression and neural network algorithms. That was when things started becoming overwhelming and intriguing at the same time. I completed the first two courses and slowly started to peek at what Tushar was studying: Andrew Ng’s Stanford Machine Learning course.
Yes, I messed up. I did deep learning first, then went back to study the traditional ML algorithms in depth and understand the math behind them. Somehow, in the end, I could make sense of everything.
The Placement Season
As I said, I was in a kind of tier-3 college, and we do not have big-shot, well-paying startups coming here for placement drives. I made up my mind to sit only for a highly technical job where I could somehow work my way into solving data-related problems. One benefit of being from a tier-3 college was that it was easy to stand out. The reality was that the competition for jobs was not fierce. If you are actively learning outside the degree curriculum, you are far ahead of your peers. So I was selected by the very first company I sat for: Fidelity International. I had a lovely interview experience. Awesome people. The interview leaned towards AI and data science, as that was what my resume mentioned the most.
I had the lowest GPA (7.2) of those selected; the others were all 9-point rockstars. We had around six months before the internship began. Meanwhile, I worked on my final-year project, a device for detecting the freshness of fruit. We’ll have another post to discuss this.
Last Semester, An Internship
The day came when I was all set to officially enter corporate culture, and that too in an MNC. I don’t know why, but I do not have a good feeling about working in an MNC; maybe it is because making a dent in the core part of the business is not easy there.
So our training started. We were trained in Java, Oracle, and several Java web development frameworks.
A nightmare for an aspiring Data Scientist.
I did not like Java much, but things went on. I did well, solving all the algorithmic problems our instructor gave us. Then it was project time, when interns were allocated in-house projects. I suggested one of my own, which was to predict stock market trends using tweets from Twitter. Some research on this is already being done elsewhere. Our proposal was approved, as it was also closely related to the business Fidelity was in. We had a full three weeks.
I took yet another MOOC, Andrew Ng’s RNN course, so that I could understand and apply some state-of-the-art learning algorithms to the NLP task I was going to do. Then came another nightmare.
“Do it in Java.”
As our training was done in Java, we were to use only Java in whatever project we wanted to pursue. I seriously don’t want to get into how we completed this, but we did. Really, we did meet our targets. Our project demo was a success.
Done With The Training, Jumping Into Teams
So we were 2 months into the internship when we were assigned to our respective teams and the actual work began. I knew I would screw up very badly if I was given web development work. I had previously done a lot of web development, and even though I was good at it, it frustrated me. It was no longer my cup of tea. I had already told my mentors that, in the worst case, they could put me in the Database team, but not in Core Development.
I learned that no one cared about what a person was really passionate about. Each one of us was allocated to a team more or less at random. With all the luck in the entire universe on my side, I got to choose between the Spring framework and AngularJS. I have no hatred towards either of these frameworks, but it was not work I was meant to do.
And you don’t do what you don’t like
But since I was now employed and being paid for it, I responsibly did everything I was asked to do during office hours. As soon as I got home, I had DataCamp, Khan Academy’s statistics course and, not to forget, the job portals all up and running. For the next two months, I led two lives: a backend developer and a data science job seeker. Yes, I worked like hell and I got sick, but I managed, and this time I was building a strong foundation in data science.
Here’s the second part of my journey: My Journey To Getting A Data Science Job As A Fresher — Part 2: The Hustle
Sahil Malhotra is a part of AIM Writers Programme. He is a Data Scientist and works at a Canada-based real estate tech company. He is good at interpreting and solving problems with the help of data and he's proficient in statistics and machine learning algorithms. He is currently working on applications and problems based on real estate and healthcare domains, leveraging sensor data, computer vision and NLP technologies.