See How A Sr. Analyst, A Tech Lead & Business Analyst Solved MachineHack’s “Predict The Flight Ticket Price” Hackathon




MachineHack recently concluded its 11th hackathon, “Predict The Flight Ticket Price Hackathon”. The top three winners on the leaderboard were Stavya Bhatia, Chetan Ambi and V Sreekiran Prasad. Analytics India Magazine talked to the winners to understand how they approached this hackathon and solved the problem. The winners' code is also available on Github to help readers understand how each solution was built.



1. Stavya Bhatia:

Data science journey:

Stavya Bhatia, a Sr. Analyst at Antuit India Pvt Ltd., a Goldman Sachs funded startup, is a mechanical engineer from Manipal Institute of Technology. By his own admission, he did not know much about data science and machine learning at first, but he transitioned quickly after landing a job at Mu Sigma, where he learnt several data science techniques and programming languages. He says, “But most importantly, it helped me inculcate a habit of relentless self-learning which motivated me to teach myself advanced analytics and machine learning in my free time on the weekends.” He also said that courses on popular MOOC platforms helped him understand modelling theory, while platforms like Kaggle helped him practise what he learnt and develop the intuition necessary for problem-solving. Bhatia said, “I think I was fortunate to have found myself a job in data science, as today, I see the world with a very different, analytical lens. My experiences in these three years have helped me build a 'can do' attitude and a ‘solution-oriented’ mindset both of which have equipped me with the confidence to take on any challenge that is in store for me in the future.”

Approach to solve the problem:

Bhatia spent considerable time researching the aviation industry, the economics of airline prices, flight scheduling and the dynamics of a flight network. Once he was confident that he understood the problem space and the industry, he began noting down all the possible factors that can affect prices. To keep an unbiased perspective, he made sure he did not look at the data during this stage. After exhaustively listing the ideas that could help his model, he began exploring the data and separating the hypotheses he could implement from those he could not, given the available dataset. He said that his learnings from Coursera, Udemy and Kaggle came in handy for judging how appropriate different modelling techniques were for this hackathon. With the knowledge of the models and the list of hypotheses, he tested the hypotheses one by one and checked how his model performed after each. This involved iteratively writing code in a Jupyter notebook, covering data cleaning, manipulation, feature generation, model selection, individual model parameter tuning and finally ensemble modelling, which helped him reach a score of .9569 on the leaderboard.

Here is Stavya Bhatia’s code on Github to help you see his approach more distinctly.

Experience on MachineHack:

Talking about his experience on the MachineHack platform, Bhatia said that his experience was that of 'Extreme Fun and Learning.' He said that he not only improved as a data scientist during his journey but made some valuable contacts on the way. Several individuals reached out to him to learn from his approach, and at the same time he reached out to several individuals to learn from their approach.

2. Chetan Ambi:

Chetan Ambi has been working as a Technology Lead at Infosys Ltd (Mysore) for about 5 years and has a total of 9 years of experience in the IT industry.

Data science journey:

Ambi was inspired by Andrew Ng's Stanford Machine Learning lecture videos on YouTube, and he learnt a lot from Ng's course on Coursera. MOOC courses from Kirill Eremenko, Jose Portilla and LazyProgrammer also helped him gain a good understanding of machine learning. He says that he is an avid reader of analytics portals and follows Kaggle.

Approach to solve the problem:

Ambi first created starter code without any feature engineering, which gave him a score of around 0.92. He then converted the Duration and Total_Stops columns to numeric values and applied TF-IDF to the 'Routes' column. Next, he spent most of his time going through blogs and articles to understand the factors that influence flight ticket prices, and applied that knowledge to his feature engineering. According to him, feature engineering is what helped him get his best score on the leaderboard.
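As a rough illustration of the conversions described above, here is a minimal sketch using only the Python standard library. The raw value formats ('2h 50m', 'non-stop', '2 stops', 'BLR -> DEL') and all function names are assumptions made for this example, not details taken from Ambi's notebook, and the TF-IDF shown is the plain textbook formula rather than the variant any particular library implements.

```python
import math
import re
from collections import Counter


def duration_to_minutes(text):
    """Convert a duration string such as '2h 50m' (assumed format) to minutes."""
    hours = re.search(r"(\d+)\s*h", text)
    minutes = re.search(r"(\d+)\s*m", text)
    total = int(hours.group(1)) * 60 if hours else 0
    return total + (int(minutes.group(1)) if minutes else 0)


def stops_to_int(text):
    """Convert 'non-stop', '1 stop', '2 stops' (assumed format) to a count."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else 0


def tfidf(documents):
    """Textbook TF-IDF: tf = count / document length, idf = log(N / df)."""
    # Treat every airport code in a route string as one token.
    tokenized = [re.findall(r"[A-Za-z]+", doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many routes each token appears.
    doc_freq = Counter(tok for tokens in tokenized for tok in set(tokens))
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append({
            tok: (count / len(tokens)) * math.log(n_docs / doc_freq[tok])
            for tok, count in counts.items()
        })
    return rows


print(duration_to_minutes("2h 50m"))  # 170
print(stops_to_int("non-stop"))       # 0
for row in tfidf(["BLR -> DEL", "BLR -> BOM -> DEL", "CCU -> BLR"]):
    print(row)
```

In this toy example an airport that appears in every route gets a weight of zero, while a rarer stopover gets a higher weight, which is the property that makes TF-IDF a plausible encoding for route strings.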

Here are some of the features he created while building his model; only some of them made it into the final model:

  • Days to Departure (no. of days remaining to travel)
  • Booking Class (Economy, Premium Economy and Business)
  • Market Share (Market Share of the Airlines)
  • Departure time of the day (morning, noon, evening & night)
  • Arrival time of the day (morning, noon, evening & night)
  • Carrier Type: Low Cost Carrier or Full Service Airlines
  • Travel Season: The dataset covered only 4 months (March, April, May and June), which fall under spring and summer
  • Journey Day: Either the day of the week (Monday to Sunday) or weekday vs weekend, based on knowledge of when prices are affected most
  • Holiday Season: For example, if the date of journey falls close to festivals or long weekends, it may affect the prices
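A few of the date and time features in this list can be sketched with the standard library alone. The input formats ('HH:MM' times, 'DD/MM/YYYY' dates), the hour boundaries for each part of the day, the holiday dates and every function name below are assumptions for illustration; the article does not say which cut-offs or calendar Ambi used.

```python
from datetime import date, datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def time_of_day(clock):
    """Bucket an 'HH:MM' time into a part of the day (boundaries are assumed)."""
    hour = datetime.strptime(clock, "%H:%M").hour
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "noon"
    if 17 <= hour < 21:
        return "evening"
    return "night"


def journey_day_features(journey_date):
    """Derive day-of-week and weekend flag from a 'DD/MM/YYYY' date string."""
    parsed = datetime.strptime(journey_date, "%d/%m/%Y").date()
    return {
        "weekday": DAY_NAMES[parsed.weekday()],  # weekday(): Monday is 0
        "is_weekend": parsed.weekday() >= 5,     # Saturday or Sunday
    }


def days_to_nearest_holiday(journey_date, holidays):
    """Absolute distance in days from the journey date to the closest holiday."""
    parsed = datetime.strptime(journey_date, "%d/%m/%Y").date()
    return min(abs((parsed - holiday).days) for holiday in holidays)


# Placeholder dates standing in for a real holiday calendar.
PLACEHOLDER_HOLIDAYS = [date(2019, 3, 21), date(2019, 4, 19)]

print(time_of_day("22:20"))
print(journey_day_features("24/03/2019"))
print(days_to_nearest_holiday("24/03/2019", PLACEHOLDER_HOLIDAYS))
```

Features such as Booking Class, Market Share or Carrier Type need outside information about each airline, so they are left out of this sketch.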

He started with LightGBM, which gave him a good cross-validation (CV) and leaderboard (LB) score. After trying other regression algorithms, he selected 4 models for the next step, ensembling. Ambi’s final solution is an ensemble of LightGBM, XGBoost, Bagging Regressor and Gradient Boosting.
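The article does not say how the four models were combined. One common and simple option is a weighted average of each model's predictions, sketched below under that assumption. The function name, the model keys and the numbers are hypothetical and are not taken from Ambi's code.

```python
def blend_predictions(predictions_by_model, weights=None):
    """Weighted average of per-model predictions (equal weights by default)."""
    names = list(predictions_by_model)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    n_rows = len(predictions_by_model[names[0]])
    blended = []
    for i in range(n_rows):
        # Combine the i-th prediction of every model into one value.
        weighted_sum = sum(weights[name] * predictions_by_model[name][i]
                           for name in names)
        blended.append(weighted_sum / total_weight)
    return blended


# Made-up predicted prices for two flights from two hypothetical models.
toy_predictions = {
    "model_a": [10.0, 20.0],
    "model_b": [20.0, 40.0],
}
print(blend_predictions(toy_predictions))                                # [15.0, 30.0]
print(blend_predictions(toy_predictions, {"model_a": 3, "model_b": 1}))  # [12.5, 25.0]
```

In practice the weights would usually be chosen from cross-validation scores, so that stronger models contribute more to the final prediction.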

Here is Chetan Ambi’s code on Github for an insight into his solution.

Experience on MachineHack:

With 9 years of experience in the IT industry, Ambi considers MachineHack to be a wonderful platform for everyone from beginners to experts to showcase their data science skills. He has previously secured rank 3 in the MachineHack hackathons “Predict The Data Scientists Salary In India Hackathon”, “Predict A Doctor’s Consultation Fee Hackathon” and “Whose Line Is It Anyway: Identify The Author Hackathon”. “I am expecting more challenging problems in the future from MachineHack,” he said.


3. V Sreekiran Prasad:

Data Science journey:

Prasad, an alumnus of NIT Trichy who studied Production Engineering, comes from a non-coding background. He said that data science was a completely new field for him. Things were difficult in the beginning, but he learnt that logic itself is enough to solve a problem and that coding is just a tool to apply the logic. During his final-year internship, he got into analytics while helping a start-up solve a problem by drawing insights from data it had collected over the years. This got him interested in analytics.

Approach to solve the problem:

According to Sreekiran, flight ticket price prediction becomes easier once the right variables are identified and a suitable machine learning algorithm is applied with good parameter tuning. Among the given variables, he considered the date of booking, the airline and the duration to be the major factors in predicting flight prices. Since the data covers future dates, he noted, the prices cannot be compared with old prices, and previous information about the flights cannot be used.

Here is the code of V Sreekiran Prasad on Github to help the readers with his solution.

Experience on MachineHack:

Talking about his experience on MachineHack, he said that it is a good platform for intermediate machine learning enthusiasts to learn and apply data science topics. “Hackathons like these will build up confidence and give exposure to different problem statements,” he said.

MachineHack recently launched two hackathons titled “Predicting Restaurant Food Cost Hackathon” and “Making Autonomous Vehicles Safer For Humans – Hackathon by Intel”, to challenge data science enthusiasts with interesting problems.


