The MachineHack platform recently concluded its 11th hackathon, “Predict The Flight Ticket Price Hackathon”. The top three winners on the leaderboard were Stavya Bhatia, Chetan Ambi and V Sreekiran Prasad. Analytics India Magazine talked to the winners to understand how they approached the hackathon and solved this exciting problem. The winners’ code has also been uploaded to GitHub to help readers understand how they solved the problem.

1. Stavya Bhatia:

Data science journey:

Stavya Bhatia, who has been a Sr. Analyst at Antuit India Pvt Ltd., a Goldman Sachs-funded startup, is a mechanical engineer from Manipal Institute of Technology. Admittedly, he didn’t know much about data science and machine learning at first, but he quickly transitioned after landing a job at Mu Sigma, where he learnt several data science techniques and programming languages. He says, “But most importantly, it helped me inculcate a habit of relentless self-learning which motivated me to teach myself advanced analytics and machine learning in my free time on the weekends.” He also said that courses on popular online MOOC platforms helped him understand modelling theory, while platforms like Kaggle helped him practise what he learnt and develop the intuition necessary for problem-solving. Bhatia said, “I think I was fortunate to have found myself a job in data science, as today, I see the world with a very different, analytical lens. My experiences in these three years have helped me build a ‘can do’ attitude and a ‘solution-oriented’ mindset both of which have equipped me with the confidence to take on any challenge that is in store for me in the future.”

Approach to solve the problem:

Bhatia spent considerable time researching the aviation industry, the economics of airline prices, flight scheduling and the dynamics of a flight network.
Once he was confident that he understood the problem space and the industry, he began noting down all the possible factors that can affect prices. To maintain an unbiased perspective, he made sure not to look at the data while doing so. After exhaustively listing the ideas that could help his model, he began exploring the data and separating the implementable hypotheses from the non-implementable ones on the basis of the available dataset. He said that what he had learnt from Coursera, Udemy and Kaggle came in handy for testing the appropriateness of different modelling techniques for this hackathon. With knowledge of the models and the list of hypotheses, he began testing the hypotheses one by one and checking how his model performed. This involved iteratively writing code in a Jupyter notebook, covering data cleaning, manipulation, generation, model selection, individual model parameter tuning and finally ensemble modelling, which helped him reach a score of 0.9569 on the leaderboard.

Here is Stavya Bhatia’s code on GitHub to help you see his approach more clearly.

Experience on MachineHack:

Bhatia described his experience on the MachineHack platform as one of ‘Extreme Fun and Learning.’ He said that he not only improved as a data scientist during his journey but also made some valuable contacts along the way. Several individuals reached out to him to learn from his approach, and he in turn reached out to several individuals to learn from theirs.

2. Chetan Ambi:

He has been working as a Technology Lead at Infosys Ltd (Mysore) for about 5 years and has a total of 9 years of experience in the IT industry.

Data science journey:

Currently a Technology Lead at Infosys Ltd, Ambi was inspired by Andrew Ng’s Stanford Machine Learning lecture video on YouTube. He learnt a lot from Ng’s course on Coursera.
MOOC courses from Kirill Eremenko, Jose Portilla and LazyProgrammer also helped him gain a good understanding of machine learning. He says that he is an avid reader of analytics portals and follows Kaggle.

Approach to solve the problem:

Ambi first wrote starter code without any feature engineering, which gave him a score of around 0.92. He converted the duration and Total_Stops columns to numeric values. For the 'Routes' column he applied TF-IDF. He then spent most of his time going through blogs and articles to understand the factors that influence flight ticket prices, and applied that knowledge to his feature engineering. According to him, feature engineering is what helped him get his best score on the leaderboard.

Here are some of the features he created during model building, only some of which he used in the final model:

 Days to Departure (no. of days remaining to travel)
 Booking Class (Economy, Premium Economy and Business)
 Market Share (market share of the airline)
 Departure time of the day (morning, noon, evening & night)
 Arrival time of the day (morning, noon, evening & night)
 Carrier Type: Low Cost Carrier or Full Service Airline
 Travel Season: the dataset covered only 4 months (March, April, May and June), which fall under spring and summer
 Journey Day: Monday to Sunday, or weekday vs weekend, based on knowledge of when prices are affected most
 Holiday Season: for example, if the date of journey falls close to festivals or long weekends, it may affect the prices

He started with LightGBM, which gave him good CV and LB scores. After trying other regression algorithms, he selected 4 models for the next step, which was ensembling. Ambi’s final solution is an ensemble of LightGBM, XGBoost, Bagging Regressor and Gradient Boosting.
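
As a rough illustration of the kind of steps described above, here is a minimal Python sketch of converting duration and stop counts to numbers, bucketing a clock time into a time of day, and blending predictions from several models. It is not Ambi’s actual code. The input formats ('2h 50m', 'non-stop', '22:20'), the hour boundaries for each bucket, the function names and the use of a plain average for the ensemble are all assumptions made for illustration.

```python
import re
from statistics import mean


def duration_to_minutes(text):
    """Convert a duration string such as '2h 50m' or '19h' to total minutes.

    The 'Xh Ym' format is an assumption about how the raw column looks.
    """
    hours = re.search(r"(\d+)\s*h", text)
    minutes = re.search(r"(\d+)\s*m", text)
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(minutes.group(1)) if minutes else 0
    return total


def stops_to_int(text):
    """Convert labels such as 'non-stop', '1 stop', '2 stops' to a count.

    Labels without a digit (assumed to mean a direct flight) map to 0.
    """
    match = re.search(r"\d+", text)
    return int(match.group()) if match else 0


def time_of_day(hhmm):
    """Bucket an 'HH:MM' time into morning, noon, evening or night.

    The hour boundaries are illustrative assumptions, not from the article.
    """
    hour = int(hhmm.split(":")[0])
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "noon"
    if 17 <= hour < 21:
        return "evening"
    return "night"


def average_ensemble(*model_predictions):
    """Blend several models by averaging their predictions row by row.

    A plain average is assumed here; the article does not say how the
    four models were combined.
    """
    return [mean(row) for row in zip(*model_predictions)]


if __name__ == "__main__":
    print(duration_to_minutes("2h 50m"))   # 170
    print(stops_to_int("non-stop"))        # 0
    print(time_of_day("22:20"))            # night
    # Hypothetical price predictions from two models for two flights.
    print(average_ensemble([100.0, 200.0], [110.0, 190.0]))  # [105.0, 195.0]
```

In practice, each list passed to average_ensemble would hold one fitted model’s predictions for the same test rows, and a weighted average could be used instead if some models validate better than others.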

Here is Chetan Ambi’s code on GitHub for an insight into his solution.

Experience on MachineHack:

With a total of 9 years of experience in the IT industry, he considers MachineHack to be a wonderful platform for everyone, from beginners to experts, to showcase their data science skills. Ambi has previously secured rank 3 in the MachineHack hackathons “Predict The Data Scientists Salary In India Hackathon”, “Predict A Doctor’s Consultation Fee Hackathon” and “Whose Line Is It Anyway: Identify The Author Hackathon”. “I am expecting more challenging problems in the future from MachineHack,” he said.

3. V Sreekiran Prasad:

Data Science journey:

Prasad, an alumnus of NIT Trichy who studied Production Engineering, comes from a non-coding background. He said that data science was a completely new field for him. Things were difficult in the beginning, but he learnt that logic itself is enough to solve a problem and that coding is just a tool to apply the logic. During his final-year internship, he got into analytics while helping a start-up tackle a problem by drawing insights from data collected over the years. This got him interested in the field.

Approach to solve the problem:

According to Sreekiran, flight ticket price prediction becomes easier if one can identify the right variables and apply the right machine learning algorithm with good parameter tuning.
With the given variables, he considered the date of booking, airline and duration to be the major factors in predicting flight prices. Since the data is distributed in the future, the prices cannot be compared with old prices, and previous information about the flights cannot be used.

Here is V Sreekiran Prasad’s code on GitHub to help readers follow his solution.

Experience on MachineHack:

Talking about his experience on MachineHack, he said that it is a good platform for intermediate machine learning enthusiasts to learn and apply data science topics. “Hackathons like these will build up confidence and give exposure to different problem statements,” he said.

MachineHack recently launched two hackathons titled “Predicting Restaurant Food Cost Hackathon” and “Making Autonomous Vehicles Safer For Humans – Hackathon by Intel”, to challenge data science enthusiasts with interesting problems.