How to Crack the Capstone Project in Analytics Learning

Sarita Digumarti

Analytics is one of those fields where ‘learning’ happens best through ‘doing’. You can attend any number of courses or classes on R or Python, but there is no substitute for actually working on the R console or writing Python code to manipulate and analyze data. Similarly, it is important to get experience working with ‘unclean’ data sets: data sets that have missing values, wrong values, incomplete information and other inconsistencies. This is what real-life data looks like, so you may as well get used to it.
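To make the idea of ‘unclean’ data concrete, here is a minimal Python sketch of a first-pass data audit. The records, the field names and the 0 to 120 age range are all assumptions made up for illustration; a real project would take its checks from the business context.

```python
# Hypothetical records: field names and values are invented for illustration.
records = [
    {"customer_id": 1, "age": 34, "monthly_spend": 420.0},
    {"customer_id": 2, "age": None, "monthly_spend": 310.5},  # missing value
    {"customer_id": 3, "age": 212, "monthly_spend": 150.0},   # wrong value
    {"customer_id": 4, "age": 45},                            # incomplete record
]

REQUIRED_FIELDS = ("customer_id", "age", "monthly_spend")


def audit(rows):
    """Count missing fields and implausible ages in a list of dict records."""
    report = {"missing": 0, "implausible_age": 0, "clean_rows": 0}
    for row in rows:
        # A field counts as missing if it is absent or explicitly None.
        missing = [f for f in REQUIRED_FIELDS if row.get(f) is None]
        age = row.get("age")
        # Assumed plausibility rule: ages outside 0..120 are treated as errors.
        bad_age = age is not None and not (0 <= age <= 120)
        report["missing"] += len(missing)
        report["implausible_age"] += int(bad_age)
        if not missing and not bad_age:
            report["clean_rows"] += 1
    return report


print(audit(records))
```

Even a simple count like this tells you how much of the data can be used as-is before any modeling starts.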

A Capstone project, therefore, is an important part of any long-term analytics course. In the Executive Program in Business Analytics (EPBA) that we run jointly with SDA Bocconi, we take special care to ensure all our students get a flavor of real-world problems with the Capstone project. We have a dedicated team that works to ensure that there are enough projects, that the projects have enough substance and that there are industry mentors for each project.

There is a lot of effort that goes in from our side to make the Capstone project a strong learning experience for every student. However, there are many things that you as a student can also do to make sure that you get the most out of this unique opportunity to work on real business problems.

What can a student do to maximize learning from a Capstone project?

Choose your topic with care

While every new analytics project would be interesting, try to choose something relevant. Relevant, in this context, means something that at least one of your group members has some experience with. It could be the industry, the type of problem or even the kind of data you will be looking at. For example, if one of you comes from a telecom background and your project is on predicting customer attrition in the telecom industry, then you can leverage the telecom expertise you have internally to come up with a far more insightful analysis than if you have no experience of the industry.

Keep your objective extremely focused

Remember that you have limited time for a Capstone project. Keep your objective razor-sharp. It is better to do a little and do it well than to do a poor job of trying to do too many things with your data.

Look at this as an opportunity to learn something new

Try to use the project as an opportunity to learn something new. For example, you may know R but have never used machine learning algorithms in R. A Capstone project is a good time to learn this.

Learn about the industry

Students often start the analysis straight away without spending enough time to understand the context of the problem. In real life, data scientists will spend tons of time understanding any new industry. They would have multiple calls/meetings with the client to understand the context of the problem. You should follow the same strategy for a Capstone project.

Engage with the SME

Make sure you spend enough time with your project mentor, who acts as the subject matter expert (SME). The mentor will provide you with all the information to start with and then guide you through the project. The mentor will have valuable experience and knowledge about your project, and the more you can get out of them, the better your final results will be.

Come up with practical recommendations

Nothing irritates a client more than half-baked recommendations to solve the problem at hand. I have heard students recommend “Open business in China and Singapore in the next 3 months” or “we recommend closing down 5 of the 8 product lines to improve profitability”. Such recommendations show a naivety towards business reality and take away from the seriousness of your effort.

End with what else is possible

Always end with what more could be done. Both you and the client know that you have very limited time in the Capstone project. If you spend time chalking out what else could be done in the future (by you or someone else), it shows foresight. I have seen that clients really appreciate it when they can see a clear path forward at the end of the Capstone project.

These are some tips for anyone working on an analytics capstone project.

Here are some of the projects by the EPBA students that were well appreciated by the clients.

Project 1:

Provide data-based insights to the sales team about the prospective buyers’ propensity to buy the solutions of the client so they can offer the right product to the right customer at the right price.

Techniques used: User-based collaborative filtering, distributed random forests

Tools used: R

What made this a success: The client had never used collaborative filtering and was very impressed by the power of this technique.
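For readers who have not met user-based collaborative filtering, the following is a minimal Python sketch of the general idea, not the team's implementation (the project itself was done in R). The customers, products and scores are hypothetical, and measuring cosine similarity only over the products two customers share is one common simplification among several.

```python
from math import sqrt

# Hypothetical purchase-interest scores (1-5); customers and products are made up.
ratings = {
    "acme":    {"crm": 5, "billing": 4, "analytics": 1},
    "globex":  {"crm": 4, "billing": 5, "support": 4},
    "initech": {"crm": 1, "analytics": 5, "support": 2},
}


def cosine(a, b):
    """Cosine similarity over the products both customers have scored."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[p] * b[p] for p in shared)
    norm_a = sqrt(sum(a[p] ** 2 for p in shared))
    norm_b = sqrt(sum(b[p] ** 2 for p in shared))
    return dot / (norm_a * norm_b)


def predict(target, product):
    """Similarity-weighted average of other customers' scores for a product."""
    num = den = 0.0
    for other, scores in ratings.items():
        if other == target or product not in scores:
            continue
        sim = cosine(ratings[target], scores)
        num += sim * scores[product]
        den += sim
    # None signals that no similar customer has scored this product.
    return num / den if den else None


print(predict("acme", "support"))
```

In this toy data, "acme" resembles "globex" far more than "initech", so the predicted score for "support" lands closer to the score "globex" gave than to the one "initech" gave.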

Project 2:

To create a credit scoring engine which will help a peer to peer lending company evaluate hundreds of thousands of applications in a fraction of the current time with better accuracy.

Techniques used: Neural networks, random forests, support vector machines

Tools used: R, Python, WPS, Spark

What made this a success: The team used this opportunity to do a comparative evaluation of different machine learning techniques in credit scoring. They are working on developing this into a white paper.

Project 3:

Improve the viewership of a news site by offering better reading recommendations to its audience via a more powerful recommendation engine.

Techniques used: Two techniques were used and compared: collaborative filtering and content-based recommendations

Tools used: R

What made this a success: The client expected collaborative filtering to be a far superior approach, but the results showed that until the site reaches a certain audience volume, content-based recommendations work better.
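One reason content-based recommendations can hold up with a small audience is that they need only the articles themselves, not a history of reader behavior. The following Python sketch illustrates that idea under simple assumptions: the headlines are hypothetical, and plain word counts stand in for whatever text features the team actually used in R.

```python
from collections import Counter
from math import sqrt

# Hypothetical headlines; a real system would use full article text.
articles = {
    "a1": "central bank raises interest rates again",
    "a2": "interest rates and inflation worry the central bank",
    "a3": "local team wins cricket final",
}


def vectorize(text):
    """Bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def recommend(just_read, k=1):
    """Rank the other articles by textual similarity to the one just read."""
    query = vectorize(articles[just_read])
    scored = [
        (cosine(query, vectorize(text)), art_id)
        for art_id, text in articles.items()
        if art_id != just_read
    ]
    return [art_id for _, art_id in sorted(scored, reverse=True)[:k]]


print(recommend("a1"))
```

No reader data appears anywhere in this sketch, which is why an approach of this kind can produce recommendations from the first visitor onward.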

Project 4:

Analysis of shopper behavior for a UK-based food chain

Techniques used: Multinomial regression and cluster analysis

Tools used: HIVE, HQL, R, Python

What made this a success: None of the team members had worked with Python before. Yet a lot of the project was done in Python, and all the participants walked away with a good knowledge of the language.

Project 5:

To craft an approach that effectively classifies tweets and captures user sentiments about a large e-commerce company in India

Techniques used: Multinomial regression and cluster analysis

Tools used: R, Tableau, KNIME, Excel

What made this a success: The team had to work with limited data, and the analysis was completed fairly quickly. The team spent the additional time creating impressive visualizations with Tableau. The work was highly appreciated by the client, and the visualizations were even showcased at the next senior leadership meeting.


Copyright Analytics India Magazine Pvt Ltd
