Machine learning is the hottest topic in the industry, and ML engineers are among its highest-paid professionals. ML and its services are only going to extend their influence and push the boundaries of the technology revolution. However, deploying ML comes with great responsibility. Although black box modelling is slowly shedding its black box reputation, it is crucial to establish trust with both in-house teams and stakeholders.

This can be done by practising a few routines that have been tested at the heart of Google's AI research departments. Here are a few best practices that can help ML engineers build models with minimal hassle:

It’s Okay To Have A Simple Model

First impressions last, so pick a model that is simple to avoid infrastructure issues. Before exporting your fancy new machine learning system, it is important to determine how to get examples to your learning algorithm. A simple model provides the team with baseline metrics and behaviour that can be used to test more complex models.
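As a minimal sketch of what a baseline looks like in practice (the data and labels here are hypothetical), a majority-class predictor gives the team a floor that any fancier model must beat:

```python
from collections import Counter

def majority_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _example: most_common

def accuracy(predict, examples, labels):
    correct = sum(1 for x, y in zip(examples, labels) if predict(x) == y)
    return correct / len(labels)

# Hypothetical 0/1 labels with a 70/30 split.
train_y = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
baseline = majority_baseline(train_y)
base_acc = accuracy(baseline, range(len(train_y)), train_y)  # 0.7
```

If a complex model cannot clearly beat this number on held-out data, the extra complexity is not yet paying for itself.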


Keep The Infrastructure Testable

Machine learning has an element of uncertainty, so be sure to test the code that creates examples in both training and serving. To keep infrastructure issues in check:

  1. Test getting data into the algorithm and, where possible, compare statistics in the pipeline against statistics for the same data processed elsewhere
  2. Test getting models out of the training algorithm and make sure that the model in the training environment delivers the same score as the model in the serving environment
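The first check above can be sketched as a simple parity test: compute summary statistics over a feature column produced by the pipeline and compare them to the same data processed by an independent code path (the columns and tolerance here are illustrative assumptions):

```python
import statistics

def summary_stats(values):
    """Simple pipeline statistics: mean and population stdev of a feature column."""
    return {"mean": statistics.fmean(values), "stdev": statistics.pstdev(values)}

def check_parity(pipeline_vals, reference_vals, tol=1e-6):
    """Compare stats from the pipeline against the same data processed elsewhere."""
    a, b = summary_stats(pipeline_vals), summary_stats(reference_vals)
    return all(abs(a[k] - b[k]) <= tol for k in a)

# Hypothetical feature column produced by two independent code paths.
pipeline_col = [0.1, 0.2, 0.3, 0.4]
reference_col = [0.1, 0.2, 0.3, 0.4]
assert check_parity(pipeline_col, reference_col)
```

The same pattern applies to the second check: score the same examples in the training and serving environments and assert the scores agree within a tolerance.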

Unused features create technical debt; if a feature is not being used and combining it with other features is not working, drop it from the infrastructure.


Check The Freshness Of Your Model

Experts suggest monitoring the model for degradation in quality over time. If quality drops whenever the model goes a day without being updated, then round-the-clock engineering support is necessary. For instance, if the ML model behind Google Play Search is not updated, its quality can degrade noticeably within a month.
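A minimal freshness monitor can be sketched as a staleness check against the model's last training timestamp (the threshold and timestamps below are hypothetical; a real system would wire this into alerting):

```python
import datetime

def model_is_stale(last_trained, max_age, now=None):
    """Flag a model whose last training run is older than the allowed age."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    return now - last_trained > max_age

# Hypothetical timestamps: trained 24 hours ago, 12-hour freshness budget.
now = datetime.datetime(2024, 1, 2, tzinfo=datetime.timezone.utc)
trained = datetime.datetime(2024, 1, 1, tzinfo=datetime.timezone.utc)
stale = model_is_stale(trained, datetime.timedelta(hours=12), now=now)  # True
```

The right `max_age` depends on how quickly quality degrades for your product, which is exactly what the monitoring above is meant to measure.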

Don’t Export Models In A Hurry 

Before exporting and serving a model to customers, run a sanity check on held-out data; if the model's performance there is not reasonable, do not export it. Checking the area under the ROC curve before exporting is good practice.

Stick To Simple Metrics Initially

There are many metrics for evaluating a model's performance, and with such an abundance, engineers can end up chasing their tails while choosing one. It is advisable to stick to something simple that satisfies the first objective.

The ML objective should be something that is easy to measure and is a proxy for the “true” objective.

Indirect effects make great metrics that can be used during A/B testing and when making launch decisions.

Keep Models Interpretable

If predictions are interpretable, it becomes easier to debug. This is true for models such as linear, logistic and Poisson regression, whose predictions can be read as probabilities, in contrast to models that use objectives (zero-one loss, various hinge losses, and so on) that try to directly optimise classification accuracy or ranking performance.
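One reason linear models are easy to debug is that a score decomposes into per-feature contributions. A minimal sketch (the weights and feature names here are hypothetical):

```python
def explain_linear(weights, features):
    """Break a linear model's score into per-feature contributions for debugging."""
    contributions = {name: weights.get(name, 0.0) * value
                     for name, value in features.items()}
    return contributions, sum(contributions.values())

# Hypothetical model and example: each term shows why the score is what it is.
weights = {"clicks": 0.8, "installs": 1.5}
example = {"clicks": 2.0, "installs": 1.0}
parts, score = explain_linear(weights, example)  # {'clicks': 1.6, 'installs': 1.5}, 3.1
```

When a prediction looks wrong, the contribution table points directly at the feature responsible, something a directly accuracy-optimised black box cannot offer.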

Launch Models Regularly

There are three basic reasons to launch new models: you are coming up with new features, you are tuning regularisation or combining old features in new ways, or you are tuning the objective.

It is essential to think about how easy it is to add, remove or recombine features, and whether it is easy to create a fresh copy of the pipeline to verify its correctness. Launching models regularly in this way keeps quality consistent.

Having Specific Features Is Good

With a plethora of data, it is simpler to learn millions of simple features than a few complex ones. For generalisation, it is better to have groups of features where each feature applies to only a very small fraction of the data.
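A common way to generate many such narrow features is crossing categorical values, so each resulting indicator fires for only a small slice of examples. A sketch with hypothetical feature names:

```python
from itertools import product

def cross(name_a, values_a, name_b, values_b):
    """Cross two categorical features into many sparse indicator features,
    each of which covers only a small fraction of the data."""
    return [f"{name_a}={a}_x_{name_b}={b}" for a, b in product(values_a, values_b)]

# Hypothetical example: country x app-category crosses.
features = cross("country", ["us", "in", "br"], "category", ["games", "tools"])
# 6 specific features such as 'country=us_x_category=games'
```

Scaling the vocabularies up yields the "millions of simple features" regime the advice describes, with regularisation pruning the crosses that cover too few examples.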

Reuse Code

Experts at Google insist on reusing code between the training pipeline and the serving pipeline whenever possible. Also, try not to use two different programming languages for training and serving; that decision makes it nearly impossible to share code.
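The simplest form of this reuse is a single featurization function imported by both pipelines, so an example is transformed identically at training and serving time (the raw fields below are hypothetical):

```python
def featurize(raw):
    """One featurization code path shared by the training and serving pipelines."""
    return {
        "query_len": len(raw.get("query", "")),
        "has_image": 1.0 if raw.get("image_url") else 0.0,
    }

# Training and serving call the exact same code path.
train_features = featurize({"query": "chess app", "image_url": "http://example"})
serve_features = featurize({"query": "chess app", "image_url": "http://example"})
assert train_features == serve_features
```

Any divergence between these two call sites is training/serving skew, which the shared function rules out by construction.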

Keep Ensembles Simple

An ensemble of models is a “model” which combines the scores of other models to perform better. 

To keep things simple, each model should either be an ensemble only taking the input of other models, or a base model taking many features, but not both.

If models stacked on top of other models are trained separately, then combining them can produce poor behaviour.

Use a simple model for ensembling, one that takes only the outputs of the “base” models as inputs. It also helps if the incoming scores are semantically interpretable, so that changes in the underlying models do not confuse the ensemble model.
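Such an ensemble can be as simple as a weighted average over base model scores, taking only those scores as input and never touching raw features (the scores and weights below are hypothetical):

```python
def ensemble_score(base_scores, weights=None):
    """A deliberately simple ensemble: a weighted average of base model scores.
    It takes only the scores of other models as input, never raw features."""
    if weights is None:
        weights = [1.0 / len(base_scores)] * len(base_scores)
    return sum(w * s for w, s in zip(weights, base_scores))

# Hypothetical base models emitting calibrated probabilities.
score = ensemble_score([0.9, 0.7, 0.8])  # equal-weight average
```

Because each input is a calibrated probability, the combined score stays interpretable even as the underlying base models are retrained.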

Most ML problems are, in fact, engineering problems, and the tips above are just some of the many principles one can refer to while setting up ML pipelines from scratch.