Machine learning is one of the hottest topics in the industry, and ML engineers are among its highest-paid professionals. ML and its services will only extend their influence and push the boundaries of the technology revolution into new realms. However, deploying ML comes with great responsibility. Although black box modelling is gradually shedding its black box reputation, it is crucial to establish trust with both in-house teams and stakeholders.
This can be done by practising a few routines that have been tested at the heart of Google's AI research departments. Here are a few best practices that can help ML engineers build models with minimal hassle:
It’s Okay To Have A Simple Model
First impressions last, so pick a simple model to avoid infrastructure issues. Before exporting your fancy new machine learning system, determine how to get examples to your learning algorithm. A simple model provides the team with baseline metrics and behaviour that can be used to test more complex models.
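As a sketch of this idea, the snippet below fits a plain logistic regression to establish a baseline metric that any fancier model must beat before it earns its complexity. The scikit-learn model and the synthetic toy data are illustrative assumptions, not taken from the text:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 200 examples, 5 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A plain logistic regression yields a baseline metric to compare
# more complex models against later.
baseline = LogisticRegression().fit(X_train, y_train)
baseline_accuracy = baseline.score(X_test, y_test)
print(f"baseline accuracy: {baseline_accuracy:.3f}")
```

Once this number is recorded, any deep or ensemble model that cannot beat it is not worth its operational cost.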
Keep The Infrastructure Testable
Machine learning has an element of uncertainty, so make sure to test the code for creating examples in both training and serving. To keep infrastructure issues in check:
- Test getting data into the algorithm and, if possible, compare statistics in the pipeline against statistics for the same data processed elsewhere
- Test getting models out of the training algorithm and make sure that the model in the training environment delivers the same score as the model in the serving environment
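Both checks above might look roughly like this minimal Python sketch, where pickling stands in for the real export/serving path (an assumption for illustration; a production system would use its own serialisation format):

```python
import pickle
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = (X.sum(axis=1) > 0).astype(int)

# 1. Check statistics of the data entering the algorithm against the
#    same data computed elsewhere (here: a hand-computed reference).
pipeline_mean = X.mean(axis=0)
reference_mean = np.array([X[:, j].sum() / len(X) for j in range(3)])
assert np.allclose(pipeline_mean, reference_mean), "pipeline stats drifted"

# 2. Round-trip the model the way serving would load it, and verify the
#    serving copy scores examples identically to the training copy.
trained = LogisticRegression().fit(X, y)
served = pickle.loads(pickle.dumps(trained))  # stand-in for export/import
assert np.allclose(trained.predict_proba(X), served.predict_proba(X))
print("training/serving parity checks passed")
```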
Unused features create technical debt. If a feature is not being used and combining it with other features does not work, drop it from the infrastructure.
Check The Freshness Of Your Model
Experts suggest monitoring the model for degradation in quality over time. If quality drops when the model goes a day without an update, then a round-the-clock engineering service is necessary. For instance, if the ML model behind Google Play Search is not updated, it can have a negative impact within a month.
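A freshness check can be as simple as the following sketch; the one-day staleness budget and the `model_is_stale` helper are hypothetical, and a real system would tie this to alerting:

```python
import datetime as dt

# Hypothetical freshness budget: how stale a serving model may get
# before quality degradation becomes unacceptable.
STALENESS_BUDGET = dt.timedelta(days=1)

def model_is_stale(trained_at: dt.datetime, now: dt.datetime) -> bool:
    """Return True when the model has exceeded its freshness budget."""
    return now - trained_at > STALENESS_BUDGET

trained_at = dt.datetime(2024, 1, 1, 0, 0)
print(model_is_stale(trained_at, dt.datetime(2024, 1, 1, 12, 0)))  # within budget
print(model_is_stale(trained_at, dt.datetime(2024, 1, 3, 0, 0)))   # past budget
```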
Don’t Export Models In A Hurry
If the model’s performance is not reasonable on held-out data, it should not reach customers, so it is important to run a sanity check before exporting and serving the model. Checking the area under the ROC curve before exporting is good practice.
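A pre-export sanity check along these lines might look like the sketch below; the 0.75 AUC threshold and the synthetic data are illustrative assumptions, not values from the text:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))
y = (X[:, 0] - X[:, 2] > 0).astype(int)
X_train, X_held_out, y_train, y_held_out = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
scores = model.predict_proba(X_held_out)[:, 1]
auc = roc_auc_score(y_held_out, scores)

# Refuse to export a model that is not clearly better than random.
MIN_AUC = 0.75  # hypothetical threshold; pick one that fits your task
if auc < MIN_AUC:
    raise RuntimeError(f"held-out AUC {auc:.3f} below {MIN_AUC}; not exporting")
print(f"held-out AUC {auc:.3f}; safe to export")
```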
Stick To Simple Metrics Initially
There are many metrics for evaluating a model’s performance, and with such abundance engineers can end up chasing their tails while choosing one. It is advisable to stick to something simple that satisfies the first objective.
The ML objective should be something that is easy to measure and is a proxy for the “true” objective.
Indirect effects make great metrics that can be used during A/B testing and during launch decisions.
Keep Models Interpretable
If predictions are interpretable, it becomes easier to debug. This is true for models that use objectives (zero-one loss, various hinge losses, and so on) that try to directly optimise classification accuracy or ranking performance.
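For example, with a linear model the learned coefficients can be read directly, which makes an odd prediction traceable to a specific feature; the feature names below are hypothetical:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
feature_names = ["age_bucket", "num_clicks", "days_since_visit"]  # hypothetical
X = rng.normal(size=(500, 3))
y = (2.0 * X[:, 1] - 1.0 * X[:, 2] > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Each coefficient says how a feature moves the log-odds, so a
# surprising prediction can be debugged feature by feature.
for name, coef in zip(feature_names, model.coef_[0]):
    print(f"{name:>18}: {coef:+.2f}")
```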
Launch Models Regularly
There are three basic reasons to launch new models:
- You are coming up with new features
- You are tuning regularisation and combining old features in new ways
- You are tuning the objective
It is essential to think about how easy it is to add, remove, or recombine features, and whether it is easy to create a fresh copy of the pipeline to verify its correctness. Launching models regularly in this way keeps their quality consistent.
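One way to make features cheap to add, remove, or recombine per launch is to keep the feature set in a versioned config; the `FEATURE_SETS` registry and the feature names below are invented for illustration:

```python
# Hypothetical feature registry: each launch version is just a list of
# feature names, so adding or removing a feature is a one-line change.
FEATURE_SETS = {
    "v1": ["query_length", "num_clicks"],
    "v2": ["query_length", "num_clicks", "days_since_visit"],  # new feature
}

def build_training_rows(raw_examples, version):
    """Project raw example dicts onto the feature set for one launch."""
    names = FEATURE_SETS[version]
    return [[example[name] for name in names] for example in raw_examples]

raw = [{"query_length": 3, "num_clicks": 7, "days_since_visit": 2}]
print(build_training_rows(raw, "v1"))  # old feature set
print(build_training_rows(raw, "v2"))  # new feature set, same raw data
```

Because both versions read the same raw data, a fresh copy of the pipeline can be verified by comparing its output rows against the current launch.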
Having Specific Features Is Good
With a plethora of data, it is simpler to learn millions of simple features than a few complex ones. For generalisation, it is better to have groups of features where each feature applies to a very small fraction of the data.
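One common way to get millions of very specific sparse features is feature hashing, sketched below; the query–document token strings are hypothetical examples of features that each fire on a tiny fraction of data:

```python
from sklearn.feature_extraction import FeatureHasher

# Sketch: millions of very specific sparse features (here, hypothetical
# query/document pair tokens) instead of a few dense hand-built ones.
hasher = FeatureHasher(n_features=2**20, input_type="string")
examples = [
    ["query=shoes^doc=sneaker_store", "lang=en"],
    ["query=shoes^doc=recipe_blog", "lang=en"],
]
X = hasher.transform(examples)

# Each example activates only a handful of the ~1M feature columns.
print(X.shape, X.nnz)
```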
Reuse Code Between Training And Serving
Experts at Google insist on reusing code between the training pipeline and the serving pipeline whenever possible, and they advise against using two different programming languages for training and serving. That decision makes it nearly impossible to share code.
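A minimal sketch of that advice: put the feature transform in a single function that both the training job and the serving binary import, so the two pipelines cannot drift apart (the `featurize` helper and its features are hypothetical):

```python
# Shared by the training pipeline and the serving pipeline, so the
# two code paths produce byte-identical feature rows.
def featurize(raw: dict) -> list[float]:
    """Shared training/serving feature transform (hypothetical features)."""
    return [
        float(raw["num_clicks"]),
        float(len(raw["query"])),
        1.0 if raw["is_returning_user"] else 0.0,
    ]

raw_example = {"num_clicks": 4, "query": "running shoes", "is_returning_user": True}
training_row = featurize(raw_example)  # used when writing training data
serving_row = featurize(raw_example)   # used at request time
assert training_row == serving_row
print(training_row)
```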
Keep Ensembles Simple
An ensemble of models is a “model” which combines the scores of other models to perform better.
To keep things simple, each model should either be an ensemble only taking the input of other models, or a base model taking many features, but not both.
If models on top of other models are trained separately, combining them can be inefficient.
Use a simple model for ensembling that takes only the outputs of the “base” models as inputs. It is also good if the incoming models are semantically interpretable, so that changes in the underlying models do not confuse the ensemble model.
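Such an ensemble might be sketched as follows: two base models see the raw features, while the ensemble model sees only their scores, each interpretable as “probability from model i” (the data and model choices are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(400, 6))
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)
X_train, X_cal = X[:300], X[300:]
y_train, y_cal = y[:300], y[300:]

# Base models each take the raw features...
base_models = [
    LogisticRegression().fit(X_train, y_train),
    DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train),
]

def base_scores(models, X):
    """Stack each base model's probability into one column per model."""
    return np.column_stack([m.predict_proba(X)[:, 1] for m in models])

# ...while the ensemble sees ONLY the base scores, never raw features.
ensemble = LogisticRegression().fit(base_scores(base_models, X_cal), y_cal)
acc = ensemble.score(base_scores(base_models, X_cal), y_cal)
print(f"ensemble accuracy: {acc:.3f}")
```

Because each ensemble input is a probability from one named base model, a shift in the ensemble’s behaviour can be traced back to the base model that changed.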
Most ML problems are, in fact, engineering problems, and the above tips are some of the many principles one can refer to while setting up ML pipelines from scratch.