“DevOps is not a product that you can buy and install.” – Pulkit Agarwal, GitHub
After a productive and informative Day 1, ADasSci’s Deep Learning Developers Conference is live again. Day 2 of DLDC 2020, too, had an interesting lineup of speakers, along with a full-day workshop on deep learning with Keras. In an hour-long talk, speakers Pulkit Agarwal and Vinod Joshi of GitHub discussed the various challenges of setting up an ML pipeline.
Pulkit, who is part of the product team at GitHub, began by defining what MLOps is really about and why it remains challenging even though most organisations have already figured out how to work with DevOps.
MLOps comes with the additional challenge of automating the machine learning lifecycle. Usually, most of the emphasis is placed on models, but Pulkit likened model building to a small cog in a much larger wheel. For instance, local machines are rarely sufficient for training at scale; VMs or Spark clusters become essential. Pulkit listed four key challenges one might face while setting up an ML pipeline:
- Collaboration on code
- Remote training
- Model bookkeeping
- Managing data, code, and updates
Model bookkeeping, for example, can cost a project dearly. Developers can lose track of file versions, and deployment descends into chaos. In other cases, someone may simply not know how to write a controller file. Organisations might run into these trivial-sounding yet serious problems sooner or later if attention is not paid to the details.
So how does GitHub get MLOps right? Although Pulkit admits that “easy” in MLOps is a very ambitious goal, the team at GitHub tries their best by incorporating three important components:

- ML-optimised compute
- Source control
- ML-aware CI/CD
For example, the job of the ML-aware CI/CD component is to alert the system whenever the code or other artefacts change. While the first half of the talk covered how GitHub made MLOps easy-ish, the second half, helmed by Vinod Joshi, was about how these principles were put to use in building models to increase developer productivity. Vinod elaborated on the various aspects of the ML lifecycle and the importance of building and rebuilding models whenever there is a change in the data distribution.
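The talk did not go into implementation details, but the idea of triggering a rebuild when the data distribution shifts can be sketched with a simple drift check. The two-sample Kolmogorov–Smirnov statistic and the 0.2 threshold below are illustrative choices on our part, not GitHub's actual pipeline:

```python
# Sketch: detect drift between a reference (training-time) feature sample
# and incoming production data, and flag when a retrain is warranted.
# The KS test and threshold are illustrative, not GitHub's actual method.

def ks_statistic(sample_a, sample_b):
    """Max absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n_a, n_b = len(a), len(b)
    stat = 0.0
    i = j = 0
    for v in sorted(set(a) | set(b)):
        while i < n_a and a[i] <= v:  # count of a-values <= v
            i += 1
        while j < n_b and b[j] <= v:  # count of b-values <= v
            j += 1
        stat = max(stat, abs(i / n_a - j / n_b))
    return stat

def needs_retraining(reference, incoming, threshold=0.2):
    """Flag a rebuild when the feature distribution drifts past the threshold."""
    return ks_statistic(reference, incoming) > threshold
```

For example, comparing a training sample against a copy of itself yields a statistic of zero (no retrain), while comparing it against a shifted version of the same data pushes the statistic well past the threshold.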
Vinod continued by dissecting a use case where he and his team had worked on a model that tracks developers’ coding time. The whole process can be viewed through the lens of a Markov process, where coding and non-coding are the states, and commits are the observations. Since those states cannot be observed directly (only the commits can), the setup naturally becomes a hidden Markov model.
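The two-state setup described above can be sketched as follows. All of the probabilities here are made-up illustrative values, not GitHub's, and Viterbi decoding is one standard way to recover the most likely hidden state sequence from a series of commit observations:

```python
# Sketch of the two-state HMM from the talk: hidden states are "coding"
# and "non-coding"; observations are whether a commit occurred in a given
# time window. All probabilities are illustrative, not GitHub's.

STATES = ("coding", "non-coding")
START = {"coding": 0.5, "non-coding": 0.5}
TRANS = {  # P(next state | current state)
    "coding": {"coding": 0.8, "non-coding": 0.2},
    "non-coding": {"coding": 0.3, "non-coding": 0.7},
}
EMIT = {  # P(observation | state)
    "coding": {"commit": 0.6, "no-commit": 0.4},
    "non-coding": {"commit": 0.05, "no-commit": 0.95},
}

def viterbi(observations):
    """Most likely hidden state sequence for a series of observations."""
    # prob[s]: probability of the best path ending in state s so far
    prob = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    paths = {s: [s] for s in STATES}
    for obs in observations[1:]:
        new_prob, new_paths = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: prob[p] * TRANS[p][s])
            new_prob[s] = prob[best_prev] * TRANS[best_prev][s] * EMIT[s][obs]
            new_paths[s] = paths[best_prev] + [s]
        prob, paths = new_prob, new_paths
    return paths[max(STATES, key=lambda s: prob[s])]

# A burst of commits followed by silence decodes to coding, then non-coding:
viterbi(["commit", "commit", "no-commit", "no-commit", "no-commit"])
# → ["coding", "coding", "non-coding", "non-coding", "non-coding"]
```

In practice the observation model would be richer (commit sizes, inter-commit intervals), but the decoding principle is the same.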
So what are the implications of such an experiment?
The notion here was to identify patterns linking commit intervals and productivity. In large organisations, continued Vinod, developers don’t get enough time to code because of other activities such as meetings. For a developer, time spent coding reflects job satisfaction, so insights from this model can have multiple applications within the organisation. Another key application is identifying the right time for code review: if a developer’s coding time is tracked and their peak working hours are identified, code reviews can be scheduled within the windows when they are most productive.
Team GitHub, in this talk, gave us a glimpse of what it takes to make MLOps easy for an organisation. According to Pulkit, MLOps, like DevOps, is not just software but a value: a union of people and processes. DevOps is not a product that one buys and installs. This sums up the ethos that underlies GitHub’s success with ML.