Advances in machine learning and data science have given rise to new specialisations. These are often rooted in the same basic principles and overlap in function; MLOps and DevOps are one example. In this article, we discuss why the two are different and cannot be used interchangeably.
DevOps: Development + operations
DevOps has emerged as a mainstream strategy for major companies worldwide, including Accenture, Tech Mahindra, and Oracle. DevOps is the amalgamation of development and operations.
Many software teams spend more time on deployment than on developing new features. This is where DevOps comes in handy: it lowers the barrier between development and operations, thereby increasing development autonomy. It enables communication between software developers and IT operations for faster software delivery.
In some cases, quality assurance and security teams are also tightly integrated with development and operations teams. When security is the focus of the DevOps team, it is referred to as DevSecOps. The teams automate processes that were previously performed manually.
MLOps: Offshoot of DevOps
Recently, Andrew Ng spoke about how the machine learning community can leverage MLOps tools to make high-quality datasets and AI systems that are repeatable and systematic. He called for shifting the focus from model-centric to data-centric machine learning development. Andrew also said that, going forward, MLOps can play an important role in ensuring a high-quality and consistent flow of data throughout all the stages of the project.
Up to 90 percent of the total data that we have today was generated within the last few years. Big Data helps in developing actionable insights but at the same time poses a few challenges such as: acquisition and cleaning of large data; tracking and versioning for models; deployment of monitoring pipelines for production; scaling machine learning operations.
An MLOps approach can help in solving these challenges. MLOps is the union of DevOps, machine learning, and data engineering. Built on DevOps’ existing approach, MLOps solutions are developed to reduce waste, facilitate automation, and extract richer, more consistent insights in a machine learning project.
- MLOps helps in building an efficient machine learning strategy for a business by combining the business knowledge of an organisation’s operation team with the data science team’s expertise to drive maximum benefit.
- MLOps automates model development and deployment. This helps in faster release and lower operational costs, resulting in business agility and faster decision making.
- MLOps puts the operation team at the forefront of the regulatory process. This is important because insights gained from the data will hold no ground if one disregards the standard practices and the regulations.
- MLOps facilitates the collaboration between the operations and the data team to optimise labour division.
- The key phases of MLOps include data gathering, data analysis, data preparation and transformation, model training and development, model serving, model monitoring, and model retraining.
- MLOps allows experimentation with different settings.
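The key phases listed above can be pictured as an ordered chain of stages that pass a shared state along. The following is a minimal sketch only: the stage functions, the `run_pipeline` helper, and the toy "model" (a plain mean) are all hypothetical names invented for illustration, not part of any particular MLOps tool.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, object]
Stage = Tuple[str, Callable[[State], State]]


def gather(state: State) -> State:
    # Data gathering: hypothetical raw readings, one of them missing.
    state["raw"] = [1.0, 2.0, None, 4.0]
    return state


def prepare(state: State) -> State:
    # Data preparation: drop missing values.
    state["clean"] = [x for x in state["raw"] if x is not None]
    return state


def train(state: State) -> State:
    # Model training: the toy "model" is just the mean of the clean data.
    clean = state["clean"]
    state["model"] = sum(clean) / len(clean)
    return state


def serve(state: State) -> State:
    # Model serving: expose the model output as a prediction.
    state["prediction"] = state["model"]
    return state


PIPELINE: List[Stage] = [
    ("data gathering", gather),
    ("data preparation", prepare),
    ("model training", train),
    ("model serving", serve),
]


def run_pipeline(stages: List[Stage], state: Optional[State] = None) -> State:
    """Run each stage in order and record which stages completed."""
    state = {} if state is None else state
    completed = []
    for name, stage in stages:
        state = stage(state)
        completed.append(name)
    state["completed"] = completed
    return state
```

Monitoring and retraining would be added as further stages in the same list; keeping each phase as a separate function is what makes it possible to rerun or swap a single phase when experimenting with different settings.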
“I believe the hype is real. There is an increasing demand for people who have experience in model lifecycle management and model deployments etc. This is slightly different from the need for data scientists or data engineers; both are still required for full analytics capability in a team,” said Nikhil Dhawan, Director of Engineering, MLOps at Dentsu International in an earlier interview with Analytics India Magazine.
DevOps vs MLOps
MLOps is an offshoot of DevOps. In MLOps, the DevOps principles and workflows are applied to machine learning operations. It implements pipelines and automation for the smooth flow of training operations and the integration of final models into software products.
When it comes to testing, MLOps requires additional methods on top of what is done for DevOps and DevSecOps. It may require steps such as data validation, model validation, and model quality testing. When it comes to deployment, depending on the type of ML model, the developer needs to set up pipelines for ongoing data handling and training; this requires multi-step pipelines to handle retraining, verification, and redeployment.
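To make the extra testing steps concrete, here is a minimal sketch of what data validation and a model quality gate might look like. The function names, the required-field check, and the 0.8 accuracy threshold are assumptions chosen for illustration; real projects typically use dedicated validation tooling and metrics suited to their own models.

```python
from typing import Dict, List, Sequence


def validate_data(rows: List[Dict[str, object]], required_fields: Sequence[str]) -> List[str]:
    """Data validation: report every required field that is missing or None."""
    problems = []
    for i, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) is None:
                problems.append(f"row {i}: missing {field}")
    return problems


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Model validation: fraction of predictions that match the labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def passes_quality_gate(candidate_acc: float, baseline_acc: float, min_acc: float = 0.8) -> bool:
    """Model quality test: the candidate must clear an absolute floor
    (min_acc, an assumed threshold) and be no worse than the current model."""
    return candidate_acc >= min_acc and candidate_acc >= baseline_acc
```

In a pipeline, a non-empty result from `validate_data` or a failed quality gate would stop the run before redeployment, in the same way a failing unit test stops a conventional build.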
Another area where MLOps differs from DevOps is in how CI/CD pipelines are constructed. In MLOps, CI is extended beyond code to cover testing and validation of data and models; CD supports the deployment of training pipelines and of the final model prediction service. In addition, there is a third component, continuous training (CT), which handles automatic model retraining and refinement.
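One way to picture the CT component is as a check that runs on a schedule and decides whether automatic retraining should start. The sketch below assumes two hypothetical triggers, a shift in the mean of an input feature and a drop in live accuracy; the function names and the thresholds (0.2 and 0.8) are illustrative assumptions, not a standard.

```python
from statistics import mean
from typing import Sequence


def drift_score(reference: Sequence[float], recent: Sequence[float]) -> float:
    """Relative shift of the recent mean from the reference (training-time) mean."""
    ref_mean = mean(reference)
    if ref_mean == 0:
        raise ValueError("reference mean must be non-zero for a relative score")
    return abs(mean(recent) - ref_mean) / abs(ref_mean)


def should_retrain(
    reference: Sequence[float],
    recent: Sequence[float],
    live_accuracy: float,
    max_drift: float = 0.2,      # assumed tolerance for input drift
    min_accuracy: float = 0.8,   # assumed floor for production accuracy
) -> bool:
    """Trigger retraining when inputs have drifted or accuracy has degraded."""
    return drift_score(reference, recent) > max_drift or live_accuracy < min_accuracy
```

For example, `should_retrain([10, 10, 10], [13, 13, 13], live_accuracy=0.9)` signals a retrain because the feature mean has moved by roughly 30 percent even though accuracy still looks healthy. A production system would usually rely on more robust drift statistics than a simple mean comparison.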