MLOps is a set of practices for deploying and maintaining machine learning models in production efficiently and reliably. While the data science team has a deep understanding of the data, the operations team holds the business acumen. MLOps combines the expertise of both teams, drawing on data and operations skill sets to make ML work more efficient.
According to the Algorithmia report, nearly 22 percent of companies have had ML models in production for one to two years.
With practice, MLOps professionals can sharpen their skills and build a solid pipeline for developing machine learning models. This article lists projects across tools and services that will help you kickstart your MLOps journey.
Developed by Goku Mohandas, ‘Made With ML’ is a project-based course on machine learning and MLOps fundamentals, focusing on intuition and application that teaches you how to apply machine learning across industries.
According to the developer (Hamza Tahir), it is supposed to be fast, easy and developer-friendly. However, it is by no means meant to be used in a full-fledged production-ready setup. Instead, it is simply a means to get a server up and running as fast as possible with the lowest costs possible.
In this project, you will learn how to deploy an ML inference service on a budget in less than ten lines of code. It is perfect for practitioners who want to deploy their models to an endpoint faster and not waste a lot of time, money, and effort trying to figure out end-to-end deployment.
The source code, alongside key features of this project, is available on GitHub.
Great Expectations helps data teams eliminate pipeline debt through data testing, documentation, and profiling. It provides a flexible, declarative syntax for describing the expected shape of data.
When used in exploration and development, Great Expectations provides an excellent medium for communication, surfacing and documenting latent knowledge about the shape, format, and content of data. In production, it is a powerful tool for testing.
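To make the idea of declaring the expected shape of data concrete, here is a minimal sketch in plain Python. It does not use the Great Expectations API; the `Expectation` class, the `expect_*` helpers, the `validate` function, and the toy passenger records are all hypothetical names and values made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Set

Row = Dict[str, Any]


@dataclass(frozen=True)
class Expectation:
    """A named, declarative rule about one column (hypothetical helper)."""
    column: str
    description: str
    check: Callable[[Any], bool]


def expect_not_null(column: str) -> Expectation:
    return Expectation(column, "values are not null", lambda v: v is not None)


def expect_between(column: str, low: float, high: float) -> Expectation:
    return Expectation(
        column,
        f"values are between {low} and {high}",
        lambda v: v is not None and low <= v <= high,
    )


def expect_in_set(column: str, allowed: Set[str]) -> Expectation:
    return Expectation(
        column, f"values are in {sorted(allowed)}", lambda v: v in allowed
    )


def validate(rows: List[Row], suite: List[Expectation]) -> List[Dict[str, Any]]:
    """Run every expectation over every row and record the failing row indices."""
    results = []
    for exp in suite:
        failures = [
            i for i, row in enumerate(rows) if not exp.check(row.get(exp.column))
        ]
        results.append({
            "column": exp.column,
            "expectation": exp.description,
            "success": not failures,
            "failing_rows": failures,
        })
    return results


# Toy records with made-up values, for illustration only.
rows = [
    {"age": 22, "sex": "male", "fare": 7.25},
    {"age": 38, "sex": "female", "fare": 71.28},
    {"age": None, "sex": "male", "fare": 8.05},
    {"age": 140, "sex": "unknown", "fare": 13.0},
]

# The suite reads like documentation of what the data should look like.
suite = [
    expect_not_null("age"),
    expect_between("age", 0, 120),
    expect_in_set("sex", {"male", "female"}),
    expect_between("fare", 0, 600),
]

report = validate(rows, suite)
for result in report:
    status = "PASS" if result["success"] else "FAIL"
    print(f"{status}  {result['column']}: {result['expectation']} {result['failing_rows']}")
```

The same suite can serve both purposes mentioned above: during exploration it documents what the team believes about the data, and in production it acts as a test that flags rows violating those beliefs.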
Check out the GitHub repository here.
Lime supports explaining individual predictions for text classifiers or classifiers that act on tables (NumPy arrays of numerical or categorical data) or images.
It is based on the work presented in the ‘Why Should I Trust You?: Explaining the Predictions of Any Classifier’ paper.
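The core idea of explaining an individual prediction can be sketched without the library: perturb the instance, query the model on the perturbed points, weight each point by its closeness to the original, and fit a simple linear model whose coefficients act as the explanation. The following is a simplified illustration of that idea for two numerical features, not Lime's actual implementation or API. The `black_box` model, the function names, and all parameter values are assumptions chosen for the example.

```python
import math
import random
from typing import Callable, List, Sequence


def black_box(x: Sequence[float]) -> float:
    """Stand-in classifier returning a positive-class probability (hypothetical)."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x[0] - 0.5 * x[1])))


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def explain_locally(
    predict: Callable[[Sequence[float]], float],
    instance: Sequence[float],
    n_samples: int = 500,
    scale: float = 0.3,
    kernel_width: float = 0.5,
    seed: int = 0,
) -> List[float]:
    """Fit a proximity-weighted linear surrogate around one instance.

    Returns one coefficient per feature; the intercept is dropped.
    """
    rng = random.Random(seed)
    dim = len(instance)
    # Accumulate the weighted normal equations (X^T W X) beta = X^T W y.
    xtwx = [[0.0] * (dim + 1) for _ in range(dim + 1)]
    xtwy = [0.0] * (dim + 1)
    for _ in range(n_samples):
        sample = [v + rng.gauss(0.0, scale) for v in instance]
        offsets = [s - v for s, v in zip(sample, instance)]
        dist_sq = sum(o * o for o in offsets)
        weight = math.exp(-dist_sq / kernel_width ** 2)  # closer points count more
        features = [1.0] + offsets  # leading 1.0 is the intercept term
        target = predict(sample)
        for i in range(dim + 1):
            xtwy[i] += weight * features[i] * target
            for j in range(dim + 1):
                xtwx[i][j] += weight * features[i] * features[j]
    beta = solve(xtwx, xtwy)
    return beta[1:]


feature_names = ["feature_a", "feature_b"]
instance = [0.2, 1.0]
local_weights = explain_locally(black_box, instance)
for name, w in sorted(zip(feature_names, local_weights), key=lambda p: -abs(p[1])):
    print(f"{name}: {w:+.3f}")
```

A positive coefficient means that increasing the feature pushes this particular prediction towards the positive class, and a larger magnitude means a stronger local influence. The explanation is only valid near the chosen instance, which is why the surrogate is fitted on nearby perturbations rather than on the whole dataset.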
Check out the GitHub repository here.
ML automation workflow contains a Python-based machine learning project to demonstrate the archetypal ML workflow within a Jupyter notebook, alongside some proof-of-concept ideas on automating key steps, using the Titanic binary classification dataset hosted on Kaggle.
The secondary aim of this ‘project’ is to show how the model generated as a ‘build artefact’ of the modelling notebook can be automatically deployed as a managed RESTful prediction service on Kubernetes without writing any custom code.
Check out the GitHub repository of this project here.
It is a generic template for building end-to-end machine learning projects. It offers a logical, reasonably standardised, but flexible project structure for doing and sharing machine learning work.
The source code is available on GitHub.
It is a collection of community projects to build new components, examples, libraries, and tools for TFX (TensorFlow Extended). The projects are organised under a special interest group, called SIG TFX-Addons. The group focuses on:
- Driving the development of high-quality ‘custom pipeline components,’ including container-based components, Python function-based components, etc.
- Shaping a standardised set of descriptive metadata for community-contributed components to enable easy understanding, comparison, and sharing of components during discovery.
- Enabling the development of templates, libraries, visualisations, and other useful additions to TFX.
Check out TFX-Addons projects here.
Amazon SageMaker Examples demonstrate how to build, train, and deploy machine learning models using Amazon SageMaker. ‘Amazon SageMaker’ is a fully managed service for data science and ML workflows. In this project, you will learn to quickly set up and run notebooks.
The project comprises:
- Introduction to Ground Truth labeling jobs
- Introduction to applying machine learning
- SageMaker automatic model tuning (XGBoost Tuning, TensorFlow Tuning, MXNet Tuning, etc.)
- Introduction to Amazon algorithms (k-means, Factorization Machines, Latent Dirichlet Allocation (LDA), Linear Learner, Image Classification, etc.)
- Amazon SageMaker reinforcement learning (RL) (Cartpole using Coach, AWS DeepRacer, Knapsack Problem, etc.)
- Scientific details of algorithms (Streaming Median, LDA, Linear Learner features)
- Amazon SageMaker debugger
- Advanced Amazon SageMaker functionality (data distribution types, encrypting your data, connecting to Redshift, etc.)
- Amazon SageMaker Neo compilation jobs (GluonCV SSD Mobilenet, MNIST with MXNet, etc.)
- Amazon SageMaker Processing (Scikit-Learn data processing and model evaluation, feature transformation with Amazon SageMaker Processing and SparkML, etc.)
- Amazon SageMaker pre-built framework containers and the Python SDK
- Using Amazon SageMaker with Apache Spark
- AWS Marketplace (create algorithm/model package for listing in Marketplace for machine learning)
Check out more MLOps open source projects here.
Amit Raja Naik is a senior writer at Analytics India Magazine, where he dives deep into the latest technology innovations. He is also a professional bass player.