
GitHub Open-Sources A Series Of GitHub Actions For Automating ML Workflow


GitHub recently announced that developers can now use GitHub Actions for machine learning operations (MLOps) and data science. The software development platform has created a series of GitHub Actions that integrate parts of the data science and machine learning workflow with a software development workflow.

MLOps is a practice in which data scientists and operations professionals collaborate to manage the production life cycle of machine learning or deep learning models, automating testing, lineage, versioning, and the tracking of historical information.

Because MLOps is still at a nascent stage, developers and data scientists often have to implement these tools from scratch or use disparate tools that are decoupled from their code, which leads to poor debugging and reproducibility. The series of GitHub Actions has been introduced to mitigate these issues.

A number of GitHub Actions are currently available for MLOps and data science. Some of them are listed below:

Orchestrating Machine Learning Pipelines:

  • Submit Argo Workflows – This action allows a developer to orchestrate machine learning pipelines that run on Kubernetes.
  • Publish Kubeflow Pipelines to GKE – Kubeflow Pipelines is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers. The goal of this action is to provide automated deployments of Kubeflow Pipelines on Google Cloud Platform (GCP).
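To make the idea of a pipeline that runs on Kubernetes more concrete, the sketch below builds a small Argo Workflow manifest as a Python dictionary and serialises it to JSON. It is an illustration only, not the action's own code: the `build_workflow` helper, the step names, the container image and the script names are all assumptions made for this example.

```python
import json


def build_workflow(name_prefix: str, image: str, steps: list[tuple[str, list[str]]]) -> dict:
    """Build an Argo Workflow manifest that runs the given steps one after another.

    Hypothetical helper for illustration; it is not part of Argo or of the GitHub Action.
    """
    # One container template per step.
    templates = [
        {"name": step_name, "container": {"image": image, "command": command}}
        for step_name, command in steps
    ]
    # In Argo, each inner list under "steps" is a group that runs in parallel.
    # Putting one step in each group makes the groups, and so the steps, run in order.
    pipeline = {
        "name": "pipeline",
        "steps": [[{"name": step_name, "template": step_name}] for step_name, _ in steps],
    }
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": name_prefix},
        "spec": {"entrypoint": "pipeline", "templates": [pipeline] + templates},
    }


# Assumed example values: a two-step pipeline that preprocesses data, then trains a model.
manifest = build_workflow(
    "ml-pipeline-",
    "python:3.11-slim",
    [("preprocess", ["python", "preprocess.py"]), ("train", ["python", "train.py"])],
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

A manifest of this shape would typically be committed to the repository next to the model code, so that a workflow run can submit it to the cluster whenever the code changes.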

Jupyter Notebooks:

  • Run Parameterised Notebooks – This GitHub Action runs a Jupyter notebook, parameterises it using papermill and lets a developer upload the produced output as an artifact using the upload artifact action.
  • Repo2Docker Action – This action helps to build a Jupyter-enabled Docker image from a GitHub repository and push the image to a Docker registry of choice.
  • fastpages – fastpages uses GitHub Actions to simplify the process of creating Jekyll blog posts on GitHub Pages from a variety of input formats. Its features include collapsible code cells that are either open or closed by default, automatic links to Colab and GitHub, built-in search, and the ability to create posts, including formatting and images, directly from Microsoft Word documents.
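The parameterisation step mentioned in the first bullet can be pictured with a short sketch. Papermill looks for a notebook cell tagged `parameters` and adds a new cell after it that overrides the default values. The code below imitates that idea with the standard library only. It does not call papermill, and the `inject_parameters` helper and the sample notebook are assumptions made for illustration.

```python
import copy


def inject_parameters(notebook: dict, params: dict) -> dict:
    """Return a copy of the notebook with a code cell that overrides the defaults.

    Hypothetical helper that mimics the idea behind papermill; not papermill's API.
    """
    nb = copy.deepcopy(notebook)
    source = "".join(f"{name} = {value!r}\n" for name, value in params.items())
    injected = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": ["injected-parameters"]},
        "outputs": [],
        "source": source,
    }
    # Insert right after the cell tagged "parameters", or at the top if none is tagged.
    position = 0
    for index, cell in enumerate(nb["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            position = index + 1
            break
    nb["cells"].insert(position, injected)
    return nb


# Assumed sample notebook: a tagged defaults cell followed by a cell that uses them.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "execution_count": None,
         "metadata": {"tags": ["parameters"]}, "outputs": [],
         "source": "learning_rate = 0.1\nepochs = 1\n"},
        {"cell_type": "code", "execution_count": None,
         "metadata": {}, "outputs": [],
         "source": "print(learning_rate, epochs)\n"},
    ],
}

parameterised = inject_parameters(notebook, {"learning_rate": 0.01, "epochs": 5})
print(parameterised["cells"][1]["source"])
```

Because the injected cell runs after the defaults, the same notebook can be executed with different settings on every workflow run without editing it by hand.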

End-To-End Workflow Orchestration:

  • Examples and templates for using Azure Machine Learning from GitHub Actions. The templates show how GitHub Actions can be combined with Azure Machine Learning to manage a machine learning project with automated training and deployment.

Experiment Tracking:

  • Fetch runs from Weights & Biases – Weights & Biases is an experiment tracking and logging system for machine learning that is free for open-source projects.

In a blog post, Hamel Husain, a machine learning engineer at GitHub, illustrated how developers and data scientists can orchestrate a machine learning pipeline to run on their infrastructure, and how an experiment tracking system can be integrated with GitHub Actions to enable MLOps.

Wrapping Up

In November last year, GitHub announced the launch of GitHub Actions and GitHub Packages, which make it easier for developers to automate their software workflows. Following this series of GitHub Actions, the platform also announced the GitHub Super Linter.

The Super Linter is a source code repository that is packaged into a Docker container and called by GitHub Actions. It aims to keep documentation and code consistent while making communication and collaboration among developers more productive.

The features of Super Linter include:

  • Super Linter prevents broken code from being uploaded to master branches.
  • It helps establish coding best practices across multiple programming languages.
  • It builds guidelines for code layout and format.
  • It automates checks to help streamline code reviews.
  • With the basic criteria set by this repo, developers can ship better, cleaner and more stable code internally as well as to customers and partners.
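The first point works as a gate: when a check reports problems, the step exits with a non-zero status, the workflow run fails, and the change can be blocked from merging. The sketch below shows that gating idea with a single syntax check for Python files. It is not how Super Linter is implemented, since Super Linter runs a collection of established linters. The function names and sample sources are assumptions made for illustration.

```python
import ast


def find_syntax_errors(sources: dict[str, str]) -> list[str]:
    """Return one message per file whose Python source fails to parse.

    Hypothetical stand-in for a real linter; it only checks syntax.
    """
    problems = []
    for filename, text in sorted(sources.items()):
        try:
            ast.parse(text, filename=filename)
        except SyntaxError as error:
            problems.append(f"{filename}:{error.lineno}: {error.msg}")
    return problems


def exit_code(problems: list[str]) -> int:
    """Map lint results to a process exit status: non-zero fails the CI step."""
    return 1 if problems else 0


# Assumed sample input: one valid file and one with a syntax error on its first line.
sources = {
    "good.py": "x = 1\n",
    "bad.py": "def broken(:\n    pass\n",
}
problems = find_syntax_errors(sources)
for problem in problems:
    print(problem)
status = exit_code(problems)
print("exit status:", status)
```

In a real workflow, the script would end with `sys.exit(status)` so that the runner sees the failure.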

Ambika Choudhury

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.