Microsoft open-sources distributed ML library SynapseML

SynapseML runs on Apache Spark, provides a language-agnostic API abstraction over several datastores, and integrates with several existing ML technologies, including Open Neural Network Exchange (ONNX).

Microsoft has released SynapseML, an open-source library for creating and managing distributed ML pipelines, software engineer Mark Hamilton announced in a blog post.

SynapseML runs on Apache Spark and takes advantage of Spark’s management of large-scale, fault-tolerant compute clusters. The library has APIs for Python and Scala, and it can generate bindings for Java, R, and C#.


In addition, it includes the HTTP on Spark module, which lets users integrate web services into their pipelines efficiently, along with pre-built wrappers for invoking several such services, including Azure Cognitive Services.

To perform distributed inference on Spark using ONNX, developers can deploy pre-trained models from Microsoft’s ONNX Model Hub or convert models built in other frameworks such as TensorFlow or PyTorch. The Spark Serving module allows developers to expose their Spark pipelines as low-latency web services.

In the blog post, Hamilton said, “Our goal is to free developers from the hassle of worrying about the distributed implementation details and enable them to deploy them into a variety of databases, clusters, and languages without needing to change their code.”

SynapseML also includes tools for responsible AI, such as data balance analysis and model explainability. The library supports AutoML features, such as finding the best-performing model using hyperparameter search, and provides Spark-native implementations of several models, including an anomaly-detection model for cybersecurity; an isolation forest model, which performs nonlinear outlier detection; and a conditional k-nearest-neighbour model.
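Of the models listed above, the conditional k-nearest-neighbour model is perhaps the least self-explanatory. The idea, as generally understood, is that each query carries a set of allowed labels, and only points with one of those labels may be returned as neighbours. The sketch below illustrates that idea with a brute-force search in plain Python. It is not SynapseML's API or implementation; the function name `conditional_knn`, its parameters, and the toy data are all hypothetical and chosen only for illustration.

```python
import heapq
import math
from typing import Hashable, List, Sequence, Set, Tuple

Point = Sequence[float]


def conditional_knn(
    query: Point,
    points: Sequence[Point],
    labels: Sequence[Hashable],
    allowed_labels: Set[Hashable],
    k: int,
) -> List[Tuple[float, int]]:
    """Return up to k (distance, index) pairs, nearest first,
    considering only points whose label is in allowed_labels.

    Hypothetical helper: a brute-force scan, not a distributed
    or tree-based search.
    """
    candidates = (
        (math.dist(query, point), index)
        for index, (point, label) in enumerate(zip(points, labels))
        if label in allowed_labels  # the "conditional" part of the search
    )
    return heapq.nsmallest(k, candidates)


# Toy data (made up): four 2-D points, each tagged with a label.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
labels = ["painting", "sculpture", "painting", "sculpture"]
query = (0.9, 0.1)

# Nearest neighbours restricted to points labelled "painting".
nearest_paintings = conditional_knn(query, points, labels, {"painting"}, k=2)

# Same query with every label allowed, for comparison.
nearest_any = conditional_knn(query, points, labels, set(labels), k=1)

print([index for _, index in nearest_paintings])
print([index for _, index in nearest_any])
```

In this toy data the closest point overall is a sculpture, so restricting the query to paintings changes which neighbours come back. A distributed version would need an index structure rather than a full scan, which is outside the scope of this sketch.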

Poornima Nataraj
Poornima Nataraj has worked in the mainstream media as a journalist for 12 years and is always eager to learn anything new and evolving. Witnessing a revolution in the world of analytics, she thinks she is in the right place at the right time.
