CRM Software Giant Salesforce Joins Homegrown AutoML Market With Its TransmogrifAI

Open sourcing machine learning tools is the norm in the tech world. Salesforce is the latest tech firm to open source its machine learning software: TransmogrifAI, a library that helps build machine learning systems at enterprise scale. Shubha Nabar, senior director of Data Science at Salesforce Einstein, wrote in a post that the diversity of data and use cases at enterprise companies makes machine learning for enterprise products a big challenge.

In other words, every use case calls for customer-specific ML models. It isn’t, however, feasible to build and deploy by hand thousands of personalised ML models, each trained on an individual customer’s data, for every single use case. TransmogrifAI is the same library, built on Scala and SparkML, that powers the Einstein AI platform.

At a time when IT giants are rushing to reshape enterprise ML with homegrown AutoML libraries, how will Salesforce’s TransmogrifAI change the machine learning landscape? Automated ML solutions usually automate some or all of the steps of the ML process: data preprocessing or cleansing, feature engineering, feature extraction, feature selection, and hyperparameter optimisation or algorithm selection.
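
To make the last of those steps concrete, here is a minimal sketch of automated algorithm selection and hyperparameter optimisation in plain Python. It is not taken from TransmogrifAI or any other AutoML product; the function names (`fit_mean`, `fit_knn`, `select_model`), the toy data and the search space are all assumptions chosen for illustration. The idea is simply to try every candidate and keep the one with the lowest error on held-out data.

```python
from statistics import mean

def fit_mean(train, _param):
    """Baseline: always predict the mean training target."""
    avg = mean(y for _, y in train)
    return lambda x: avg

def fit_knn(train, k):
    """k-nearest-neighbours regressor on a single numeric feature."""
    def predict(x):
        nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict

def mse(model, rows):
    """Mean squared error of a fitted model on (x, y) rows."""
    return mean((model(x) - y) ** 2 for x, y in rows)

def select_model(train, valid, search_space):
    """Try every (algorithm, hyperparameter) pair; keep the lowest validation error."""
    results = []
    for name, (fit, params) in search_space.items():
        for param in params:
            model = fit(train, param)
            results.append((mse(model, valid), name, param))
    return min(results)

# Hypothetical toy data (y = 2x), split alternately into train and validation.
data = [(float(x), 2.0 * x) for x in range(20)]
train, valid = data[::2], data[1::2]

# Hypothetical search space: algorithm name -> (fit function, hyperparameter grid).
search_space = {
    "mean": (fit_mean, [None]),
    "knn": (fit_knn, [1, 2, 4]),
}

best_error, best_name, best_param = select_model(train, valid, search_space)
print(best_name, best_param, best_error)
```

Real AutoML libraries replace the exhaustive loop with cross-validation and smarter search strategies, but the contract is the same: the data scientist supplies data and a search space, and the tool returns the winning model.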

Salesforce is not the first tech giant to release an AutoML tool. Google has the first-mover advantage, having delivered on its promise with Google Cloud AutoML, and IBM offers automated ML features such as the auto classifier in SPSS, one of the most widely used analytics tools in the market. Other AutoML tools include Auto-WEKA for automatic model selection and hyperparameter optimisation and OptiML for automatic model optimisation. But Salesforce takes the lead in the end-to-end automation of the ML process.

Let Us Encapsulate The High Points Of This Tool:

Image Source: Salesforce
  • TransmogrifAI is an AutoML library for building modular ML workflows on Spark that require minimal hand tuning.
  • Written in Scala and running on top of Apache Spark, TransmogrifAI is an automated ML library that simplifies model selection and training for structured data. As Nabar puts it, most AutoML solutions today are focused on narrow tasks, or are built for unstructured data such as voice, image and text.
  • TransmogrifAI builds models at scale for structured, heterogeneous data and is billed as performing the key steps of the ML process (data cleansing, feature selection and model training) in three lines of code.
  • In a few lines of code, a data scientist can automate key tasks like data cleansing, feature engineering, and model selection to arrive at a baseline model that can be iterated on further.
  • According to Mayukh Bhaowal from Salesforce, since 90 percent of the time spent building models goes into creating the perfect numeric matrix of features to feed into the chosen algorithm, data scientists are required to reinvent the wheel every time. With this tool, data scientists can automatically engineer features based on the type of feature, data distribution and association with the response variable.
  • Another key feature of the tool is model explainability, which addresses the black box issue associated with ML. Nabar emphasises that from a trust and data point of view, this model isn’t a black box.
  • Salesforce has pitched this as a collaborative effort in building large-scale customer-specific ML models. The launch of the Spark-based ML framework came a day after Oracle open sourced GraphPipe, a tool for deploying ML models built on frameworks like Google’s TensorFlow and Facebook’s Caffe2.
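
The type-driven feature engineering described in the list above can be illustrated with a short sketch. This is plain Python rather than TransmogrifAI’s Scala API, and every name in it (`vectorize`, `select_features`, the `min_abs_corr` threshold, the sample records) is a hypothetical stand-in. It assumes two simple rules: numeric columns are mean-imputed and get a missing-value indicator, categorical columns are one-hot encoded, and any resulting feature with a weak linear association with the response is dropped.

```python
from statistics import mean

def vectorize(rows, columns):
    """Turn dict records into numeric feature columns based on each column's inferred type."""
    features = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        if present and all(isinstance(v, (int, float)) for v in present):
            # Numeric: impute missing values with the mean and track missingness.
            fill = mean(present)
            features[col] = [float(fill if v is None else v) for v in values]
            features[col + "_is_null"] = [1.0 if v is None else 0.0 for v in values]
        else:
            # Categorical: one indicator column per observed level.
            for level in sorted(set(present)):
                features[f"{col}={level}"] = [1.0 if v == level else 0.0 for v in values]
    return features

def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either side has zero variance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def select_features(features, label, min_abs_corr=0.1):
    """Keep only features whose association with the response clears the threshold."""
    return {name: col for name, col in features.items()
            if abs(correlation(col, label)) >= min_abs_corr}

# Hypothetical customer records and a churn label, for illustration only.
rows = [
    {"age": 22, "plan": "free", "region": "eu"},
    {"age": None, "plan": "paid", "region": "eu"},
    {"age": 41, "plan": "paid", "region": "eu"},
    {"age": 35, "plan": "free", "region": "eu"},
]
churned = [1.0, 0.0, 0.0, 1.0]

features = vectorize(rows, ["age", "plan", "region"])
selected = select_features(features, churned)
print(sorted(selected))
```

Here the constant `region=eu` column carries no signal and is removed automatically, which is the kind of repetitive decision the article says data scientists otherwise make by hand for every model.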

AutoML, The Next Step In Democratising ML

Of late, the success of deep learning in tasks like image recognition and speech recognition has been largely due to the automation of the feature engineering process: hierarchical feature extractors are learned from data rather than designed manually. Researchers from the Bosch Centre for AI point out in their paper that automating architecture engineering is a logical next step in automating ML.

This is why we see top companies like Google, Amazon, Microsoft and Salesforce open sourcing tools that let data scientists deploy models with a minimum of hand-tuning and a shorter turnaround time. While Google leads in democratising AI with tools that enable developers to build AI at scale, Salesforce tackles the structured data challenge, which covers a range of use cases where organisations need vast amounts of data to predict sales, conversions and customer churn.

Advantages Of Automated ML Tools

While AutoML solutions can improve business outcomes significantly, cut turnaround time sharply and also improve accuracy, there are certain disadvantages as well. The market for automated ML tools is growing and will grow stronger, but on Kaggle, human competitors still outperform the results generated by AutoML tools. Increasingly, many data scientists are also relying on AutoML tools to optimise model performance.


Richa Bhatia
Richa Bhatia is a seasoned journalist with six years’ experience in reportage and news coverage and has had stints at Times of India and The Indian Express. She is an avid reader, mum to a feisty two-year-old and loves writing about the next-gen technology that is shaping our world.
