AutoML systems have grown and advanced rapidly over the last few years. AutoML aims to automate the development lifecycle for enterprise AI and ML applications, and it already lets a data scientist automate the selection and optimisation of ML models, but it has limitations. The next version, AutoML 2.0, plans to automate the most complicated and time-consuming part of the enterprise AI development lifecycle: feature engineering, which typically takes months using traditional methods.
Earlier AutoML platforms were mostly about automating the machine learning part of data science. One of the most challenging parts of traditional data science, however, is feature engineering, which involves a lot of manual work. Feature engineering consists of connecting data sources and building a feature table: a set of diverse features that will be evaluated against multiple machine learning algorithms. It demands high domain expertise because it requires ideating new features, and it involves a lot of iteration as features are evaluated and then rejected or chosen. Platforms with automated feature engineering capabilities can now create feature tables automatically from relational data sources and flat files. For data science, this ability to generate features automatically is a significant change.
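To make the idea of a feature table concrete, here is a minimal sketch in plain Python, not the method of any particular AutoML 2.0 platform. It assumes a hypothetical relational source with one `customers` table and one `transactions` table, and it generates one feature per combination of numeric column and aggregation. The table names, column names, and the `build_feature_table` helper are all illustrative assumptions.

```python
from statistics import mean

# Hypothetical relational data: one row per customer, many transactions per customer.
customers = [
    {"customer_id": 1, "segment": "retail"},
    {"customer_id": 2, "segment": "business"},
]
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 500.0},
]

# Candidate aggregations applied to every numeric column of the child table.
AGGREGATIONS = {"sum": sum, "mean": mean, "max": max, "count": len}


def build_feature_table(parents, children, key, numeric_columns):
    """Return one row per parent, with one feature per (column, aggregation) pair."""
    # Group child rows by the join key so each parent can look up its own rows.
    grouped = {}
    for row in children:
        grouped.setdefault(row[key], []).append(row)

    feature_table = []
    for parent in parents:
        features = dict(parent)
        rows = grouped.get(parent[key], [])
        for column in numeric_columns:
            values = [r[column] for r in rows]
            for name, func in AGGREGATIONS.items():
                # Parents with no child rows get None rather than a made-up value.
                features[f"{column}_{name}"] = func(values) if values else None
        feature_table.append(features)
    return feature_table


feature_table = build_feature_table(customers, transactions, "customer_id", ["amount"])
for row in feature_table:
    print(row)
```

A real platform would presumably cover many more tables, column types, and transformations, but the shape is the same: candidate features are enumerated mechanically instead of being thought up one at a time.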
Beyond automation, AutoML 2.0 will also give BI analysts, data engineers and others in an organisation with deep domain knowledge the opportunity to contribute to the development of ML and AI models. With feature engineering automated, BI teams can develop sophisticated algorithms in a matter of days.
Anyone who thinks AutoML 2.0 is going to replace data scientists would be wrong. Its purpose is to enhance the data scientist’s productivity. Feature engineering is seen as one of the most significant hurdles data scientists face, and automation can only help accelerate the process and open the field to other departments in an organisation.
AI-based feature engineering in AutoML 2.0 platforms allows data scientists to discover features they would not have found on their own. It automatically builds, evaluates, and exposes features by combining data from different columns across various tables and sources. This lets data scientists explore the so-called ‘unknown unknowns’: features they do not normally delve into, either because they do not have the time or because they lack the domain expertise.
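The build-and-evaluate loop described above can be sketched in a few lines. This is an illustration under stated assumptions rather than how any specific product works: the data is hypothetical, candidates are limited to products and ratios of column pairs, and each candidate is scored by its absolute Pearson correlation with the target, which is a deliberately simple stand-in for whatever evaluation a real platform uses.

```python
from itertools import combinations
from math import sqrt

# Hypothetical loan data; "defaulted" is the target the features should help predict.
rows = [
    {"income": 40.0, "debt": 10.0, "age": 30.0, "defaulted": 0},
    {"income": 50.0, "debt": 40.0, "age": 45.0, "defaulted": 1},
    {"income": 80.0, "debt": 16.0, "age": 38.0, "defaulted": 0},
    {"income": 30.0, "debt": 27.0, "age": 52.0, "defaulted": 1},
]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def candidate_features(rows, columns):
    """Build the raw columns plus a product and both ratios for every column pair."""
    candidates = {c: [r[c] for r in rows] for c in columns}
    for a, b in combinations(columns, 2):
        candidates[f"{a}_times_{b}"] = [r[a] * r[b] for r in rows]
        # Skip a ratio if its denominator is zero anywhere in the data.
        if all(r[b] != 0 for r in rows):
            candidates[f"{a}_over_{b}"] = [r[a] / r[b] for r in rows]
        if all(r[a] != 0 for r in rows):
            candidates[f"{b}_over_{a}"] = [r[b] / r[a] for r in rows]
    return candidates


def rank_features(rows, columns, target):
    """Score every candidate against the target and return them best first."""
    ys = [r[target] for r in rows]
    scores = {
        name: abs(pearson(values, ys))
        for name, values in candidate_features(rows, columns).items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_features(rows, ["income", "debt", "age"], "defaulted")
for name, score in ranking[:3]:
    print(f"{name}: {score:.3f}")
```

On this toy data the derived ratio `debt_over_income` ranks above every raw column, which is the kind of combined feature a time-pressed analyst might never try by hand. With only four rows the scores mean little statistically; the point is the mechanics of generating and ranking candidates.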
Apart from enhancing data scientists’ productivity by automating feature engineering and helping uncover unknown unknowns, AutoML 2.0’s biggest selling point is democratisation. When a platform can accelerate and automate the discovery and creation of features, it opens the data science process to a larger and more diverse group of contributors. Automated feature engineering helps citizen data scientists build useful, optimised use cases. These citizen data scientists have high domain expertise, and with automation they can focus on high-priority use cases with very little help from data science teams. Amid economic uncertainty, automation might also help organisations avoid hiring data scientists unnecessarily.
AutoML 2.0 has a two-fold advantage in democratisation and automation, and it uses AI/ML capabilities to achieve both. While the concept might sound as if it spells doom for data scientists, it is best understood as a tool to enhance their productivity. AutoML 2.0 not only enhances data scientists’ work, but also gives them the ability to scale it.