AWS Announces Amazon Redshift ML, A Cloud-based Service For Data Scientists To Use ML Technologies

Recently, at the AWS re:Invent event, the e-commerce giant announced the launch of Amazon Redshift Machine Learning (Amazon Redshift ML). According to its developers, data scientists can now use Amazon Redshift ML to create, train, and deploy machine learning models in Amazon Redshift using SQL.

Amazon Redshift is one of the most widely used cloud data warehouses, where one can query and combine exabytes of structured and semi-structured data across a data warehouse, operational database, and data lake using standard SQL. The cloud data warehouse is well-known for its intuitive features, such as efficient storage, scalability, high-performance query processing, result caching and more.

Technology Behind Redshift ML

Amazon Redshift ML is powered by Amazon SageMaker, a fully managed ML service. With it, one can use SQL statements to create and train machine learning models from data in Amazon Redshift. The models can then be used for applications such as churn prediction and fraud risk scoring, among others.
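
To make the idea of "SQL statements that create and train a model" concrete, here is a minimal sketch in Python that assembles such a statement as a string. The clause layout follows Redshift's CREATE MODEL statement as documented by AWS and should be checked against the current reference; the table, columns, function name, IAM role ARN and bucket name are hypothetical placeholders, not values from the announcement.

```python
def build_create_model(model, source_query, target, function, iam_role, s3_bucket):
    """Return a CREATE MODEL statement as one SQL string.

    Plain string formatting is used for readability, so every argument
    is assumed to come from a trusted source.
    """
    return (
        f"CREATE MODEL {model}\n"
        f"FROM ({source_query})\n"           # training data selected with ordinary SQL
        f"TARGET {target}\n"                 # the column the model learns to predict
        f"FUNCTION {function}\n"             # name of the SQL function created after training
        f"IAM_ROLE '{iam_role}'\n"
        f"SETTINGS (S3_BUCKET '{s3_bucket}');"  # bucket used to stage the training data
    )


# All names below are hypothetical and only illustrate the shape of the call.
sql = build_create_model(
    model="customer_churn",
    source_query="SELECT age, plan, monthly_minutes, churned FROM customer_activity",
    target="churned",
    function="predict_churn",
    iam_role="arn:aws:iam::123456789012:role/RedshiftMLRole",
    s3_bucket="my-redshift-ml-staging",
)
print(sql)
```

The resulting string could be sent to the cluster with any SQL client; nothing here connects to AWS.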

As per a blog post, this release adds support for supervised learning techniques, which are the most commonly used in enterprises for advanced analytics. It allows users to work with their data in Redshift without requiring in-depth knowledge of machine learning techniques.

While working with this ML application, one should consider the following:

  • The new Amazon Redshift clusters must be created with the SQL_PREVIEW maintenance track. 
  • The Amazon Redshift cluster that is used to create the model and the Amazon S3 bucket that is used to stage the training data and model artefacts must be in the same AWS Region.
  • A user will not be able to switch an existing Amazon Redshift cluster from the current or trailing track to this preview track, or vice versa.
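
The three constraints above can be read as a simple pre-flight check. The sketch below is only an illustration in plain Python: the `PreviewSetup` type and its field names are hypothetical, and the values would have to be looked up by the user, since the code does not query AWS.

```python
from dataclasses import dataclass


@dataclass
class PreviewSetup:
    """Hypothetical description of a cluster and its staging bucket."""
    cluster_region: str
    bucket_region: str
    track_at_creation: str  # maintenance track chosen when the cluster was created


def preview_problems(setup):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if setup.track_at_creation.upper() != "SQL_PREVIEW":
        # Tracks cannot be switched afterwards, so the only fix is a new cluster.
        problems.append("cluster was not created on SQL_PREVIEW; create a new cluster")
    if setup.cluster_region != setup.bucket_region:
        problems.append("cluster and S3 bucket must be in the same AWS Region")
    return problems


print(preview_problems(PreviewSetup("us-east-1", "eu-west-1", "current")))
```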

Why Use Redshift ML

Amazon Redshift is used to process exabytes of data every day to power analytics workloads. Data scientists and analysts can use this data to train ML models, which can then be used to generate insights from new data.

A key benefit of Amazon Redshift ML is that it automatically finds and tunes the best-fitting model for the training data using Amazon SageMaker Autopilot. SageMaker Autopilot chooses the best among regression, binary classification, multi-class classification, and linear models.

Beyond automatic model selection, the application provides several other benefits:

  • Amazon Redshift ML allows a user to create and train ML models with simple SQL commands without having to learn external tools.
  • It provides flexibility to use automatic algorithm selection.
  • The application automatically preprocesses data, and creates, trains and deploys models.
  • It enables advanced users to specify the problem type and generate predictions using SQL without having to ship data outside the data warehouse.
  • It also allows data scientists to select efficient algorithms such as XGBoost and specify hyperparameters and preprocessors.
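
The last two points, naming the problem type versus choosing XGBoost with explicit hyperparameters, correspond to optional clauses of the CREATE MODEL statement. The sketch below builds those clauses in Python. The clause names (PROBLEM_TYPE, AUTO OFF, MODEL_TYPE, OBJECTIVE, PREPROCESSORS, HYPERPARAMETERS) are taken from the AWS documentation for CREATE MODEL rather than from the announcement, so treat them as an assumption to verify; the function names and example values are hypothetical.

```python
VALID_PROBLEM_TYPES = {"REGRESSION", "BINARY_CLASSIFICATION", "MULTICLASS_CLASSIFICATION"}


def guided_clauses(problem_type):
    """Clauses for a user who names the problem type and leaves algorithm choice automatic."""
    if problem_type not in VALID_PROBLEM_TYPES:
        raise ValueError(f"unknown problem type: {problem_type}")
    return [f"PROBLEM_TYPE {problem_type}"]


def xgboost_clauses(objective, hyperparameters):
    """Clauses for a user who turns automatic selection off and picks XGBoost."""
    # Sort the overrides so the generated SQL is stable across runs.
    overrides = ", ".join(
        f"{name.upper()} '{value}'" for name, value in sorted(hyperparameters.items())
    )
    clauses = [
        "AUTO OFF",
        "MODEL_TYPE XGBOOST",
        f"OBJECTIVE '{objective}'",
        "PREPROCESSORS 'none'",
    ]
    if overrides:
        clauses.append(f"HYPERPARAMETERS DEFAULT EXCEPT ({overrides})")
    else:
        clauses.append("HYPERPARAMETERS DEFAULT")
    return clauses


print("\n".join(guided_clauses("BINARY_CLASSIFICATION")))
print("\n".join(xgboost_clauses("binary:logistic", {"num_round": 100})))
```

These clauses would sit between the FUNCTION and IAM_ROLE lines of a full CREATE MODEL statement.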

How It Works

When users run SQL commands to create the model, Amazon Redshift ML securely exports the specified data from Amazon Redshift to Amazon S3 and calls SageMaker Autopilot to prepare the data automatically. It then selects the relevant pre-built algorithm and applies it to train the ML model.

According to its developers, this application manages all the communication between Amazon Redshift, SageMaker and Amazon S3 while abstracting the steps involved in training and compiling. After the model is successfully trained, Redshift ML compiles it via Amazon SageMaker Neo and makes it available as a SQL function in the Amazon Redshift data warehouse.
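
Because the trained model is exposed as a SQL function, generating predictions amounts to calling that function in an ordinary query. The following Python sketch builds such a query, along with a SHOW MODEL statement that can be used to check whether training has finished. The model, function, table and column names are hypothetical.

```python
def build_show_model(model):
    """Statement for inspecting a model, including its training status."""
    return f"SHOW MODEL {model};"


def build_prediction_query(function, feature_columns, table, id_column):
    """Return a SELECT that applies the model's SQL function to each row of a table."""
    args = ", ".join(feature_columns)  # features must match the training columns, in order
    return (
        f"SELECT {id_column}, {function}({args}) AS prediction\n"
        f"FROM {table};"
    )


# Hypothetical names, chosen only to illustrate the shape of the query.
status_sql = build_show_model("customer_churn")
predict_sql = build_prediction_query(
    function="predict_churn",
    feature_columns=["age", "plan", "monthly_minutes"],
    table="new_customers",
    id_column="customer_id",
)
print(status_sql)
print(predict_sql)
```

The prediction query runs inside the cluster like any other SELECT, which is why no data needs to leave the warehouse at inference time.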

Wrapping Up

Amazon Redshift ML is a cloud-based service that makes it easy for analysts and data scientists to use machine learning technology. Redshift ML itself adds no charge for creating or using a model, and prediction happens locally in your Amazon Redshift cluster. In effect, you pay only for the resources consumed by training; prediction is covered by the cost of your cluster. The machine learning preview period is expected to run until March 31, 2021.


Ambika Choudhury

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.