Recently, the developers at Google Cloud announced the general availability of AI Platform Prediction. The service runs on a Google Kubernetes Engine (GKE) backend and is positioned as an enterprise-ready platform for hosting ML models.
Emerging technologies like machine learning and AI have transformed the way many processes and industries work. Machine learning now powers a range of prediction-driven features, such as identifying objects in images, recommending products and optimising marketing campaigns.
However, building a robust, enterprise-ready machine learning environment can be time-consuming, costly and complex. Google’s AI Platform Prediction aims to address these issues and provide a robust environment for ML-based tasks.
In March this year, the tech giant launched AI Platform Pipelines in beta to deliver an enterprise-ready, secure execution environment for machine learning workflows.
According to the developers, the new platform offers improved reliability, more flexibility via new hardware options such as Compute Engine machine types and NVIDIA accelerators, reduced overhead latency, and improved tail latency.
Behind AI Platform Prediction
AI Platform Prediction is one of the key components of the AI Platform, a platform to train machine learning models, host trained models in the cloud and use them to make predictions about new data.
It brings the power and flexibility of TensorFlow, Scikit-Learn and XGBoost to the cloud. The AI Platform Prediction service allows a user to serve predictions based on a trained model, whether or not the model was trained on the AI Platform.
In addition to standard features such as autoscaling, access logs, and request/response logging that were available during the beta period, the developers have now introduced several updates that improve robustness, flexibility and usability.
Some of the new features are mentioned below:
XGBoost/Scikit-Learn Models on High-mem/high-CPU Machine Types:
AI Platform Prediction supports XGBoost and Scikit-Learn models for predictions in production, which makes it simple to deploy models trained using these frameworks.
Resource metrics are now visible for models deployed on GCE machine types from Cloud Console and Stackdriver Metrics.
The developers have introduced new endpoints in three regions (us-central1, europe-west4 and asia-east1) with better regional isolation for improved reliability.
VPC Service Controls support is currently in beta. With it, users can define a security perimeter and deploy Online Prediction models that have access only to resources and services within that perimeter or another bridged perimeter.
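As a rough illustration of how a client might pick between the regional endpoints, here is a minimal Python sketch. The hostname pattern (`<region>-ml.googleapis.com`, with `ml.googleapis.com` as the global endpoint) and the helper name `prediction_api_root` are assumptions made for this example, not details from the announcement, so check the official documentation before relying on them.

```python
from typing import Optional

# The three regions named in the announcement as having regional endpoints.
REGIONAL_ENDPOINT_REGIONS = ("us-central1", "europe-west4", "asia-east1")


def prediction_api_root(region: Optional[str] = None) -> str:
    """Return the base URL to send prediction requests to.

    Assumption: regional endpoints follow the '<region>-ml.googleapis.com'
    hostname pattern and the global endpoint is 'ml.googleapis.com'.
    """
    if region is None:
        return "https://ml.googleapis.com"
    if region not in REGIONAL_ENDPOINT_REGIONS:
        raise ValueError(f"no regional endpoint listed for {region!r}")
    return f"https://{region}-ml.googleapis.com"


if __name__ == "__main__":
    for name in REGIONAL_ENDPOINT_REGIONS:
        print(name, "->", prediction_api_root(name))
```

Note that region identifiers are lowercase. The helper rejects any region outside the three listed above rather than guessing at a hostname.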
How AI Platform Prediction Works
AI Platform Prediction manages the computing resources in the cloud that are needed to run machine learning models. Using the platform, users can request predictions from their ML models and receive the predicted target values in response.
Below are the steps to set up predictions in the cloud:
- The machine learning model needs to be exported as artifacts so that it can be deployed to AI Platform Prediction.
- Next, a model resource needs to be created in AI Platform Prediction along with a model version from the saved model. In case of deploying a custom prediction routine, users also need to provide the code to run at prediction time.
- The next step is to format the input data for prediction and request either online prediction or batch prediction.
- When users use online prediction, the service runs the saved model and returns the requested predictions as the response message for the call. In this process, the model version is deployed in the region that the user specifies while creating the model.
- Batch prediction with TensorFlow models involves a few more steps: the prediction service first allocates resources to run the job, including one or more prediction nodes.
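The online prediction steps above can be sketched in Python. This is a minimal sketch that only builds the request and reads a response; it sends nothing over the network. It assumes the REST conventions of the service as commonly documented: a version resource name of the form `projects/<project>/models/<model>/versions/<version>`, a JSON request body with an `instances` list, and a JSON response with a `predictions` list. The helper names and the project, model and version values are hypothetical.

```python
import json
from typing import Any, List, Sequence


def version_name(project: str, model: str, version: str) -> str:
    """Resource name of a deployed model version (assumed naming scheme)."""
    return f"projects/{project}/models/{model}/versions/{version}"


def build_predict_body(instances: Sequence[Any]) -> str:
    """Format input data as the JSON body of an online prediction request."""
    if not instances:
        raise ValueError("at least one instance is required")
    return json.dumps({"instances": list(instances)})


def extract_predictions(response_text: str) -> List[Any]:
    """Pull the predictions out of the response message for the call."""
    payload = json.loads(response_text)
    if "error" in payload:
        raise RuntimeError(f"prediction failed: {payload['error']}")
    return payload["predictions"]


if __name__ == "__main__":
    # Hypothetical identifiers, used for illustration only.
    name = version_name("my-project", "my-model", "v1")
    body = build_predict_body([[5.1, 3.5, 1.4, 0.2], [6.2, 2.9, 4.3, 1.3]])
    print(f"POST .../v1/{name}:predict")
    print(body)

    # A hand-written stand-in for what the service might return.
    fake_response = '{"predictions": [0, 1]}'
    print(extract_predictions(fake_response))
```

In practice, the body would be sent with an authenticated HTTP client or the Google API client library. Keeping request formatting and response parsing as separate functions makes them easy to test without a deployed model.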
Besides serving trained models, AI Platform also integrates with other AI technologies to simplify ML workflows. These include Explainable AI, the What-If Tool for visualising data, and continuous evaluation for obtaining metrics about the performance of the ML model.
Bhupesh Chandra, Senior Engineer at Google Cloud, stated in a blog post, “All of these features are available in a fully managed, cluster-less environment with enterprise support, no need to stand up or manage your own highly available GKE clusters.”