Deploying a machine learning service on a cloud platform is a big advantage for ML practitioners, as it provides a flexible environment for designing and developing a variety of models. Because these models are generally computationally heavy, serverless cloud engines let practitioners run them without maintaining their own infrastructure.
Google Datalab and Amazon SageMaker share many features but also differ in important ways. Here is a detailed comparison of the two platforms:
1. Notebook Setup
Google Datalab: The notebook server setup procedure is easy. The notebook is launched from Google Cloud Shell, which is part of the Google Cloud Console. The Google Cloud SDK can also be used for notebook deployment.
Amazon SageMaker: Once logged into the SageMaker console, the deployment of the notebook is only a click away.
2. Customised Algorithms
Google Datalab: It does not include any pre-built ML algorithms. However, through the Google Cloud ML service, it provides a platform for running models built with TensorFlow.
Amazon SageMaker: It has pre-optimised ML algorithms that run on Amazon’s compute servers. To use them, one only needs to connect them to a data source, which is meant to help ML beginners use them in their products. These built-in algorithms span supervised, unsupervised and deep learning methods.
3. Model Deployment
Google Datalab: There is no direct way to deploy code to production servers. Instead, a model built on this platform is packaged as a Python module and deployed on Google Cloud ML. The only way to deploy one’s own algorithm is to write it in TensorFlow.
Amazon SageMaker: Unlike Datalab, it supports direct deployment of trained ML models. Models are deployed to elastic compute infrastructure with high availability and served through an HTTPS endpoint that returns inferences. The user can deploy multiple variants of a model to the same HTTPS endpoint. Overall, deploying ML models is much easier than with Datalab.
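As a rough illustration of serving several variants behind one endpoint, the Python sketch below assumes that traffic is split in proportion to a per-variant weight. The dictionary keys mirror the variant fields accepted by the AWS SDK's endpoint configuration call, but the model names, the instance type and the traffic_shares helper are hypothetical, and no AWS call is made.

```python
def traffic_shares(variants):
    """Return the fraction of requests each variant would receive,
    assuming traffic is split in proportion to each variant's weight."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    if total <= 0:
        raise ValueError("at least one variant needs a positive weight")
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


# Two hypothetical variants of the same model behind a single endpoint.
variants = [
    {
        "VariantName": "model-a",
        "ModelName": "example-model-v1",   # hypothetical model name
        "InstanceType": "ml.m5.large",     # assumed instance type
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 3.0,
    },
    {
        "VariantName": "model-b",
        "ModelName": "example-model-v2",   # hypothetical model name
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1.0,
    },
]

shares = traffic_shares(variants)
print(shares)  # {'model-a': 0.75, 'model-b': 0.25}
```

With these weights, three out of every four requests would go to the first variant, which is one way to compare a new model against an existing one on live traffic.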
4. Automated Hyper-Parameter Tuning
Hyperparameters are the parameters that describe and govern the model training process, and they must be set before training starts. Hyperparameter tuning runs multiple trials within a single training job, where each trial is a complete execution of the training application with hyperparameter values chosen within specified limits.
Google Datalab: Datalab itself does not provide automated hyper-parameter tuning. However, through Google Cloud ML it can use a feature called HyperTune, which automatically optimises an ML model for improved accuracy or minimised error. This feature is available for TensorFlow models.
Amazon SageMaker: It provides an option for automated hyper-parameter tuning of the ML model during training. It searches for the best hyper-parameters on the user's behalf. This feature is available not just for built-in algorithms but also for custom training code packaged in Docker containers.
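The idea of several trials within one tuning job can be sketched in plain Python. This minimal example uses random search as a stand-in; the managed services may use more sophisticated search strategies. The tune helper, the toy_train function and the limits shown are all assumptions made for illustration.

```python
import random


def tune(train_fn, limits, n_trials=20, seed=0):
    """Run n_trials complete training runs, each with hyperparameters
    sampled uniformly from the given (low, high) limits, and return
    the best (lowest) score, its parameters, and all trials."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in limits.items()}
        score = train_fn(**params)  # one full execution of the training application
        trials.append((score, params))
    best_score, best_params = min(trials, key=lambda t: t[0])
    return best_score, best_params, trials


def toy_train(learning_rate, dropout):
    # Hypothetical stand-in for a real training run; returns a validation error.
    return (learning_rate - 0.1) ** 2 + (dropout - 0.3) ** 2


limits = {"learning_rate": (0.001, 1.0), "dropout": (0.0, 0.5)}
best_score, best_params, trials = tune(toy_train, limits)
print(best_params, best_score)
```

Each call to toy_train plays the role of one trial, and the limits dictionary corresponds to the specified range for each hyperparameter.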
5. Pricing
Google Datalab: Pricing is based on usage, although prices may differ for particular customer use cases and demands. You can look at the price structure here.
Amazon SageMaker: SageMaker pricing is affordable at <price> and is based entirely on usage. SageMaker can also be tried for free as part of the AWS Free Tier. Pricing depends on the on-demand ML instances, ML storage, and fees for data processing in notebooks and hosting instances. You can get more information regarding the SageMaker price here.
On pricing, the two platforms are equal.
6. Inbuilt Libraries
Google Datalab: Datalab is built on Jupyter Notebook. It has no built-in notebook libraries for MXNet or Apache Spark, but it does provide notebook kernels for use with TensorFlow.
Amazon SageMaker: It has pre-installed notebook libraries for Apache Spark and MXNet, and it can also run TensorFlow.
Which One Should You Choose
Google Cloud Datalab is a standalone serverless platform used for building and deploying ML models. It has to be combined with other services, such as Google Cloud ML, to become a more powerful ML service. Amazon SageMaker, by contrast, is built as a complete end-to-end ML service. Based on the comparisons above, SageMaker should definitely be your choice.