What is Google Cloud ML Engine?

Learn how to scale up a machine learning algorithm.

Narin Luangrath

The cloud and machine learning: two phrases with a lot of hype that few people understand. We’re intimately familiar with both here at Leverege, so hopefully this article will shed some light on the two topics.

Before we share what we’ve learned using Google Cloud ML Engine, we need to do a quick refresher on how machine learning is done in production. There are roughly 4 steps:

  1. Identify the problem you want to solve. For instance, you want an algorithm that can classify emails as spam or not, so that your inbox isn’t flooded with junk mail.
  2. Find/build a machine learning model that solves the problem. Using the above example, you might try an ML algorithm called ‘Naive Bayes’.
  3. Train your ML model. All machine learning models need to be trained before they can be used. You’ll need to feed your ‘Naive Bayes’ model examples of spam and legitimate emails so it can learn how to classify new emails.
  4. Deploy your ML model. Now that you have a trained ‘Naive Bayes’ spam filter, use it to filter new emails.
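
To make steps two through four concrete, here is a minimal sketch of a ‘Naive Bayes’ spam filter in plain Python. It is an illustration only, not the model we run in production: the four training emails, the "spam"/"ham" labels, and the function names are all made up for this example, and a real filter would need far more data and better text cleaning.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Step 3: count how often each word appears under each label."""
    word_counts = {}          # label -> Counter of words
    label_counts = Counter()  # label -> number of emails
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return {"word_counts": word_counts, "label_counts": label_counts, "vocab": vocab}


def predict(model, text):
    """Step 4: return the label with the highest log-probability."""
    total_emails = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, float("-inf")
    for label, n_emails in model["label_counts"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + vocab_size  # add-one (Laplace) smoothing
        score = math.log(n_emails / total_emails)  # prior for this label
        for word in tokenize(text):
            if word in model["vocab"]:  # skip words never seen in training
                score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set; "ham" means a legitimate email.
TRAINING_EMAILS = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

model = train(TRAINING_EMAILS)
print(predict(model, "claim your free money"))
print(predict(model, "team meeting monday"))
```

With only four emails this runs instantly on a laptop. The scaling questions in the rest of this article come up once the training set grows to millions of examples.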

So how does Google Cloud ML fit into all of this? It helps solve steps three and four: training and deployment.

Google Cloud ML


It’s possible to train machine learning models on your personal laptop, but it’s not scalable. If you’re training your ‘Naive Bayes’ classifier on thousands of emails, your laptop might be good enough. But what if you have millions of training emails? Or what if your training algorithm is very complicated and takes a long time to run? This is where Cloud ML comes into play.

With Cloud ML Engine, you can train your ML model in the cloud using Google’s distributed network of computers. Instead of just using your laptop to train your model, Google will run your training algorithm on multiple computers to speed up the process. Furthermore, you can configure the types of CPUs/GPUs these computers use. Some algorithms run a lot faster on GPUs than on CPUs.

A Google data center.

Another benefit we’ve found of training with Cloud ML Engine is that you don’t have to worry about storing the training data. If you have a million emails to train your spam filter, how are you going to get them on your laptop to train your model? When you train your model using Cloud ML, you can easily store your training data online in a Google Cloud Storage “Bucket”.

TensorFlow: Google’s open-source ML Framework

The steps involved with building a machine learning model in TensorFlow and packaging your code so that Cloud ML Engine can process it are a bit complicated and beyond the scope of this high level overview. You can start learning about the details here.

However, when you’ve built the model and are ready to train, submitting the training job to Google Cloud ML Engine is just a quick shell script command:

gcloud ml-engine jobs submit training $JOB_NAME \
   --job-dir $OUTPUT_PATH \
   --runtime-version 1.2 \
   --module-name trainer.task \
   --package-path trainer/ \
   --region $REGION \
   --scale-tier STANDARD_1 \
   -- \
   --train-files $TRAIN_DATA \
   --eval-files $EVAL_DATA \
   --train-steps 1000 \
   --verbosity DEBUG \
   --eval-steps 100

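The important part to understand is that trainer.task is the Python module (the file trainer/task.py) that contains your TensorFlow application, and STANDARD_1 specifies that you want to use multiple computers (distributed computing) to train your model. The flags after the bare -- are not read by gcloud; they are passed through to your own application.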
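
Your training module has to read the flags that appear after the bare -- separator in the command above. The sketch below shows one way a trainer/task.py could parse them with Python’s standard argparse module. The function name, defaults, and help text are our own assumptions for illustration, and the gs:// paths in the demo call are hypothetical; the real sample code may be organized differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the user flags passed through to the training module."""
    parser = argparse.ArgumentParser(
        description="Hypothetical entry point for trainer/task.py")
    # Output location for checkpoints and exported models.
    parser.add_argument("--job-dir", default="",
                        help="Where to write training output")
    parser.add_argument("--train-files", nargs="+", required=True,
                        help="One or more paths to training data")
    parser.add_argument("--eval-files", nargs="+", required=True,
                        help="One or more paths to evaluation data")
    parser.add_argument("--train-steps", type=int, default=1000,
                        help="Number of training steps to run")
    parser.add_argument("--eval-steps", type=int, default=100,
                        help="Number of evaluation steps to run")
    parser.add_argument("--verbosity", default="INFO",
                        choices=["DEBUG", "INFO", "WARN", "ERROR", "FATAL"],
                        help="Logging verbosity")
    return parser.parse_args(argv)


# Demo call with hypothetical bucket paths, mirroring the flags in the command.
args = parse_args([
    "--train-files", "gs://my-bucket/train.csv",
    "--eval-files", "gs://my-bucket/eval.csv",
    "--train-steps", "1000",
    "--verbosity", "DEBUG",
    "--eval-steps", "100",
])
print(args.train_files, args.train_steps, args.verbosity)
```

Note that argparse turns a flag such as --train-files into the attribute args.train_files, so the rest of the module can use ordinary Python names.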


Now that you’ve trained your model and verified that it works correctly, how do you use it to make new predictions? For instance, suppose you have an amazing ‘Naive Bayes’ ML spam filter, how do you make that service available to others?

One thing you could do is build a backend web server, like www.my-spam-filtering-api.com (not a real website), that takes in emails as POST requests and responds with the ML algorithm’s guess about whether or not the email is spam. But what if your backend web server gets tens of thousands of requests? Will it be able to handle the load or will it break?

Google Cloud ML lets you avoid the complexity of building a scalable machine learning web server by doing it for you. If you put your machine learning algorithm on Google Cloud ML, it can handle predictions for you. Right now, it supports “online predictions” and “batch predictions”.

Use online predictions if you have a handful of data points (e.g. ten) that you want to run through your algorithm. For larger datasets (e.g. thousands of points), use batch predictions.
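
The difference between the two modes shows up in how you package your data. As a rough sketch, and to the best of our understanding of the service, an online request is a single JSON body with an "instances" list, while batch input is a file with one JSON instance per line. The helper names, the "text" field, and the cutoff of 100 below are assumptions made for this example, so check the official documentation for the exact formats and limits.

```python
import json

# Hypothetical cutoff; pick one that matches your own latency and size needs.
ONLINE_LIMIT = 100


def choose_mode(instances, online_limit=ONLINE_LIMIT):
    """Suggest "online" for small inputs and "batch" for large ones."""
    return "online" if len(instances) <= online_limit else "batch"


def online_request_body(instances):
    """Build one JSON body that holds every instance in an "instances" list."""
    return json.dumps({"instances": instances})


def batch_input_lines(instances):
    """Build newline-delimited JSON, one instance per line, for a batch input file."""
    return "\n".join(json.dumps(instance) for instance in instances)


# Hypothetical emails to classify; the "text" field name is made up.
few_emails = [{"text": "win free money now"}, {"text": "meeting on monday"}]
many_emails = [{"text": "email number %d" % i} for i in range(5000)]

print(choose_mode(few_emails))
print(choose_mode(many_emails))
print(online_request_body(few_emails))
```

Keeping this choice in one small function makes it easy to change the cutoff later without touching the code that builds the payloads.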

Run your ML applications on Google’s infrastructure.


Our team has found that Google Cloud ML Engine is a great way to scale up a machine learning algorithm as you can train models on larger datasets in a shorter amount of time. It makes the transition from building the model to actually making predictions quick and easy, so it may be worth considering for your machine learning workflow.

Narin Luangrath
Narin Luangrath is a Product Engineer at Leverege who follows machine learning and artificial intelligence developments. He recently graduated from Wesleyan University with degrees in Math, Computer Science and Data Analysis.