
Dissecting Microsoft’s Autonomous Driving Cookbook: A Fresh Perspective On Self Driving Cars With ML



With advances in data science and deep learning, self-driving cars are moving closer to reality. Automakers such as Toyota and General Motors, and electric vehicle manufacturer Tesla, are already working towards the production of autonomous cars, which could eventually eliminate the need for people to drive. All you would need to do is sit in the car, relax and let the technology take you to your destination.



Tech giant Microsoft has also built a car simulation, integrating its software platforms to further its research on self-driving cars. The work falls under its open-source research project AirSim, which is designed to serve artificial intelligence systems. The entire project is open-source on GitHub so that the wider software development community can contribute improvements.

Understanding AirSim: Simulation is the key

The volume of data is growing enormously day by day. Even with advanced collection methods, gathering complex data can be burdensome, and the results often fall short of the precise data demands of artificial intelligence. One reliable way to obtain the data needed to build AI models is simulation. A simulator can also reproduce varied scenarios (roads, pedestrians, traffic and so on) that would take days or even months to capture in the real world. With techniques such as behavioral cloning and reinforcement learning, training an AI model becomes easier and less real-world data is needed.

AirSim is built on Unreal Engine, the game engine behind popular video games such as Unreal Tournament and the Borderlands franchise. It provides a simulated metropolitan environment with entities such as roads, buildings, traffic, pedestrians and more. Its application programming interfaces (APIs) support Python and C++, which makes it straightforward to use machine learning and deep learning tools. Microsoft is primarily using the Microsoft Cognitive Toolkit (CNTK) to develop deep learning algorithms for this project. It also plans to integrate its cloud platform Azure to meet data storage needs.

How does it work?

Before running the procedure, make sure the following software packages and hardware are available on your system (Windows, macOS or Linux).

  • h5py, a Python package used to store numerical data.
  • Keras, a Python neural network API. It depends on the tools mentioned above and must be configured correctly before use.
  • A graphics processing unit (GPU); an Nvidia GeForce GTX 970 or newer is recommended.
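
A quick way to confirm that the Python packages are installed is to ask the interpreter whether it can locate them. The following is a minimal sketch; the package list is an assumption that simply mirrors the requirements above, and it does not check versions or GPU support.

```python
import importlib.util

# Package names as they are imported in Python. The list mirrors the
# requirements above and is an assumption about what your setup needs.
REQUIRED_PACKAGES = ["h5py", "keras"]


def find_missing(packages):
    """Return the names in `packages` that cannot be located for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If anything is reported as missing, install it before opening the tutorial notebooks.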

The Procedure:

The autonomous driving tutorial is written using Keras, a deep learning Python API that can run on top of CNTK, Theano or TensorFlow. The simulation is driven by Python code supplied as Python notebooks, together with a recorded dataset that is freely available. The tutorial begins by analysing the dataset, then trains a learning model on it and finally tests the trained model in AirSim.

1. Interpreting data

The input data is a feed of images from a camera attached to a car driven by a person. A deep learning model is developed based on the steering angles observed while the car was being driven.

Cars require an extensive amount of data to achieve successful autonomy, which would be nearly impossible to gather without simulation.

The good news is that Microsoft has already compiled the data and published it as a dataset. The dataset consists of two parts: images and tab-separated values (.tsv) files. The images containing the steering view are the ones considered. To emphasize, the steering angle is the primary physical variable considered throughout the simulation.
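
To make the two-part layout concrete, here is a minimal sketch of reading a .tsv file and pairing each image name with its driving values. The sample rows and the column names (`Timestamp`, `Speed`, `Steering`, `Brake`, `ImageName`) are made up for illustration and are not guaranteed to match the actual dataset files.

```python
import csv
import io

# Made-up sample in the spirit of the .tsv files. The column names are
# assumptions for illustration, not the dataset's guaranteed schema.
SAMPLE_TSV = (
    "Timestamp\tSpeed\tSteering\tBrake\tImageName\n"
    "100\t4.2\t0.0\t0.0\timg_0.png\n"
    "150\t4.5\t-0.25\t0.0\timg_1.png\n"
    "200\t4.1\t0.5\t0.2\timg_2.png\n"
)


def load_records(tsv_text):
    """Parse tab-separated text into a list of per-frame dictionaries."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    records = []
    for row in reader:
        records.append({
            "image": row["ImageName"],
            "speed": float(row["Speed"]),
            "steering": float(row["Steering"]),
            "brake": float(row["Brake"]),
        })
    return records


records = load_records(SAMPLE_TSV)
for record in records:
    print(record["image"], record["steering"])
```

With a real file, the text would come from `open(path, newline="")` instead of the in-memory sample.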

End-to-end deep learning is the technique used to process the data. It employs deep neural networks that leverage the GPU’s processing power. When combined with a simulator such as AirSim, less real-world data is needed to develop and train a model.

The strategy is to perform basic data analysis on part of the dataset using Python, and then train an end-to-end deep learning model to predict the correct steering angle, given a frame from the camera and the car’s current parameters such as speed, steering angle and braking.

Initiating the program using Python code

From each sample image in the dataset, only a portion is used for the simulation. This extracts the data points that are of interest with respect to the steering view. The data points are then separated into two folders based on two driving strategies, namely ‘normal’ and ‘swerve’. The values are then plotted as a smoothed distribution graph (shown below).

Data points for Normal and Swerve angles
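
The cropping and grouping described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the tutorial's own code: the image is a small list of pixel rows, the `(x1, x2, y1, y2)` ordering of the region of interest is assumed, and `crop_roi` and `angle_histogram` are hypothetical helper names.

```python
import math
from collections import Counter


def crop_roi(image, roi):
    """Crop a row-major image (a list of pixel rows) to roi = (x1, x2, y1, y2)."""
    x1, x2, y1, y2 = roi
    return [row[x1:x2] for row in image[y1:y2]]


def angle_histogram(angles, bin_width=0.25):
    """Count steering angles per bin; each key is the lower edge of a bin."""
    counts = Counter()
    for angle in angles:
        # Rounding before floor() guards against float noise such as
        # 0.3 / 0.1 evaluating to 2.9999999999999996.
        index = math.floor(round(angle / bin_width, 9))
        counts[index * bin_width] += 1
    return dict(sorted(counts.items()))


# Tiny synthetic image: 4 rows by 6 columns, pixel value = row * 10 + column.
image = [[row * 10 + col for col in range(6)] for row in range(4)]
print(crop_roi(image, (1, 4, 2, 4)))

# Made-up steering angles for the two driving strategies.
normal_angles = [0.0, 0.02, -0.03, 0.01]
swerve_angles = [0.4, -0.5, 0.35, -0.45]
print("normal:", angle_histogram(normal_angles))
print("swerve:", angle_histogram(swerve_angles))
```

Comparing the two histograms shows why the strategies are kept apart: normal driving clusters around zero, while swerving spreads towards larger angles.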

2. Training a model

Once the data is finalised, a model is developed in Python using image processing utilities imported from Keras. The key areas to look out for when training a model on the images are:

  • Portion of the image considered in the previous step (Region of Interest)
  • Changes in lighting
  • Swerving strategy

The above points are addressed using three parameters, namely zero_drop_percentage (the share of data points with a value of 0 to drop), brighten_range (how much to alter the brightness of the image) and ROI (the coordinates that represent the region of interest). These are reflected in the image (see the steering angle depiction image).

Using Keras to process the image using specific parameters
Steering Angle depiction in the simulation
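
As a rough illustration of what two of these parameters do, the sketch below applies them to toy data. It assumes that zero_drop_percentage is the fraction of zero-angle samples discarded at random and that brighten_range is a maximum relative brightness change; the helper functions are hypothetical and are not the Keras generator used in the tutorial.

```python
import random


def drop_zero_angles(samples, zero_drop_percentage, rng):
    """Randomly drop a fraction of the (image, angle) pairs whose angle is 0."""
    kept = []
    for image, angle in samples:
        if angle == 0 and rng.random() < zero_drop_percentage:
            continue  # discard this straight-driving sample
        kept.append((image, angle))
    return kept


def adjust_brightness(image, brighten_range, rng):
    """Scale 0-255 pixel values by a random factor in [1 - r, 1 + r]."""
    factor = 1.0 + rng.uniform(-brighten_range, brighten_range)
    return [
        [min(255, max(0, round(pixel * factor))) for pixel in row]
        for row in image
    ]


rng = random.Random(0)  # fixed seed so repeated runs behave the same
samples = [("a.png", 0.0), ("b.png", 0.3), ("c.png", 0.0), ("d.png", -0.2)]
print(drop_zero_angles(samples, zero_drop_percentage=0.5, rng=rng))

frame = [[10, 120, 250], [0, 64, 200]]
print(adjust_brightness(frame, brighten_range=0.4, rng=rng))
```

Dropping some zero-angle samples keeps straight driving from dominating the training data, and random brightness changes make the model less sensitive to lighting.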

Once all this is done, a combination of convolutional and max pooling layers is used to define a network architecture for the images. The model is then trained over a series of iterations to see how it performs, in a trial-and-error fashion. A sample screen of input parameters is depicted below. Some of the other parameters may be unfamiliar, but concentrate only on the ones that can be tuned to obtain a consistent driving pattern.

Convolutional and max pooling layer parameters
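
To see how stacking convolutional and max pooling layers changes the size of the data flowing through a network, the sketch below traces output shapes layer by layer. The input size and the layer list are illustrative assumptions, not the tutorial's actual architecture; the arithmetic follows the usual rules for stride-1 'same' convolutions and non-overlapping pooling.

```python
def conv_output_size(size, kernel, padding="same"):
    """Spatial size after a stride-1 convolution."""
    if padding == "same":
        return size  # zero padding keeps the spatial size unchanged
    return size - kernel + 1  # 'valid': no padding


def pool_output_size(size, pool):
    """Spatial size after non-overlapping max pooling (stride = pool size)."""
    return size // pool


def trace_shapes(input_shape, layers):
    """Return the (height, width, channels) shape after every layer."""
    height, width, channels = input_shape
    shapes = []
    for layer in layers:
        if layer[0] == "conv":
            _, filters, kernel = layer
            height = conv_output_size(height, kernel)
            width = conv_output_size(width, kernel)
            channels = filters  # one output channel per filter
        elif layer[0] == "pool":
            _, pool = layer
            height = pool_output_size(height, pool)
            width = pool_output_size(width, pool)
        shapes.append((height, width, channels))
    return shapes


# Hypothetical stack: three conv + pool pairs on a 60 x 256 RGB crop.
LAYERS = [
    ("conv", 16, 3), ("pool", 2),
    ("conv", 32, 3), ("pool", 2),
    ("conv", 32, 3), ("pool", 2),
]

for layer, shape in zip(LAYERS, trace_shapes((60, 256, 3), LAYERS)):
    print(layer, "->", shape)
```

Each pooling step halves the height and width, so the final feature map is much smaller than the input image before it reaches the layers that predict the steering angle.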

3. Testing the model

Now, the trained model is tested by loading it into AirSim. The code is run against the simulator and the simulation is good to go.

Python code to load the data into AirSim
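
The overall shape of such a test run is a loop: read a camera frame, ask the model for a steering angle, and send the controls back to the car. The sketch below shows only that loop. `StubCar` and `predict_steering` are hypothetical stand-ins so the example runs without a simulator; they are not the AirSim client API or the trained model, and the throttle rule is an assumption.

```python
class StubCar:
    """Hypothetical stand-in for a simulator client; not the AirSim API."""

    def __init__(self):
        self.speed = 0.0
        self.steering = 0.0
        self.log = []

    def get_frame(self):
        # A real client would return a camera image; this is a tiny grey frame.
        return [[128] * 4 for _ in range(2)]

    def apply_controls(self, steering, throttle):
        # Crude motion model: speed up under throttle, otherwise slow down.
        self.steering = steering
        self.speed = max(0.0, self.speed + (1.0 if throttle > 0 else -0.5))
        self.log.append((steering, throttle))


def predict_steering(frame, speed, previous_steering):
    """Placeholder for the trained model: always steers straight ahead."""
    return 0.0


def drive(car, model, steps, target_speed=5.0):
    """Run the sense, predict, act loop for a fixed number of steps."""
    for _ in range(steps):
        frame = car.get_frame()
        steering = model(frame, car.speed, car.steering)
        steering = max(-1.0, min(1.0, steering))  # keep within [-1, 1]
        # Assumed throttle rule: accelerate only while below the target speed.
        throttle = 1.0 if car.speed < target_speed else 0.0
        car.apply_controls(steering, throttle)
    return car.log


log = drive(StubCar(), predict_steering, steps=10)
print(log)
```

In a real run, the stub would be replaced by the simulator's client and the placeholder by the trained Keras model's prediction.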

You can watch the simulation demo provided by Microsoft here


The simulation worked well during the run, with the vehicle largely sticking to the road. Many factors are still at play, such as detecting obstacles on the path and responding to sharp turns, which Microsoft is already working on. This article was just a taste of what machine learning can offer to self-driving. Sooner or later, cars may be able to drive without any human intervention.


Abhishek Sharma
I research and cover latest happenings in data science. My fervent interests are in latest technology and humor/comedy (an odd combination!). When I'm not busy reading on these subjects, you'll find me watching movies or playing badminton.
