Dissecting Microsoft’s Autonomous Driving Cookbook: A Fresh Perspective On Self Driving Cars With ML

Abhishek Sharma


With innovations in data science and deep learning fueling the technology, self-driving cars are moving closer to reality. Automobile giants such as Toyota and General Motors, as well as electric vehicle manufacturer Tesla, are already looking at producing autonomous cars, and the need for people to drive may eventually disappear. All you would need to do is sit in the car, relax and let the technology drive you to your destination.

Tech giant Microsoft has also built a car simulation by integrating its software platforms to further its research on self-driving cars. The work falls under its open-source research program AirSim, which is meant to serve artificial intelligence systems. The entire project is open source on GitHub, so it can be improved using insights from across the software development community.

Understanding AirSim: Simulation is the key

The volume of data grows enormously every day. Even though data collection methods are advanced, gathering and processing complex real-world data can be burdensome, and the result often falls short of the precise data demands of artificial intelligence. One practical way to collect data for building AI models is simulation. A simulator can also reproduce varied scenarios (roads, pedestrians, traffic and so on) that would take days or even months to capture in the real world. With leading-edge techniques such as behavioral cloning and reinforcement learning, training an AI model becomes easier and less real-world data is needed.

AirSim is built on Unreal Engine, a game engine known for popular video games such as Unreal Tournament and the Borderlands franchise. It provides a simulated metropolitan environment with entities such as roads, buildings, traffic, pedestrians and more. Its application programming interfaces (APIs) support Python and C++, which makes it easier to use machine learning and deep learning tools. Microsoft is primarily using the Microsoft Cognitive Toolkit (CNTK) to develop deep learning algorithms for this project. It also plans to integrate its cloud platform Azure to meet data storage needs.

How does it work?

Depending on your operating system (Windows, macOS or Linux), you need the following software packages and hardware in place before running the tutorial.

  • H5py, a Python package used to store numerical data.
  • Keras, a Python neural network API. It requires the tools mentioned above and must be configured correctly to run.
  • A graphics processing unit (GPU); an Nvidia GeForce GTX 970 or newer is recommended.
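Before starting, it can help to confirm that the Python packages are importable. The snippet below is a minimal sketch using only the standard library; the package list is an assumption based on the items named above, so adjust it to your own setup.

```python
import importlib.util

# Package names as imported in Python; this list is an assumption, adjust as needed.
REQUIRED = ["h5py", "keras"]

def check_packages(names):
    """Return {name: True/False} depending on whether each package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

if __name__ == "__main__":
    for name, found in check_packages(REQUIRED).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

This only checks that a package is installed, not that Keras is configured with a working backend or that a GPU is visible.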

The Procedure:

The autonomous driving tutorial is written using Keras, a deep learning Python API that can run on top of CNTK, Theano or TensorFlow. The simulation is driven by Python code supplied as notebooks, along with a captured dataset that is freely available. The tutorial begins by analysing the dataset, then trains a learning model on it, and finally tests the trained model in AirSim.

1. Interpreting data

The input is a feed of images from a camera attached to a car driven by a person. A deep learning model is developed from these images and the steering angles observed during the drive.

Cars need an extensive amount of data to achieve successful autonomy, and collecting it would be close to impossible without simulation.

The good news is that Microsoft has already compiled the data and presented it as a dataset. The dataset consists of two parts: images and tab-separated values (.tsv) files. Only the images containing the steering view are considered. To emphasize, steering angle is the primary physical variable considered throughout the simulation.
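To get a feel for the tabular half of the dataset, a tab-separated log can be read with Python's standard csv module. This is a minimal sketch: the column names and sample rows are hypothetical and are not taken from the actual files.

```python
import csv
import io

# Hypothetical excerpt of a driving log; the real column names may differ.
SAMPLE_TSV = (
    "ImageName\tSpeed\tSteering\tBrake\n"
    "img_0.png\t9.8\t0.00\t0\n"
    "img_1.png\t10.1\t-0.12\t0\n"
    "img_2.png\t10.3\t0.25\t0\n"
)

def load_driving_log(handle):
    """Read a tab-separated log into a list of dicts with numeric fields as floats."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        rows.append({
            "image": row["ImageName"],
            "speed": float(row["Speed"]),
            "steering": float(row["Steering"]),
            "brake": float(row["Brake"]),
        })
    return rows

log = load_driving_log(io.StringIO(SAMPLE_TSV))
steering = [r["steering"] for r in log]
print(len(log), min(steering), max(steering))
```

With a real file, the same function can be called on an open file handle instead of the in-memory string.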

End-to-end deep learning is the technique used to learn from the data; it employs deep neural networks that leverage the GPU's processing power. When combined with a simulator such as AirSim, less real-world data is needed to develop and train a model.

The strategy is to perform basic data analysis on part of the dataset using Python, and then train an end-to-end deep learning model to predict the correct steering angle from a camera frame together with the car's last known state, such as speed, steering angle and braking.

Initiating the program using Python code

From each sample image in the dataset, only a portion is used. This keeps the data points that are of interest for the steering view. The data points are then split into two folders according to driving strategy, namely 'normal' and 'swerve'. The values are then plotted as a smoothed distribution curve (shown below).

Data points for Normal and Swerve angles
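The comparison between the two driving strategies can be sketched as a simple binned count of steering angles. The angles below are made up for illustration, and the bin width is an arbitrary choice; the tutorial's own plot may be produced differently.

```python
from collections import Counter

def steering_histogram(angles, bin_width=0.1):
    """Map each angle to its nearest bin index and count occurrences per bin."""
    return Counter(round(a / bin_width) for a in angles)

# Made-up angles for illustration only; not taken from the dataset.
normal = [0.0, 0.02, -0.01, 0.0, 0.04, -0.03]
swerve = [0.41, -0.38, 0.0, 0.33, -0.44, 0.12]

for name, angles in (("normal", normal), ("swerve", swerve)):
    hist = steering_histogram(angles)
    row = ", ".join(f"{idx * 0.1:+.1f}: {n}" for idx, n in sorted(hist.items()))
    print(f"{name:>6} | {row}")
```

In this toy data the 'normal' angles cluster around zero while the 'swerve' angles spread out to both sides, which is the kind of difference such a plot is meant to reveal.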

2. Training a model

Once the data is finalised, a model is developed in Python using the image processing utilities in Keras. The key things to account for when training a model on the images are:

  • Portion of the image considered in the previous step (Region of Interest)
  • Changes in lighting
  • Swerving strategy

These points are addressed using three parameters: zero_drop_percentage (the share of data points with a steering value of 0 to drop), brighten_range (how much to alter the brightness of the image) and ROI (the coordinates that represent the region of interest). Their effect is reflected in the steering angle depiction image.

Using Keras to process the image using specific parameters
Steering Angle depiction in the simulation
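The three parameters described above can be illustrated with a few lines of NumPy. This is a sketch under stated assumptions, not the tutorial's own code: the function names, the ROI ordering (x_min, x_max, y_min, y_max), the frame size and the way brightness is scaled are all hypothetical choices made for the example.

```python
import random
import numpy as np

def drop_zero_labels(samples, zero_drop_percentage, rng):
    """Randomly discard about this fraction of samples whose steering label is 0."""
    return [
        (img, angle) for img, angle in samples
        if angle != 0.0 or rng.random() >= zero_drop_percentage
    ]

def brighten(image, brighten_range, rng):
    """Scale pixel values by a random factor in [1 - r, 1 + r], clipped to 0..255."""
    factor = 1.0 + rng.uniform(-brighten_range, brighten_range)
    return np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def crop_roi(image, roi):
    """Keep only the region of interest, given as (x_min, x_max, y_min, y_max)."""
    x_min, x_max, y_min, y_max = roi
    return image[y_min:y_max, x_min:x_max]

rng = random.Random(0)
frame = np.full((144, 256, 3), 100, dtype=np.uint8)   # stand-in camera frame
samples = [(frame, 0.0)] * 8 + [(frame, 0.3)] * 2     # mostly straight driving

kept = drop_zero_labels(samples, zero_drop_percentage=0.5, rng=rng)
patch = crop_roi(brighten(frame, brighten_range=0.4, rng=rng), roi=(0, 255, 76, 135))
print(len(kept), patch.shape)
```

Dropping some zero-angle samples keeps straight-line driving from dominating training, the brightness change guards against lighting differences, and the crop removes parts of the frame that carry no steering information.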

Once all this is done, a combination of convolutional and max pooling layers is used to define the network architecture for the image. The model is then trained over a series of iterations to see how it performs, in a sort of trial-and-error process. A sample screen of input parameters is depicted below. Some of the other parameters may look unfamiliar, but concentrate only on the ones that can be tuned to obtain a consistent driving pattern.

Convolutional and max pooling layer parameters
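To show what a convolutional layer followed by a max pooling layer actually computes, here is a plain NumPy forward pass on a toy image. It is a stand-in for illustration, not the Keras code the tutorial uses: the image, the kernel and the helper names are assumptions, and a real network would learn its kernels and stack several such layers.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Single-channel 'valid' convolution (cross-correlation, as in most DL libraries)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Zero out negative activations."""
    return np.maximum(x, 0.0)

def max_pool2d(x, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    h, w = x.shape[0] // size, x.shape[1] // size
    trimmed = x[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

image = np.arange(36, dtype=np.float32).reshape(6, 6)   # toy 6x6 "frame"
kernel = np.array([[-1, 0, 1]] * 3, dtype=np.float32)   # 3x3 horizontal-gradient filter

features = max_pool2d(relu(conv2d_valid(image, kernel)))
print(features.shape)
```

The 6x6 input shrinks to 4x4 after the 3x3 convolution and to 2x2 after pooling, which is how stacked layers condense a camera frame into a small set of features before the final steering prediction.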

3. Testing the model

Now the trained model is tested by loading it into AirSim. Once the code is run against the simulator, the simulation is good to go.

Python code to load the data into AirSim
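Conceptually, testing is a loop: grab a frame, crop it, ask the model for a steering angle and send that angle back to the car. The sketch below shows only that loop. FakeSimulator is a hypothetical stand-in and not the AirSim API, the dummy model replaces a trained network, and the ROI and clamping range are assumptions.

```python
import numpy as np

class FakeSimulator:
    """Hypothetical stand-in for a simulator client; not the AirSim API."""
    def __init__(self):
        self.steering_log = []

    def get_frame(self):
        # Return a blank RGB frame in place of a real camera image.
        return np.zeros((144, 256, 3), dtype=np.uint8)

    def set_controls(self, steering, throttle):
        # Record the command instead of moving a car.
        self.steering_log.append(steering)

def predict_steering(model, frame, roi):
    """Crop the frame to the region of interest and ask the model for an angle."""
    x_min, x_max, y_min, y_max = roi
    patch = frame[y_min:y_max, x_min:x_max].astype(np.float32) / 255.0
    return float(model(patch))

def drive(sim, model, roi, steps, throttle=0.5):
    """Run a fixed number of sense-predict-act iterations."""
    for _ in range(steps):
        angle = predict_steering(model, sim.get_frame(), roi)
        # Clamp to [-1, 1], a common normalised steering range (an assumption here).
        sim.set_controls(steering=max(-1.0, min(1.0, angle)), throttle=throttle)

# A trivial "model" that always steers slightly right; a trained network would go here.
dummy_model = lambda patch: 0.1

sim = FakeSimulator()
drive(sim, dummy_model, roi=(0, 255, 76, 135), steps=5)
print(sim.steering_log)
```

In a real run, the stand-in class would be replaced by the simulator's client and the lambda by the trained Keras model's prediction call.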

You can watch the simulation demo provided by Microsoft here


The simulation worked well during the run, with the vehicle largely sticking to the road. Many factors remain, such as detecting obstacles on the path and responding to sharp turns, which Microsoft is already working on. This article was just a taste of what the various facets of machine learning can offer self-driving. Sooner or later, cars may drive without any human intervention at all.

Copyright Analytics India Magazine Pvt Ltd