“There is no alternative to mathematics.”
For this week’s ML practitioner’s series, we got in touch with Sanjeev Sharma, Founder of Swaayatt Robots. In this interview, Sanjeev shares invaluable insights from his decade-long journey in the world of machine learning research and the lessons from starting an autonomous driving company in India.
Sanjeev is also a recipient of the Leading 40 Under 40 Data Scientists in India award at the Machine Learning Developers Summit, given for his research in autonomous driving technology over the past four years, which enabled autonomous driving on Indian roads, the world’s toughest test ground for autonomous driving.
AIM: Can you talk about your introduction to the world of AI?
Sanjeev: I did my bachelor’s (2007-11) at IIT Roorkee in electrical engineering. Around this time, I heard about the DARPA Grand Challenge and how one of the engineers from Stanford University built autonomous flying machines using reinforcement learning. This was the first time I had heard about reinforcement learning. As I started looking more into this challenge, I stumbled upon several videos on YouTube by MIT and all the other universities working on autonomous driving technology. I got curious about autonomous vehicles and decided this would be my future.
I had been interested in robotics since 10th grade, well before joining IIT. But, as is the norm in India, you choose your branch according to your rank, and my rank got me electrical engineering. After looking at the electrical engineering curriculum, my interest in college totally declined.
I began to self-study. I started reading hundreds of machine learning research papers during my undergrad.
After graduation, I worked as a research assistant, where my task was to develop motion planning algorithms that enable robots to navigate at very high speeds in cluttered environments. Later, I went to Canada for my master’s at the University of Alberta, where I got to work on reinforcement learning with pioneers like Richard Sutton. On completing my master’s, I got an opportunity to pursue a PhD in computer science at the University of Massachusetts. Instead of going for a PhD, I came back to India and founded Swaayatt Robots.
AIM: What does the self-learning process actually look like?
Sanjeev: During my first summer vacation, I started with microcontroller theory and simultaneously started studying image processing. I started from a research point of view, focusing on reinforcement learning and motion planning.
I thought about what a graduate student in one of the top universities in the world would have for coursework. I picked those subjects and read the books in the summer between semesters; books that are usually taught over a whole semester or two. I read ‘Pattern Recognition & Machine Learning’ by Christopher Bishop, cover to cover.
AI is nothing but applied mathematics. The more mathematical you are, the better. Any serious researcher in machine learning would eventually end up learning about mathematical optimisation. Studying books on numerical and convex optimisation gave me the much-needed mathematical background for future endeavours.
I have read more than 300 research papers in motion planning and reinforcement learning combined. That also helped me to get my first internship, which in turn gave me real-world research experience in reinforcement learning. Even now, I’m studying mathematical topology to get some ideas related to motion planning and decision making for tough scenarios.
Book recommendations by Sanjeev
– Pattern Recognition and Machine Learning by Christopher Bishop
– Machine Learning: A Probabilistic Perspective by Kevin P. Murphy
– Data Mining by Witten, Frank and Hall
– Numerical Optimisation by Jorge Nocedal and Steve Wright
– Convex Optimisation by Stephen Boyd
– Reinforcement Learning: An Introduction by Richard S. Sutton
Many of these books would have been taught over a semester in universities. I read them in summer, during semester breaks!
AIM: How important is it to have a PhD in machine learning?
Sanjeev: The answer will be subjective. It cannot be objective. In my case, I was able to drop the idea of a PhD because I had a mathematical background. I was binge reading research papers during my undergrad. By the time I was ready for my master’s degree, I had already published more papers than what a typical PhD would demand.
Going for a master’s or a PhD program is all about the goal. If you are interested in developing the state of the art, then you should have done either a master’s or a PhD to have the research aptitude. To actually solve problems in robotics or autonomous driving, you need advanced knowledge.
But if you are going to create a startup with no intention of pushing the envelope of research, then you don’t have to care about the state of the art or about an advanced degree. You just have to take a deep learning model from the internet and fine-tune it. You don’t even need a bachelor’s degree for that.
In my case, in particular, my goal was pretty strict. It had to be autonomous driving. I got into machine learning because I liked robotics and not the other way around. Only once the goal is defined can this question be answered appropriately. That’s why I said that the answer is subjective.
“Online courses or MOOCs can only give an introduction. Just on the basis of the availability of MOOCs and online courses, I mean, dropping the degree would be suicidal.”
Although there are tremendous resources in terms of online courses right now, we have to be careful about what they offer. MOOCs are a very diluted version of the typical graduate school courses that you’ll end up doing in a graduate program. So although these are very good introductions, taking them as a replacement for a proper mathematical study or proper research is not advisable.
That said, as long as you have excellent knowledge, the choice of sources — books, online courses or a graduate PhD program — is irrelevant.
AIM: What does it take to start an autonomous driving company?
Sanjeev: I founded Swaayatt Robots in 2015 with the goal of making connected autonomous driving technology accessible, affordable and available to everyone. But we didn’t have enough funding. All I had was a solid mathematical background, and that helped me a lot.
As I mentioned earlier, my interests were around reinforcement learning and motion planning. These topics are helpful in trajectory planning, motion planning and decision making for autonomous vehicles. There are other aspects of autonomous driving, such as perception and localisation.
Back when I started the company, the perception problem was typically being solved with LiDAR. Importing a LiDAR alone would cost me around INR 1 crore, and I started the company with just INR 35 lakh!
We started researching the area of computer vision and deep learning. I didn’t have any formal research training in computer vision. But I had the mathematical background, which helped me develop confidence. My peers said that it was impossible and that, at some point, I would have to use LiDAR.
“Importing a LiDAR alone would cost around 1 cr. And I started the company with just 35 lakhs!”
So I took it as a research endeavour to actually solve this problem of perception/computer vision using only cameras. Today we are one of the very few companies in the world that have enabled autonomous vehicles to perceive their environment in real time using only cameras.
In obstacle detection, we are 40 times faster than the state of the art. In semantic segmentation, we are 20 times faster than the state of the art. For the delimiter detection problem, we have a framework that is at least six times faster than what could be developed in 2019.
Companies across the world invest billions of dollars in detecting delimiters. Our models can even detect delimiters on Indian roads. Delimiters can be road partitions or lane markings, which can be absent on many roads. So how do you detect something that doesn’t even exist?
The challenges are mostly mathematical in nature. For instance, when the Analytics India Magazine team came for a demo, our vehicle’s steering was very jerky. You can solve such mechanical problems by buying a new vehicle or by owning an industrial facility where you can develop the entire mechanical design. But we didn’t have the resources. So I converted this challenge into a mathematical one by formulating the mechanical jerk as a reinforcement learning problem.
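The interview does not describe how the jerk was actually formulated. As a rough illustration only, one common way to express steering smoothness in reinforcement learning is to add a penalty on abrupt steering changes to the reward. The minimal Python sketch below assumes a hypothetical reward built from squared tracking errors plus a weighted smoothness term; the function names, the `jerk_weight` parameter and the use of second differences as a proxy for jerkiness are all assumptions, not Swaayatt Robots’ method.

```python
def smoothness_penalty(steering, dt):
    """Sum of squared second differences of the steering angle.

    This is a simple proxy for jerkiness (an assumption for illustration):
    a steadily changing steering angle scores near zero, while an
    oscillating one scores high.
    """
    total = 0.0
    for prev, cur, nxt in zip(steering, steering[1:], steering[2:]):
        second_diff = (nxt - 2.0 * cur + prev) / (dt * dt)
        total += second_diff * second_diff
    return total


def reward(tracking_errors, steering, dt, jerk_weight=0.01):
    """Hypothetical episode reward: stay on the path and steer smoothly."""
    tracking_cost = sum(e * e for e in tracking_errors)
    return -(tracking_cost + jerk_weight * smoothness_penalty(steering, dt))


# Two steering sequences with identical (zero) tracking error.
errors = [0.0] * 5
smooth = [0.0, 0.1, 0.2, 0.3, 0.4]  # steady turn
jerky = [0.0, 0.3, 0.0, 0.3, 0.0]   # oscillating wheel

print(reward(errors, smooth, dt=1.0) > reward(errors, jerky, dt=1.0))
```

With a reward of this shape, an agent that tracks the path equally well in both cases is still pushed towards the steadier steering sequence, which is the sense in which a mechanical symptom can be recast as a learning objective.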
Today autonomous driving startups focus mostly on solving one tiny problem but not the whole stack. Only companies like Tesla can afford the entire stack of perception, planning and localisation. Outsourcing tasks of perception to some other company would only complicate the process.
“Funding is hard. You have to develop the best algorithms that not just solve for the purpose, but are actually the state of the art.”
For self-driving cars, high-fidelity maps are a must. These maps contain information like lane markers, which can be leveraged to navigate. The sensory data is captured and matched against the maps. This matching process is very complex and cannot be done in real time on onboard computers, so you actually outsource the work to a server.
Added to this is the problem of internet connectivity. The mapping process is also costly and time-consuming.
So, my goal was to kill this requirement. I have developed some novel algorithms in both perception and planning, which give the system the ability to generate information in real-time. This got rid of the requirement of high fidelity maps.
Typically, companies rely on high-fidelity maps for projecting and generating delimiters because they cannot develop algorithms robust enough to detect delimiters reliably in real time. Furthermore, in unstructured environments, such information can be non-existent, which becomes a much bigger challenge. We have developed algorithms to not just detect, but generate delimiters in real time. Today, we are one of the three companies in the world that enable autonomous driving without high-fidelity maps. At Swaayatt Robots, over the past five years, we have delivered on all three fronts: perception, planning and localisation.
AIM: What does your machine learning toolkit look like?
Sanjeev: We program in C++ and CUDA. CUDA is essentially C++ with extensions for writing programs that run on the GPU. For compute, we use Nvidia GPUs. As far as libraries are concerned, we use cuDNN. I don’t use PyTorch or TensorFlow myself. Though it is obscure, I prefer Theano because I like the control it offers. However, my team members use TensorFlow and PyTorch. I prefer writing my own libraries, as most of the mathematical functions we require might not be available in the popular frameworks. I usually write them from scratch.
In our office, we have around 11 computers. I wrote an entire library on my own in C++ to combine all 11 computers into a single system that can handle and train on 30 terabytes of data. Scaling up with stochastic updates again boils down to mathematical tricks, not programming. You can optimise your code as much as you want, but if you want to accurately extract trajectory information from a camera, you might need knowledge of homotopy, a concept from algebraic topology.
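The interview gives no details of the in-house C++ library. As a hedged illustration of what “stochastic updates” across several machines can look like, the sketch below simulates synchronous data-parallel stochastic gradient descent in a single Python process: each simulated worker holds one shard of the data, computes a mini-batch gradient, and the averaged gradient updates a shared parameter. The toy model `y = w * x`, the function names and every hyperparameter are assumptions chosen for brevity; Python is used here only to keep the example short.

```python
import random


def shard_gradient(w, shard, batch_size, rng):
    """Mini-batch gradient of the mean squared error for the model y = w * x."""
    batch = rng.sample(shard, batch_size)
    return sum(2.0 * (w * x - y) * x for x, y in batch) / batch_size


def data_parallel_sgd(data, num_workers=11, steps=200, lr=0.1, batch_size=4, seed=0):
    """Simulate synchronous data-parallel SGD with gradient averaging."""
    rng = random.Random(seed)
    # Each simulated worker sees only its own shard of the data.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0
    for _ in range(steps):
        # Every worker computes a stochastic gradient on its own mini-batch.
        grads = [shard_gradient(w, shard, batch_size, rng) for shard in shards]
        # Averaging the gradients stands in for the communication step.
        w -= lr * sum(grads) / num_workers
    return w


# Noise-free toy data generated from y = 2 * x.
data = [(x / 100.0, 2.0 * x / 100.0) for x in range(1, 101)]
w = data_parallel_sgd(data)
print(round(w, 3))
```

On a real cluster the averaging step is a network operation, and its cost relative to the gradient computation is one of the things that determines how well such a setup scales.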
“Currently, we use 6 GPUs from Nvidia’s GTX series: the 1070, 970 and 980 Ti.”
Regarding the hardware, we always go with GPUs. Now you might ask, why not use the cloud? The data is immense, and we cannot use the cloud because of the confidentiality of what we do. We have around 1.5 million images that we train on, and it takes around 14 days to train that network.
For example, we have developed deep energy maps for contextual segmentation of the surroundings of self-driving vehicles, to help them better perceive their environments in all conditions. We train a neural network with a loss function that is an advancement I made for semantic segmentation. The moment you put your network in the cloud, your loss function is up there too. And that’s very confidential for us. I mean, that’s our entire USP.
AIM: What do you look for while hiring ML engineers?
Sanjeev: As far as undergrad internships are concerned, I just look at the ability of the interns to work and think independently, and whether they are mathematically trained enough to understand the complexity of the task. For full-time candidates, the more mathematical you are, the better. I don’t conduct programming interviews. Nowadays, I conduct an interview only if I have to hire someone for a mathematical role. If you are applying for an engineering position, I assume that you have already read one of the books I mentioned earlier. And along with that, you need to have the ability to really understand research.
I believe that you can learn programming at any given time, but mathematics is harder to learn later in life. So undergrad time has to be utilised very, very carefully.