“Implementing seminal works in Perception, Planning and ML, which looked like a daunting task before, became a weekly ritual as part of the many courses and projects I undertook at CMU.” - Shivam Gautam
For this week’s machine learning (ML) practitioner’s series, Analytics India Magazine (AIM) got in touch with Shivam Gautam, a tech lead within the Perception team at Aurora, a leading autonomous trucking company in the US founded by former Tesla, Waymo, and Uber leadership. Shivam worked at Uber ATG for close to four and a half years before its merger with Aurora. He was part of the team that created state-of-the-art methods for object detection, tracking and sensor fusion. In this interview, Shivam shares a few snippets from his exciting journey in the field of autonomous robots and machine learning.
AIM: How did your machine learning journey begin?
Shivam: I studied Electrical and Electronics Engineering at Delhi Technological University (Delhi College of Engineering) and completed my Master’s at Carnegie Mellon University’s Robotics Institute, where I studied Robotic Systems Development. My fascination with ML co-developed with my interest in robotics. My undergrad was interspersed with many robotics projects – I built everything from fixed-wing UAVs and VTOL UAVs to self-balancing robots and autonomous antenna arrays. My time at UAS-DTU was critical to my development as an engineer. I was part of a student team working on autonomous UAVs, which was then sponsored and mentored by Lockheed Martin, one of the leading defence aerospace companies.
As part of the team, I worked on the Avionics system for the Aarush X1 and Aarush X2, where I had the opportunity to develop complementary skillsets – from avionics design and autopilot integration to GNC (guidance, navigation and controls) and communication system design. It was one of the most rigorous experiences which helped fortify my knowledge of algorithms, electronics and systems engineering. For me, I would say witnessing frequent crashes was equally memorable, as it taught me a whole lot about how to learn, rebuild and come back stronger than before. Had I not been part of this during my formative years, I would find it hard to appreciate the real-world systems powered by the algorithms and models I work on now.
“Talking about how you were inspired by seeing your first unmanned aircraft taking off in front of you is a cliche that has been done to death, but is definitely one of the pivotal moments of life.”
I joined the Robotic Systems Development Program at Carnegie Mellon University, where I had an absolute blast working on several projects ranging from developing Autonomous Social-Collaborative Robots in parking lots to building experimental path planning algorithms for extraterrestrial rovers, to developing Reinforcement Learning algorithms for Autonomous Driving agents looking to change lanes. Implementing seminal works in Perception, Planning and Machine Learning, which looked like a daunting task before, became a weekly ritual as part of the many courses and projects I undertook at CMU. During the summer break, I interned at Uber’s Advanced Technologies Group, where I worked on solving Path Planning problems for Autonomous Vehicles using machine learning. I joined Uber ATG full time after graduation and worked on machine learning for object detection, object association and tracking for the past 4.5 years.
AIM: What were the initial challenges, and how did you address them?
Shivam: One of the biggest challenges you would ever face is inertia. For any young robotics enthusiast, going from being passionate about a certain application to actually going and building their first bot/app/model is the hardest. Many brilliant ideas and passions do not bear fruit due to the inertia of either not taking the first step or not knowing how to begin. One thing that works for me is to read before executing; read about how people in the past have thought about the problem, what state-of-the-art solutions exist, what people tried that has worked and what hasn’t.
Learning from other people’s experiences is a method I have employed successfully across multiple domains; it makes the task less daunting and helps me overcome the inertia of going and implementing things. The other quality is flat-out stubbornness. The whole process of building something, seeing it fail, and working long hours to get it to work eventually builds an analytical mindset for designing, debugging, and deploying. As they say, there is no substitute for hard work.
“I built everything from fixed-wing UAVs and VTOL UAVs to self-balancing robots and autonomous antenna arrays.”
AIM: How do you approach any data science problem?
Shivam: The best way to solve a data science problem is to fixate on the problem, not on the solution. When you jump into looking for the best answer to the problem, you tend to bias yourself to solutions you have previously implemented – “when you have a hammer, everything seems like a nail”. Before trying to come up with novel architectures or approaches, ask yourself the following questions –
- Do I really need a machine-learned model for this problem?
- What are the gaps in the current analytical/classical methods of solving this problem?
- How do I design algorithms to go around these problems?
- What constraints does the problem need to satisfy? (e.g. Do I have labels? Are there runtime budgets? What are my compute/memory constraints?)
- What biases are needed to solve the problem? (What feature sets are crucial?)
- What input/output representations are needed? (Is the output a score, probability, or category?)
- What metrics am I trying to optimize? (F1-score, mAP, L2 error, multi-task loss, etc.)
- What tools do I need to iterate fast? (Visualize failure cases, Task-specific losses, etc.)
Once you have invested enough time answering these questions, your likelihood of implementing something that fails after weeks of effort goes down significantly. You would end up with a better initial design, and even if your model or algorithm doesn’t work out-of-the-box, you would be in a better position to improve and iterate fast.
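To make the metrics question in the checklist above concrete, here is a minimal Python sketch of two of the metrics mentioned, F1-score and L2 error. The function names and the example counts are hypothetical and chosen only for illustration; they are not taken from the interview or from any production system.

```python
import math
from typing import Sequence


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts: the harmonic mean of precision and recall.

    Returns 0.0 when there are no true positives, false positives,
    or false negatives at all, to avoid dividing by zero.
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def l2_error(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Euclidean (L2) distance between a predicted and a true position."""
    return math.dist(pred, truth)


# Hypothetical validation counts: 80 correct detections,
# 10 false alarms, 30 missed objects.
print(f1_score(tp=80, fp=10, fn=30))      # 0.8
# Hypothetical predicted vs. true 2D position.
print(l2_error((3.0, 4.0), (0.0, 0.0)))   # 5.0
```

Writing the metric down this early forces a decision on what counts as a hit or a miss, which is often where the real design questions surface.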
AIM: Tell us about your role at your current company. What does a typical day look like?
Shivam: I currently work at Aurora as a Sr. Engineer and a Tech Lead for the Perception Team. We are revolutionizing the logistics and transportation space, focusing on solving both autonomous trucking and passenger mobility at scale. Autonomous Driving is one of the most interesting problems in robotics due to several factors. One of the biggest challenges comes from operating in an unstructured environment with dynamic actors. This opens the door to some fantastic machine learning challenges – detecting static and dynamic objects at long ranges, through occlusions, across multiple sensor modalities, estimating state (position, velocity, acceleration, etc.), predicting future actor behaviour, actor interactions, trajectory planning, and the list goes on.
We need to develop solutions for each of these subproblems that are not only robust to perturbations but can also scale across different actor types, densities, temporal variations, geospatial variations, and other factors. I work on problems related to using deep neural networks for object detection and future state prediction. Some interesting works include a multi-view joint detection-trajectory estimator (MVFuseNet) and a learned joint associator-tracker (SDVTracker). The typical workflow of a Perception ML engineer involves looking at current failure modes, proposing solutions (new loss, new architecture, new task, etc.), implementing new capabilities, training models on a distributed training platform and analyzing experiments. Once we have a model that accomplishes the task at hand, it undergoes extensive testing before being eventually deployed on the self-driving platform.
AIM: What does your machine learning toolkit look like?
Shivam: I currently use Python and PyTorch for model development and have used TensorFlow in the past. I also work with CUDA and TensorRT to optimize models for production.
AIM: Which domain of AI do you think will come out on top in the next 10 years?
Shivam: The landscape of ML has drastically changed in just the last 10 years. If in 2011 you were implementing a perception system for an autonomous system, it would be highly impractical to suggest a multi-view multi-sensor convolutional network that incorporated recurrent neural networks for temporal consistency. Fast forward 10 years, “the impractical” has become “the obvious”. In another 10 years, I expect the machine learning and AI industry to take the next big jump on the shoulders of revolutionary changes in compute and data.
Breakthroughs in quantum computing or photonics could enable the development of bigger, more data-intensive approaches. Couple that with improvements in energy-efficient high-performance computing and cloud computing advancements, and suddenly models that would take days, if not weeks, to train could finish training in a few hours. All this, and we have barely had the chance to mention improvements in communication and networks that have already made it possible to run beefy ML models in the cloud and serve results on mobile computing platforms.
It is really an exciting time to be a machine learning practitioner. As to the question of which domain will come out on top, I feel that all domains, whether it be Computer Vision, NLP, AR/VR, Autonomous Driving, etc., will benefit from the next breakthroughs in compute and memory. This is evidenced by the fact that all these fields have simultaneously shown strong performance gains in the last decade, primarily due to the cross-pollination of ideas.
“Most interviewers are not always looking for just the right answer. Interviewers are more interested in whether a candidate can back his answer with solid reasoning and concepts.”
AIM: What would your advice be to aspirants who want to crack data science/ML roles at your company?
1| Learn: Some suggested reading includes:
a. Machine Learning: A Probabilistic Perspective (Kevin P. Murphy)
b. Deep Learning (Ian Goodfellow, Yoshua Bengio, and Aaron Courville)
c. Understanding Machine Learning: From Theory to Algorithms (Shai Shalev-Shwartz and Shai Ben-David)
2| Have clear concepts of probability theory.
3| Have strong software skills: Knowing your data structures and algorithms goes a long way.
4| Think deeply, think innovatively: Think about the problem before the solution. What has been the go-to way of solving this problem, and what are its shortcomings?
5| MVP Focused: When proposing a solution, always think about the simplest solution that can give you “signal” the fastest.
Instead of implementing a complex solution that would take six months and could potentially fail, think about how to break it down into simpler measurable pieces that could give step improvements during implementation.
Today, most interviewers are not always looking for just the right answer. Interviewers are more interested in knowing if a candidate can back his answer with solid reasoning and concepts, and if given evidence to the contrary, can make sense of the new information and come up with logically sound improvements. So do not fixate on getting to the final answer; make sure you are explaining your thought process and backing your solutions with solid reasoning.