“I only believe in statistics that I doctored myself.”
For this week’s ML practitioner’s series, Analytics India Magazine (AIM) got in touch with Ingo Alzner, who manages the Data Lake project at Porsche. Alzner has played an instrumental role in establishing Porsche’s big data ambitions. Currently, he represents Porsche in the cloud initiative that is designing Volkswagen’s IT cloud strategy. Over the last few years, Alzner has helped build Porsche into a data-driven company. We tried to find out how Alzner and his team are embedding cutting-edge data principles into the daily workings of a traditional but innovative automaker like Porsche.
Alzner majored in data analytics at Hochschule der Medien, Stuttgart. He also spent a semester abroad at Udayana University, Indonesia, where he studied international business and learned a lot about Asia. Since his schooldays, Alzner has been fascinated with analysing data and building charts. Alzner swears by the adage often attributed to Winston Churchill: “I only believe in statistics that I doctored myself.” Alzner has also interned and worked at Horváth & Partners Management Consultants, where he learned a lot about collecting, transforming and presenting data.
When asked about the challenges he faced early in his career, Alzner quickly pointed to data quality. “It was mostly the same challenge as today – data quality. Most of the time is being spent to get the data in the right format and quality. 10 years ago, most of the data was stuck in Excel tables and there was a huge amount of work cleaning it and loading it into databases. Today, there is much more data, stored in big data systems, but data quality still remains the challenge. And without proper data quality, you are not able to analyze correctly,” he said.
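To make the kind of cleaning work Alzner describes concrete, here is a minimal Python sketch, not Porsche’s actual pipeline. It assumes a small, made-up spreadsheet export in CSV form; the column names, the sample values and the helper functions clean_rows and load are hypothetical. Rows with missing or malformed values and exact repeats are set aside before the rest is loaded into an in-memory SQLite database.

```python
import csv
import io
import sqlite3
from datetime import date

# Hypothetical spreadsheet export; columns and values are invented for illustration.
RAW_CSV = """vehicle_id,recorded_on,mileage_km
 P-001 ,2021-03-01,12500
P-002,2021-03-02,
P-003,not a date,8300
P-001,2021-03-01,12500
"""


def clean_rows(text):
    """Split rows into (valid, rejected) using a few basic quality checks."""
    valid, rejected, seen = [], [], set()
    for row in csv.DictReader(io.StringIO(text)):
        # Trim stray whitespace, a common artifact of hand-edited spreadsheets.
        row = {key: (value or "").strip() for key, value in row.items()}
        try:
            date.fromisoformat(row["recorded_on"])  # must be YYYY-MM-DD
            mileage = int(row["mileage_km"])        # must be a whole number
        except ValueError:
            rejected.append(row)
            continue
        # Reject a second reading for the same vehicle and day as a duplicate.
        key = (row["vehicle_id"], row["recorded_on"])
        if key in seen:
            rejected.append(row)
            continue
        seen.add(key)
        valid.append((row["vehicle_id"], row["recorded_on"], mileage))
    return valid, rejected


def load(rows):
    """Load cleaned rows into an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (vehicle_id TEXT, recorded_on TEXT, mileage_km INTEGER)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


valid, rejected = clean_rows(RAW_CSV)
conn = load(valid)
print(len(valid), "rows loaded;", len(rejected), "rows rejected")
```

Keeping the rejected rows rather than silently dropping them fits Alzner’s later advice: they are exactly the anomalies worth walking through with the data owner.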
Last year, Alzner worked as an AI delivery manager for the [email protected] programs, enabling and coaching AI product teams working on sounce.io, chatbots and car data use cases. At the beginning of this year, Alzner switched roles and now represents Porsche at the VW Group cloud transformation initiative. Together with colleagues from brands like Volkswagen, Audi and Skoda, he oversees the design and implementation of common cloud strategies. “The car data domain is not an easy domain. In lots of cases, the domains of engineering and data science are fluent and classical data scientists need to learn a lot about engineering until they are really able to generate value. Just imagine analyzing sensor data from hundreds of sensors from our high performance chassis. You first have to learn what most of the sensors are meant for. Hence the real experts are those who stayed a long time in the car domain and did not jump from finance to biotech to cars,” said Alzner.
When approaching a data science problem, Alzner suggests spending as much time as possible with the data owner. “This will save you lots of time, because he/she is mostly able to explain to you all the anomalies you might otherwise spend hours mulling over,” he added. Talking about the tool stack, Alzner said Porsche’s ML endeavours run on Cloudera’s Data Platform and Amazon SageMaker for machine learning, with Python as the programming language of choice.
Alzner believes a good data-driven company places more emphasis on transparency and on end-to-end collaboration across all business departments. According to him, the most valuable use cases are those that use data from multiple departments, because the true potential of data can never be realised if it just resides in silos. This may sound trivial, but Alzner considers it a big challenge for enterprises. “With tools like a use case library or a data catalog you are able to get great transparency and new data scientists are quickly able to get an overview of what is going on. Otherwise, the same idea will come up a hundred times. At Porsche, we just anchored the learnings of the last few years and deeply integrated the data-driven principles in our company strategy 2030,” he said.
When asked about the AI hype, Alzner said deep learning has been applied to almost every use case for a while now, often unsuccessfully. However, in the future, more differentiated use of the methods will stand out. “It was shown that rule engines are often good enough and deep learning is not needed in lots of cases. I don’t know which domain will come out on top, but I believe we will finally see autonomous vehicles on the road in the next few years,” said Alzner.
We asked Alzner about his go-to data science reading resources and he picked KDnuggets.com and hackernoon.com. He feels beginners will find them handy. For those aspiring to get into data science and related fields, Alzner suggests taking the long view and spending time on understanding the contours of data science, but not at the expense of pursuing the specialisation one is passionate about. For a data scientist, knowing the algorithms or the BI tools won’t suffice. They should also be aware of factors that might appear extrinsic on the surface. For example, Alzner stressed the importance of knowing GDPR if one wants to work in EU countries.