What Can Data Scientists Learn From Core Engineering Disciplines

Core engineering disciplines are more mature in adopting processes and engineering best practices, which help scale MVPs to large-scale implementations.

“Everyone you will ever meet knows something you don’t”

Bill Nye

Today is the day to cherish the contributions of Bharat Ratna awardee Sir Mokshagundam Visvesvaraya to the field of engineering and education. Every year since 1968, when the Indian government declared Visvesvaraya’s birth anniversary Engineer’s Day, we have acknowledged our engineers’ role in nation-building.

Rather than simply celebrating the day, it is a good time to learn from the various engineering disciplines. The need is especially strong for data scientists, whose role is touted as the hottest job of the 21st century. Beyond the hype, though, the fact remains that it is still very difficult to land a job as one. So, we reached out to prominent engineering experts for a well-informed perspective.

To begin with, one should have a basic understanding of why it matters in the first place. “An education in engineering is about training yourself to solve problems using a scientific and first principle approach. Engineering trains one to be forever curious and keep learning to get better. Even though professionals from non-technical backgrounds are successfully transitioning into a career in technical domains, engineers equipped with basic knowledge of computer hardware, software, programming, and other networking tools find it more convenient to adapt to this switch,” said Hari Krishnan Nair, Co-founder at Great Learning.

What one can learn

Data science work means solving complex problems, and that requires a good mix of skill sets. Bharat Raizada, Chief Technology Officer at Wells Fargo India & Philippines, shared three fundamental engineering principles to adopt:

  • Beware the broken windows: Core engineers are always aware of the design compromises they make. As in the real world, one broken window (one bad design choice) in the neighbourhood is okay, but multiple broken windows are a slippery slope to a high-crime neighbourhood. Good engineers know this, and data scientists need to keep it in mind when designing solutions.
  • Garbage in, garbage out: This comes straight out of the first program anyone compiled. Bad logic leads to bad output. Similarly, poor-quality or biased training data sets will lead to a badly trained or biased machine, as in the Microsoft Tay rollout.
  • Design by first principles: Core engineers have frameworks and abstractions to accelerate solution design. However, the most successful engineers can go back to first principles, and the same holds for data scientists. A firm grounding in basic principles is critical.
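The "garbage in, garbage out" principle can be turned into a small gate that runs before any training starts. The sketch below is only an illustration under stated assumptions: the function name validate_training_rows, the row format (a list of dictionaries) and the 90% class-share limit are hypothetical choices, not something the experts quoted here prescribed.

```python
from collections import Counter


def validate_training_rows(rows, required_fields, label_field, max_class_share=0.9):
    """Return a list of human-readable problems found in the training rows.

    rows is assumed to be a list of dicts; an empty list means no problems.
    """
    if not rows:
        return ["dataset is empty"]

    problems = []

    # Missing values are the simplest form of "garbage".
    for i, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) is None:
                problems.append(f"row {i}: missing value for '{field}'")

    # A label that dominates the data is a crude warning sign of skew or bias.
    labels = Counter(
        row[label_field] for row in rows if row.get(label_field) is not None
    )
    if labels:
        top_label, top_count = labels.most_common(1)[0]
        share = top_count / sum(labels.values())
        if share > max_class_share:
            problems.append(
                f"label '{top_label}' makes up {share:.0%} of rows "
                f"(limit {max_class_share:.0%})"
            )

    return problems
```

A training job could refuse to start whenever this list is non-empty. The check is deliberately crude: it catches missing values and heavy label imbalance, not subtler forms of bias.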

Teamwork is a crucial component of data science. Take, for example, a data scientist who specialises in model development and optimisation but has limited experience in data engineering. Someday, that person will need to understand the technical details of data flow and architecture to complete specific tasks. Instead of searching Stack Overflow and Quora, they can seek help directly from a team member with the relevant technical knowledge.

Supporting this argument, Dileep Mangsuli, Head of Development Center, Siemens Healthineers, said, “Engineering recognised the importance of teams early on, and it got refined into the arts of project and people management. Data science is also a team sport. Additionally, standards have enabled engineering to evolve faster and build on the achievements of others effectively. Data scientists can also learn about setting standards that specify generally accepted professional practices from engineering. But, perhaps the most critical aspect data scientists can learn from engineering is establishing a code of ethics. Societies have funded the investments in engineering because of their confidence that comes from the benefits from earlier investments. A code of ethics for data science will help ensure the advances made are socially sustainable.”

As in engineering or manufacturing processes, it is better to “fail fast” in data science, since early detection can save effort and resources. This matters even more in the semiconductor industry. Tushar Vrind, Technical Director, System LSI Unit, Samsung Semiconductor India R&D (SSIR), said, “The role of data scientists has become essential to make decisions based on insights; however, to solve real business problems, they have to also be aligned with core engineering and product development teams. While engineering as a discipline is focused on applying scientific principles for building systems, the foundation of building a successful engineering team lies in repeating the feat within an estimated schedule, cost, and resources. One of the biggest challenges at hand that engineering teams in the semiconductor industry have is to capture the metrics defining these clearly and reusing them to forecast or accurately estimate the engineering efforts in the future when a similar or related project needs to be carried out. As a starting point, data scientists can collaborate with the engineering and product teams to discover these explicit and hidden features and accurately model the metrics for forecasting the schedule, cost, and resources.”
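As a rough illustration of reusing past project metrics to forecast effort, the sketch below fits a straight line to historical data with ordinary least squares. It is a minimal example, not the method used at any company mentioned here: the function names, the single input feature and the idea of measuring effort in engineer-weeks are all assumptions made for the example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_effort(past_sizes, past_efforts, new_size):
    """Forecast effort for a new project from past (size, effort) records.

    'size' is a hypothetical metric such as the number of design blocks;
    'effort' is a hypothetical unit such as engineer-weeks.
    """
    slope, intercept = fit_line(past_sizes, past_efforts)
    return slope * new_size + intercept
```

A real estimator would draw on many more of the "explicit and hidden features" Vrind describes and would report uncertainty alongside the point estimate. The value of even a simple model like this is that its estimates are reproducible and can be compared with actuals after each project.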

Better serve the end goals

While AI and ML have become mainstream, it’s time for the industrialisation of AI. This will shift the focus from technology to outcomes, empower learning and continuous optimisation across the organisation and turn data into predictions. Sandhya Balakrishnan, Senior Director, Data Analytics and Engineering, Brillio, says that the core engineering disciplines are more mature in their adoption of processes and engineering best practices, which help scale MVPs to large-scale implementations. Three fundamentals that can be abstracted, customised and then applied to data science to truly industrialise AI are:

  • Process best practices such as agile and testing rigour: These practices bring predictability to the product release and better visibility into the iterations and complexities involved. Without these disciplines, data science projects can go through an endless number of iterations with no clear acceptance criteria. Clear acceptance criteria will also ensure the product is focused not just on accuracy but also on the real-world challenges of scalability.
  • Engineering best practices such as DevOps: Adoption of ModelOps, an abstraction of DevOps for data science, can significantly help break silos across the model management lifecycle. It ensures that valuable data science bandwidth is spent on building new models rather than on process and integration challenges or on the monitoring and management needed to keep models current.
  • Change management best practices: Design thinking best practices adopted by product organisations can significantly help demystify data science, build transparency and explainability, and thereby drive the adoption of data science.
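The point about clear acceptance criteria can be made concrete by writing the criteria down as code and checking every candidate model against them before release. The following is a sketch under assumptions: the class AcceptanceCriteria, the function check_release, the metric names and any thresholds passed in are hypothetical, chosen only to show accuracy and a scalability measure being judged together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptanceCriteria:
    """Release thresholds agreed with stakeholders before modelling starts."""
    min_accuracy: float      # fraction between 0 and 1
    max_latency_ms: float    # 95th-percentile latency budget per prediction


def check_release(metrics, criteria):
    """Return the list of unmet criteria; an empty list means 'ready to ship'.

    metrics is assumed to be a dict with 'accuracy' and 'p95_latency_ms' keys.
    """
    failures = []
    if metrics["accuracy"] < criteria.min_accuracy:
        failures.append(
            f"accuracy {metrics['accuracy']:.3f} is below "
            f"the required {criteria.min_accuracy:.3f}"
        )
    if metrics["p95_latency_ms"] > criteria.max_latency_ms:
        failures.append(
            f"p95 latency {metrics['p95_latency_ms']:.0f} ms exceeds "
            f"the budget of {criteria.max_latency_ms:.0f} ms"
        )
    return failures
```

Run as a step in a ModelOps pipeline, a check like this gives each iteration a definite pass or fail outcome, so that a model which is accurate but too slow to scale is rejected for a stated reason.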

Continuous learning is key to our existence. Active learning matters more than sitting idle or getting stuck with one’s expertise. Simply put, “better learn, or perish” is an apt phrase no matter your domain, and a collaborative approach is always a saviour.

Kumar Gandharv

Kumar Gandharv, PGD in English Journalism (IIMC, Delhi), is setting out on a journey as a tech journalist at AIM and is a keen observer of national and IR-related news.