Council Post: Key things to remember while building data teams

Three factors determine the success of AI solutions: use cases, work culture and teams. In my previous article, I discussed AI use cases and how startup founders can integrate them. Once the wheels are in motion, the next step is to build a capable AI team to bring the idea to fruition. Culture is equally important for nurturing a data-driven environment, but that’s for another article.

The AI team consists of two parts: an ‘external team’ (external to the data team but still part of the core AI team) that works closely with the data team, and the core data team itself. The diagram below is a very high-level representation of the key stakeholders and their roles in the AI project. It broadly represents the different stages of the AI project and the areas where different teams fit in. The ‘data team’ is omnipresent. We will look into each of these teams in detail.

The external data team

The external team is spread across the AI project. AI projects are initiated by the senior leadership with a high-level idea, for example a CEO who wants to build a dynamic pricing system. A product manager or program manager sets the vision for the AI solution and drives it by building a team around the solution. The design team then comes in with the plan of intervention and user experience, after which the engineering team architects and builds the whole solution. The data team is involved in both ideation and execution across the project lifecycle.

Credit: Author

Let us explore the roles of the ‘external data team’ in detail.

Client Manager/Account Manager/CEO/Product Manager/Sales: One of these usually initiates an AI project. They can kick-start the solution and act as the project sponsor. They keep other stakeholders updated on the status of the project and offer overall guidance on the requirements.

Product Managers: They are responsible for driving the vision for the AI solution. Along with establishing measurement criteria, they are involved in aligning the solution, getting sponsorship, leading and delivering the overall solution with other teams like engineering and design.

Their role is to tie together the whole project and its stakeholders (developers, designers, and executives). Product managers are generally responsible for the design, the product-market fit, and getting the product out there. Compared with other projects, an AI project brings a few additional responsibilities and challenges, such as managing unknowns, non-deterministic outcomes, new tools and technologies, and new infrastructure.

Engineering Managers: They are responsible for architecting, building and maintaining specific technology components. They take care of the growth and development needs of the team, think ahead and work towards the reliability of the engineering systems. For an AI project, the engineering manager’s role is to architect the solution, provide the infrastructure for the final AI solution and make sure the integration works. Overall, they are responsible for the reliable execution of the AI solution.

Designer: Designated as the voice of the customer to the internal teams, designers build the user experience for the products. They design the right experience for the AI solution: how the recommendations should appear, where the push notifications should show up or how a customer is informed about the estimated arrival time of the cab. Designers in an AI solution also suggest metrics and guard rails focused on user experience. Machines can often lose sight of the spirit of customer experience, and designers have a role in making the solution more empathetic. They make sure customer experience is at the heart of the product or solution the company is building.

Core data team 

The AI data pipeline, shown in the figure below, is a good way to represent the functions of various data teams. The data pipeline tracks the journey of the data from data ingestion to insights to prediction. The data teams are present across the various parts of the pipeline. We will see each of these roles in detail.

Credit: Author

Data Engineer: Data Engineering is responsible for plumbing data pipelines for high-velocity and high-volume data. Data Engineers architect the data processing systems to ensure usability and reliability. They also make the life of downstream data consumers a lot easier. In a nutshell, the team does all the work needed to collect, store and process the data. 

A data engineer should know common programming languages like Java, Scala, Python and Ruby. For data collection and ingestion, one must be well acquainted with tools such as Kafka, Fluentd and Logstash. For storage tasks, knowledge of NoSQL databases and data warehouses such as Amazon Redshift is highly desirable. Other required skills include competence in Spark, SQL and the ELK Stack.

Data analyst: The role of a data analyst is to ascertain how data can be used to answer questions and solve problems. Data analysts crunch data at scale and provide reports and dashboards that support decision-making. They advise stakeholders on matters concerning certain data points and how they can be improved over time.

A data analyst must know SQL and data visualization tools, and have business intuition.

Business analysts: Compared to data analysts, business analysts are more involved with the business. While data analysts prepare data in formats that can be easily analysed, business analysts apply their hypotheses, thought process and understanding of the business and the product to provide actionable recommendations. They use data to create business insights and recommend suitable actions an organisation can take. They work closely with others across the hierarchy to implement changes based on their findings.

A business analyst requires skills in languages like SQL, R, and Python. Generally, business analyst is a good role even for people from a non-tech background.

Data scientists: Data science is the automation of thinking. Some of the data science that happens at scale in industry is nothing but applied research. Data scientists are analytical experts with a good grasp of the problem at hand, fundamentals of mathematics, programming skills and underlying data systems. They use industry knowledge and contextual understanding to deal with business challenges. Data scientists use machine learning to enable “micro-decisions” at scale, thereby producing multifold business impact.

An ideal data scientist needs to have a thorough understanding of data, algorithmic knowledge, strong fundamentals in mathematics, programming language proficiency, and know-how of the deployment systems.

ML engineer: A machine learning engineer combines software engineering and machine learning. ML engineers take algorithmic models and apply them to large-scale consumer data. An ML engineer needs algorithmic knowledge, strong fundamentals in maths, a good level of programming skill and the know-how to apply systems at scale.

Many of the skills associated with an ML engineer (algorithmic knowledge, fundamentals in mathematics, programming language proficiency, and knowledge of the deployment systems) overlap with those of a data scientist. Additionally, an ML engineer should be able to execute algorithms at scale and rewrite them if necessary.

The skill map

Credit: Author

The above illustration depicts the skills required for each role. A double tick refers to strong expertise and a single tick represents basic knowledge. For example, a data scientist needs to be strong in areas like Python and algorithmic knowledge; however, basic know-how of SQL and business intuition would suffice.

We can see an overlap in the skills typical of roles within the function. This overlap provides career opportunities for people to move across various roles in the data function.
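
As a rough illustration of this overlap, the sketch below encodes only the skills this article lists for the data scientist and the ML engineer, and compares them. The names `ROLE_SKILLS`, `shared_skills` and `skills_to_acquire`, as well as the short skill labels, are assumptions made for this example; they are not taken from the skill map in the illustration.

```python
# Illustrative sketch only: the skill labels paraphrase the role descriptions
# in this article; the structure and names are assumptions for the example.
ROLE_SKILLS = {
    "data scientist": {
        "understanding of data",
        "algorithmic knowledge",
        "mathematics",
        "programming",
        "deployment systems",
    },
    "ml engineer": {
        "algorithmic knowledge",
        "mathematics",
        "programming",
        "deployment systems",
        "executing algorithms at scale",
    },
}


def shared_skills(role_a: str, role_b: str) -> set[str]:
    """Return the skills that both roles have in common."""
    return ROLE_SKILLS[role_a] & ROLE_SKILLS[role_b]


def skills_to_acquire(current_role: str, target_role: str) -> set[str]:
    """Return the skills the target role needs that the current role lacks."""
    return ROLE_SKILLS[target_role] - ROLE_SKILLS[current_role]


if __name__ == "__main__":
    print(sorted(shared_skills("data scientist", "ml engineer")))
    print(sorted(skills_to_acquire("data scientist", "ml engineer")))
```

Under these assumptions, a data scientist moving towards an ML engineer role already shares four of the listed skills and would mainly need to build experience in executing algorithms at scale.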

Wrapping up

They say it takes a village to raise a child. So is the case with an AI solution – there are multiple teams involved in building the solution. The trick really lies in constituting a team that understands the stakes, aligns itself well with the larger purpose and delivers to the best of its ability.

This article is written by a member of the AIM Leaders Council. AIM Leaders Council is an invitation-only forum of senior executives in the Data Science and Analytics industry.
