MITB Banner

Netflix federated graph architecture: How is it searchable?

The Studio Search platform takes a portion of the federated graph and makes it searchable.
Share

In 2020, Netflix announced their Studio API and GraphQL Federation to overcome the complexity of developing the API aggression layer. Netflix explained the two-fold goal of GraphQL Federation is to provide a unified API for consumers and give backend developers flexibility and service isolation. 

Essentially, the architecture of a GraphQL is to provide a gateway layer that brings together different federated services into one unified API endpoint. Further, it allows teams to independently build and operate their domain graph services while connecting the domains under a unified GraphQL schema. This is why the Content Engineering at Netflix has been transitioning several services to use a federated GraphQL platform. 

The problem

Netflix explained the three core entities of the graph as movie, production and talent. Separate engineering teams independently own these:

  • Movie: Every title in the data is a Movie object.
  • Production: Each Movie is associated with a Studio Production. The production object tracks factors integral to making a Movie, such as the shooting location, vendors, and more.
  • Talent: These are the people working on the Movie, like the actors, directors, etc.

These entities are made available in the graph. Following this, the teams need the ability to query for a particular entity based on attributes of related entities. For instance, they may want to search for all movies currently being produced with actor Ryan Reynolds. This is the problem of making a federated graph searchable. As a solution, Netflix created Studio Search.

Netflix Studio Search

The Studio Search platform takes a portion of the federated graph and makes it searchable. The portion is a subgraph rooted in an entity of interest that would be queried with text input, filtered, ranked, and faceted. This was done with Elasticsearch as the underlying technology. 

The team considered three essential factors in building their index pipeline. 

  1. The definition of the subgraph of interest is the primary search
  2. Events that notify the platform of changes to entities in the subgraph
  3. Index specific configuration 

“In short, our solution was to build an index for the subgraphs of interest. This index needs to be kept up-to-date with the data exposed by the various services in the federated graph in near-real-time,” said the Netflix engineering team. GraphQL enables this by providing a straightforward way to define the subgraph through a single templated GraphQL query. The query pulls all of the data the user is interested in using their searches. 

Architecture

The ‘Change Data Capture’ events trigger the reindexing operation for individual entities when they change, ensuring the index is up to date. Netflix’s teams leverage Netflix’s CDC connectors. Consequently, the federated graph would fetch all the data indexed, making the factors needed in the events an entity id. When substituted into the GraphQL query template, this id fetches the entity and related data. Elastic search, the underlying technology, is used to create text analysers that can be generalised across domains. The type information from the GraphQL query template and user-specified index configuration are leveraged.

Finally, a Data Mesh pipeline is created based on these inputs. A more recent concept, data mesh is decentralised architecture, marking an organisational shift in the way enterprises manage big data. It solves the challenges of lack of ownership of data and lack of quality data and removes bottlenecks to encourage organisational scaling. Data mesh treats the data as a product. Each source has a data product owner, who could ideally be part of the cross-functional team of data engineers. At Netflix, the mesh consists of the user-provided CDC event source, a processor to enrich those events using the user-provided GraphQL query, and a sink to Elasticsearch.

Image source: Netflix

Netflix’s Studio applications produce events to schematised Kafka streams within Data Mesh in the architectural process. It does so in two ways. First, it processes by transacting with a database monitored by a CDC connector that creates events. Alternatively, it directly creates events using a Data Mesh client. These schematic events are consumed by Data Mesh processors implemented in the Apache Flink framework. A union processor is used to combine data from multiple Kafka streams when entities have multiple events. Next, a GraphQL processor executes the user-provided GraphQL query to fetch documents from the federated gateway. In turn, the federated gateway fetches data from the Studio applications. Lastly, the information fetched from the gateway is put onto another schematised Kafka topic. Then, it is processed by an Elasticsearch sink in data mesh, indexing it into Elasticsearch index – configured with an indexing template personalised for the fields and types present in the document.

Reverse lookups

Movies are always changing, be it production or talent detail change. The reverse lookup feature is a solution to update the model when a change to a related entity is made. The model looks up all the primary entities that could be affected by the change to reflect the alternations. It then triggers events for the entities. The model does so by consulting the index and querying for all primary entities that may be related. The pipeline also observes a change to the Production, “with ptpId “abc”; we can query the index for all documents with production.ptpId == “abc” and extract the movieId,” the team explained. Later, the movieId is passed down into the rest of the indexing pipeline. The team also revealed this feature to be convenient yet not user friendly. 

PS: The story was written using a keyboard.
Share
Picture of Avi Gopani

Avi Gopani

Avi Gopani is a technology journalist that seeks to analyse industry trends and developments from an interdisciplinary perspective at Analytics India Magazine. Her articles chronicle cultural, political and social stories that are curated with a focus on the evolving technologies of artificial intelligence and data analytics.
Related Posts

CORPORATE TRAINING PROGRAMS ON GENERATIVE AI

Generative AI Skilling for Enterprises

Our customized corporate training program on Generative AI provides a unique opportunity to empower, retain, and advance your talent.

Upcoming Large format Conference

May 30 and 31, 2024 | 📍 Bangalore, India

Download the easiest way to
stay informed

Subscribe to The Belamy: Our Weekly Newsletter

Biggest AI stories, delivered to your inbox every week.

AI Courses & Careers

Become a Certified Generative AI Engineer

AI Forum for India

Our Discord Community for AI Ecosystem, In collaboration with NVIDIA. 

Flagship Events

Rising 2024 | DE&I in Tech Summit

April 4 and 5, 2024 | 📍 Hilton Convention Center, Manyata Tech Park, Bangalore

MachineCon GCC Summit 2024

June 28 2024 | 📍Bangalore, India

MachineCon USA 2024

26 July 2024 | 583 Park Avenue, New York

Cypher India 2024

September 25-27, 2024 | 📍Bangalore, India

Cypher USA 2024

Nov 21-22 2024 | 📍Santa Clara Convention Center, California, USA

Data Engineering Summit 2024

May 30 and 31, 2024 | 📍 Bangalore, India