Summary of Reverse Searching Netflix’s Federated Graph

  • netflixtechblog.com

    Netflix Graph Search: Enabling Reverse Search with Elasticsearch

    This article describes how Netflix added "reverse search" to the search layer of its federated graph, using Elasticsearch's percolator feature.

    • Reverse search finds the queries that match a given document, rather than the documents that match a given query.
    • The capability is part of Netflix's Studio Search, since renamed Graph Search, which is used by over 100 applications within the company's engineering organization.

    The Challenge of Dynamic Subsets in Netflix

    Netflix needed to send timely notifications to employees subscribed to dynamic subsets of movies, where membership is defined by criteria rather than by a fixed list of titles.

    • For example, a post-production coordinator might want notifications for movies shooting in Mexico City that don't have a key role assigned.
    • Tracking subscriptions movie by movie was inefficient and did not account for changes in movie data that move a title into or out of a subset.

    Introducing Reverse Search: A Netflix Innovation

    Netflix's solution uses Elasticsearch to implement reverse search.

    • Elasticsearch's percolator fields are used to index Elasticsearch queries.
    • Percolate queries can then be used to determine which indexed queries match a given document.
    • This "reverse" approach allows Netflix to identify the subscriptions that match a particular movie as it changes, enabling precise and efficient notifications.
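
    The two Elasticsearch pieces named above, a `percolator` field and a `percolate` query, can be sketched as plain request bodies. This is a minimal illustration, not Netflix's actual mapping: the field names `shooting_location` and `key_role` are assumptions chosen to mirror the Mexico City example, and nothing here talks to a cluster.

```python
import json

# Hypothetical mapping for a percolator index. "query" stores the search
# itself; the other fields mirror the movie fields that stored queries use.
PERCOLATOR_MAPPING = {
    "mappings": {
        "properties": {
            "query": {"type": "percolator"},
            "shooting_location": {"type": "keyword"},
            "key_role": {"type": "keyword"},
        }
    }
}

# A stored subscription: movies shooting in Mexico City with no key role.
STORED_QUERY_DOC = {
    "query": {
        "bool": {
            "filter": [{"term": {"shooting_location": "Mexico City"}}],
            "must_not": [{"exists": {"field": "key_role"}}],
        }
    }
}


def percolate_request(movie: dict, field: str = "query") -> dict:
    """Build a search body asking which stored queries match `movie`."""
    return {"query": {"percolate": {"field": field, "document": movie}}}


body = percolate_request({"shooting_location": "Mexico City"})
print(json.dumps(body, indent=2))
```

    Sending `body` as a search against the percolator index would return the stored query documents that match the movie, which is the reverse of an ordinary search.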

    SavedSearches: A Key Component of Netflix's Reverse Search Implementation

    Netflix introduced the concept of "SavedSearches" to provide a user-friendly way to manage and persist reverse search queries.

    • SavedSearches are stored in a CockroachDB database and contain information about the filter criteria and the target index.
    • The filter criteria, expressed in Netflix's custom Graph Search DSL, are translated into Elasticsearch queries and indexed in a percolator field.
    • Reverse searches are performed by querying the percolator index with the relevant document, retrieving a list of matching SavedSearches.
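
    As a rough sketch of the translation step, the code below turns a saved filter into an Elasticsearch query and wraps it as a percolator document. The summary does not show the Graph Search DSL, so the `Condition` structure and its `eq` / `missing` operators are a hypothetical stand-in for it, and the `SavedSearch` fields are assumed from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    field: str
    op: str              # "eq" or "missing" in this toy filter language
    value: object = None


@dataclass(frozen=True)
class SavedSearch:
    id: str
    target_index: str
    conditions: tuple    # Condition objects, implicitly AND-ed


def to_es_query(search: SavedSearch) -> dict:
    """Translate the toy filter into an Elasticsearch bool query."""
    filters, must_not = [], []
    for cond in search.conditions:
        if cond.op == "eq":
            filters.append({"term": {cond.field: cond.value}})
        elif cond.op == "missing":
            must_not.append({"exists": {"field": cond.field}})
        else:
            raise ValueError(f"unsupported operator: {cond.op}")
    return {"bool": {"filter": filters, "must_not": must_not}}


def to_percolator_doc(search: SavedSearch) -> dict:
    """Document to index into the percolator field named "query"."""
    return {"saved_search_id": search.id, "query": to_es_query(search)}


search = SavedSearch(
    id="s-1",
    target_index="movies",
    conditions=(
        Condition("shooting_location", "eq", "Mexico City"),
        Condition("key_role", "missing"),
    ),
)
doc = to_percolator_doc(search)
```

    Keeping the SavedSearch id inside the percolator document lets a percolate hit be traced back to the database record that produced it.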

    Versioning: Enabling Continuous Evolution of Netflix's Index

    Netflix employs a versioning strategy to allow for changes and enhancements to its indices without disrupting search functionality.

    • When an index definition changes, a new version of the index is created, along with a new pipeline for populating it.
    • Data Mesh, Netflix's data movement and processing platform, plays a crucial role in backfilling new indices with historical data.
    • Versioning ensures that SavedSearches remain valid and continue to function correctly even when index definitions change.
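
    One way to picture the versioning scheme is a small registry that creates a new version while the previous one keeps serving. The naming pattern (`movies_v2`, `movies_v2_percolate`) and the `promote` step are assumptions made for illustration; the summary does not say how Netflix names indices or switches traffic.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VersionedIndex:
    name: str
    versions: List[int] = field(default_factory=list)
    live: Optional[int] = None   # version currently serving searches

    def create_version(self) -> int:
        """Register a new version; it is not live until promoted."""
        version = len(self.versions) + 1
        self.versions.append(version)
        return version

    def promote(self, version: int) -> None:
        """Switch searches to `version`, e.g. once its backfill is done."""
        if version not in self.versions:
            raise ValueError(f"unknown version {version}")
        self.live = version

    def document_index(self, version: int) -> str:
        return f"{self.name}_v{version}"

    def percolate_index(self, version: int) -> str:
        # Assumed: each version gets its own percolate index, since stored
        # queries are translated against that version's definition.
        return f"{self.name}_v{version}_percolate"


movies = VersionedIndex("movies")
v1 = movies.create_version()
movies.promote(v1)
v2 = movies.create_version()     # index definition changed; v2 backfills
serving_during_backfill = movies.document_index(movies.live)
movies.promote(v2)
```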

    Percolate Indexing Pipeline: A Complex System for Efficient Backfilling

    Indexing SavedSearches into a percolator index takes a multi-step pipeline that covers versioning, data transformation, and error handling.

    • CDC events from CockroachDB trigger the pipeline.
    • Events are filtered based on the target index.
    • Graph Search DSL filters are translated into Elasticsearch queries.
    • Elasticsearch queries are indexed in the appropriate percolate index.
    • Failed indexing events are sent to a Dead Letter Queue for further investigation.
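
    The steps above can be sketched as a single in-memory handler. Everything here is a simplified stand-in: `CdcEvent` is a hypothetical event shape, `translate` accepts only a toy `field=value` filter rather than the real Graph Search DSL, and a dictionary and a list play the roles of the percolate index and the Dead Letter Queue.

```python
from dataclasses import dataclass, field


@dataclass
class CdcEvent:
    saved_search_id: str
    target_index: str
    filter_text: str


@dataclass
class PercolatePipeline:
    target_index: str
    percolate_index: dict = field(default_factory=dict)  # id -> stored doc
    dead_letters: list = field(default_factory=list)     # (event, reason)

    def translate(self, filter_text: str) -> dict:
        """Toy translation: only 'field=value' filters are understood."""
        name, sep, value = filter_text.partition("=")
        name, value = name.strip(), value.strip()
        if not sep or not name or not value:
            raise ValueError(f"cannot translate filter: {filter_text!r}")
        return {"term": {name: value}}

    def handle(self, event: CdcEvent) -> None:
        # Step 2: drop events aimed at a different index.
        if event.target_index != self.target_index:
            return
        try:
            # Steps 3 and 4: translate, then store in the percolate index.
            query = self.translate(event.filter_text)
            self.percolate_index[event.saved_search_id] = {"query": query}
        except Exception as exc:
            # Step 5: keep the failed event for later investigation.
            self.dead_letters.append((event, str(exc)))


pipeline = PercolatePipeline(target_index="movies")
for ev in [
    CdcEvent("s-1", "movies", "shooting_location=Mexico City"),
    CdcEvent("s-2", "talent", "role=director"),   # filtered out
    CdcEvent("s-3", "movies", "not a valid filter"),
]:
    pipeline.handle(ev)
```

    Catching the failure per event keeps one untranslatable SavedSearch from blocking the rest of the stream.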

    Beyond Notifications: Movie Matching with Reverse Search

    Netflix also uses reverse search beyond notifications: the Movie Matching service relies on it to classify movies.

    • Movie Matching uses reverse searches to determine which classification criteria a movie meets based on various attributes, including genre, region, format, and language.
    • This automated classification process eliminates the need for manual configuration of movie workflows, streamlining operations.
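
    The classification idea can be shown without Elasticsearch at all: store the criteria, then ask which of them a given movie satisfies. This is an in-memory analogy under assumed data, not the Movie Matching implementation; the classification names and attribute values are invented for the example.

```python
# Hypothetical criteria: each classification lists the allowed values
# per attribute, and all listed attributes must match.
CLASSIFICATIONS = {
    "latam-spanish-series": {
        "region": {"LATAM"},
        "language": {"es"},
        "format": {"series"},
    },
    "any-documentary": {"genre": {"documentary"}},
}


def matches(criteria: dict, movie: dict) -> bool:
    """True when every attribute in `criteria` allows the movie's value."""
    return all(movie.get(attr) in allowed for attr, allowed in criteria.items())


def classify(movie: dict) -> list:
    """Reverse search: return the names of all criteria the movie meets."""
    return sorted(
        name for name, criteria in CLASSIFICATIONS.items()
        if matches(criteria, movie)
    )


movie = {
    "genre": "documentary",
    "region": "LATAM",
    "language": "es",
    "format": "series",
}
labels = classify(movie)
```

    A percolator index plays the role of `CLASSIFICATIONS` at scale: new criteria are added as data, with no change to the matching code.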

    A Glimpse into the Future: Subscription-Based Search with Netflix

    Netflix is exploring the use of reverse search to power more responsive user interfaces, potentially enabling search results to be delivered through GraphQL subscriptions.

    • Subscriptions could be associated with SavedSearches, and reverse searches could be used to update the results as the index changes.
    • This approach could bring search results within Netflix's platform closer to real time.
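
    Since this is described only as an idea under exploration, the following is a speculative sketch of the mechanism rather than anything Netflix has published. It assumes each subscription holds a saved predicate, and that comparing a document's match status before and after a change is enough to tell listeners whether it entered or left their results. `SubscriptionHub` and its method names are hypothetical.

```python
class SubscriptionHub:
    """Pushes 'added' / 'removed' events to listeners of saved searches."""

    def __init__(self):
        self._predicates = {}   # search id -> predicate over a document
        self._listeners = {}    # search id -> list of callbacks

    def subscribe(self, search_id, predicate, listener):
        self._predicates[search_id] = predicate
        self._listeners.setdefault(search_id, []).append(listener)

    def on_document_change(self, doc_id, before, after):
        """`before` / `after` are the old and new document, or None."""
        for search_id, predicate in self._predicates.items():
            was = before is not None and predicate(before)
            now = after is not None and predicate(after)
            if was == now:
                continue        # membership in this result set is unchanged
            kind = "added" if now else "removed"
            for listener in self._listeners[search_id]:
                listener(kind, doc_id)


events = []
hub = SubscriptionHub()
hub.subscribe(
    "mexico-city",
    lambda doc: doc.get("shooting_location") == "Mexico City",
    lambda kind, doc_id: events.append((kind, doc_id)),
)
hub.on_document_change("m1", None, {"shooting_location": "Mexico City"})
hub.on_document_change("m1", {"shooting_location": "Mexico City"},
                       {"shooting_location": "Madrid"})
```

    In a real system the predicate check would be a percolate query, and the listener would publish to a GraphQL subscription instead of appending to a list.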
