My Solution for Design Typeahead Suggestion with Score: 9/10

by zinan

System requirements


Functional:

  1. Prefix matching
  2. Top 10 results - the most popular and relevant suggestions
  3. Case insensitivity - upper- and lower-case input should be treated the same
  4. Alphabetic input only - the system should filter out special characters and only return valid words


Non-Functional:

  1. High performance. The system should return type-ahead suggestions as fast as possible
  2. High scalability. The system should be able to handle a large number of requests
  3. High reliability. In case of system outages or failures, the system should continue serving stale data rather than returning errors



Capacity estimation

When the user types keywords, the system performs reads; once the user completes the search, if the words are absent from the dictionary, the system performs a write.


DAU: 10 million

Each user makes 10 queries per day, so the total is 100 million queries daily

Assume each search is 20 bytes on average and each word is 5 bytes long, so one search makes about 4 requests (one per word). That means 400 million requests and 100 million * 20 bytes = 2 GB of data sent daily


QPS = 4 * 100 million / (24 * 60 * 60) = 400 million / 86,400 ≈ 4,630/sec


If we store the data for 5 years, we need 5 * 365 * 2 GB = 3,650 GB ≈ 3.7 TB


We can also impose a 200-byte limit on the search box input.
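

A quick sanity check of the numbers above (a minimal sketch in Python; all constants are the assumptions stated in this section):

    DAU = 10_000_000
    QUERIES_PER_USER = 10
    REQUESTS_PER_QUERY = 4          # 20-byte search / 5-byte words
    BYTES_PER_QUERY = 20
    SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

    daily_queries = DAU * QUERIES_PER_USER               # 100 million
    daily_requests = daily_queries * REQUESTS_PER_QUERY  # 400 million
    qps = daily_requests / SECONDS_PER_DAY               # ~4,630/sec
    daily_gb = daily_queries * BYTES_PER_QUERY / 1e9     # 2 GB/day
    five_year_tb = 5 * 365 * daily_gb / 1e3              # ~3.65 TB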


API design

GET /api/v1/suggestion?q=hello&topK=10
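

A response might look like the following (a sketch only; the field names "suggestions", "text", and "score" are assumptions, not a fixed contract):

    {
      "suggestions": [
        { "text": "hello", "score": 9321 },
        { "text": "hello world", "score": 7210 },
        { "text": "hello kitty", "score": 1532 }
      ]
    }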



Database design

This is definitely a read-heavy system, and Elasticsearch is a good fit for it.


Fetching data directly from the database on every request is never a good idea. We can introduce a key-value store such as Redis as a cache to store the data, speed up request processing, and improve performance.
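

A minimal cache-aside read path might look like this (a sketch assuming the redis-py and elasticsearch-py clients; the suggestions:{prefix} key scheme and the "queries" index name are assumptions):

    import json

    import redis
    from elasticsearch import Elasticsearch

    cache = redis.Redis(host="localhost", port=6379)
    es = Elasticsearch("http://localhost:9200")

    def get_suggestions(prefix: str, top_k: int = 10) -> list[str]:
        key = f"suggestions:{prefix}"
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)  # cache hit: serve straight from Redis
        # Cache miss: fall back to an Elasticsearch prefix query.
        resp = es.search(
            index="queries",
            query={"match_phrase_prefix": {"text": prefix}},
            size=top_k,
        )
        results = [hit["_source"]["text"] for hit in resp["hits"]["hits"]]
        cache.set(key, json.dumps(results), ex=60)  # short TTL keeps entries fresh
        return results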



High-level design

Shown in the attached diagram.




Request flows

The client makes a search request -> check the nearest CDN node to see if we can get a cached result -> if not, forward the request to the API Gateway -> the rate limiter validates the request based on a pre-defined algorithm -> if it passes, the request is delivered to the Suggestion Service by the Load Balancer -> the Suggestion Service fetches the top-K suggestions from the cache and returns them if the record is present -> if not, it queries ES to get the result


Meanwhile, the query data is asynchronously published to Kafka so that the Analytics Service can consume it and persist it to the database.


Detailed component design

Suggestion Service - receives HTTPS query requests from the client, interacts with the cache, and publishes messages to Kafka for analytics


Filter Service - filters out special characters and only forwards valid params to the Suggestion Service
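

A minimal sketch of that normalization (normalize_query is a hypothetical helper; the 200-byte cap comes from the capacity section):

    import re

    def normalize_query(raw: str, max_bytes: int = 200) -> str:
        """Lowercase, keep only letters and spaces, and enforce the length cap."""
        q = raw[:max_bytes].lower()    # character cap approximates the byte limit for ASCII
        q = re.sub(r"[^a-z ]", "", q)  # drop digits and special characters
        return " ".join(q.split())     # collapse repeated whitespace

    # normalize_query("HeLLo, w0rld!!") -> "hello wrld"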


Sync-up Service - periodically syncs the top-K suggestions from ES to Redis


Analytics Service - consumes the query events from Kafka, aggregates and analyzes the data, persists it to the database, and is also responsible for pre-populating the data into ES
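

A minimal consumer loop for this service (a sketch assuming the kafka-python client; the "search-queries" topic name is an assumption):

    from collections import Counter

    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "search-queries",  # hypothetical topic the Suggestion Service publishes to
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda v: v.decode("utf-8"),
    )

    counts: Counter[str] = Counter()
    for message in consumer:
        counts[message.value] += 1  # aggregate per-query counts
        # Periodically flush `counts` to the database and re-index hot queries in ES.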


1. How to support top-K results?

Both Redis and ES have features that support top-K results.


We can use a Redis sorted set (ZSet) to rank the suggestion list based on click count. However, it does not offer relevance ranking.
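

For example (a sketch with redis-py; the zset:{prefix} key scheme is an assumption):

    import redis

    r = redis.Redis()

    def record_click(prefix: str, suggestion: str) -> None:
        r.zincrby(f"zset:{prefix}", 1, suggestion)  # bump the click count

    def top_k(prefix: str, k: int = 10) -> list:
        # Highest scores first, i.e. the most-clicked suggestions.
        return r.zrevrange(f"zset:{prefix}", 0, k - 1)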


ES supports ranking based on relevance and several other factors, so we can build a service to copy ranking data from ES into Redis so that users can get results from Redis directly.


2. How to minimize end-to-end latency?

Elasticsearch provides fast reads but would suffer under heavy read load, so we introduce Redis between the database and the backend service.


Redis should also be deployed in a distributed fashion, e.g. as a Redis Cluster, to make it highly available.


Even fetching results from Redis may not be efficient enough if our DAU is very high, so we can pre-fetch hot data into memory or use an in-memory data structure such as a trie; a sketch follows below.
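

A minimal in-process trie sketch (an illustration under the assumptions above, not the full service; the scores stand in for the click counts kept in Redis):

    import heapq

    class TrieNode:
        def __init__(self):
            self.children: dict[str, "TrieNode"] = {}
            self.score = 0  # > 0 marks a complete word

    class Trie:
        def __init__(self):
            self.root = TrieNode()

        def insert(self, word: str, score: int) -> None:
            node = self.root
            for ch in word:
                node = node.children.setdefault(ch, TrieNode())
            node.score = score

        def top_k(self, prefix: str, k: int = 10) -> list[str]:
            node = self.root
            for ch in prefix:
                if ch not in node.children:
                    return []
                node = node.children[ch]
            # Walk the subtree and keep the k highest-scored completions.
            completions = []
            stack = [(node, prefix)]
            while stack:
                cur, word = stack.pop()
                if cur.score > 0:
                    completions.append((cur.score, word))
                for ch, child in cur.children.items():
                    stack.append((child, word + ch))
            return [w for _, w in heapq.nlargest(k, completions)]

In practice, each trie node often caches its own top-K list so a lookup can avoid walking the whole subtree.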


Trade offs/Tech choices

GET or POST for the type-ahead request?

  1. GET requests have better performance than POST when the params are not too long. In our case we have limited the length of the input the user can type, so we can use GET for the best performance.
  2. GET URLs can easily be cached by the browser or a CDN. For example, across /api/v1/suggestion?q=h, /api/v1/suggestion?q=he, /api/v1/suggestion?q=hel, a previously seen result can be returned from the cache, avoiding a server call



Failure scenarios/bottlenecks

If any service fails to respond due to an outage, we can keep pointing queries at the CDN and serve stale results rather than returning errors.


Future improvements

If the data keeps increasing and performance is impacted, we can consider replicating the database; if that still cannot handle the traffic, we can shard vertically or horizontally based on our business requirements.