Codemia | Master System Design Interviews Through Active Practice

My Solution for Design Typeahead Suggestion with Score: 6/10

by zephyr_cosmos528

System requirements

Functional:

give the user 10 suggestions that's most popular, sort desc by popularity
as we build up the user history, we should return the customized suggestions
words are in English

Non-Functional:

latency - low latency, the suggestions should be returned as the user types in real time
availability - highly available
consistency - can be relaxed a little bit, it's acceptable if the popularity is updated with a delay

Capacity estimation

assume 100M DAU, average query length is 15 characters, we store 100 personalized words for each user.

TPS: 100M/24/60/60 = 1000 TPS

Storage: 100 * 100M * 365 days * 15b = 5475GB

API design

get_customized_suggestions(user_id, k, prefix)
get_suggestions(k, prefix)

Database design

We don't require strict consistency and the data is not super relational, and consider the data size we can use NoSQL.

General list

word
frequency
created_at
last_updated_at

Personalized list

user_id
word
frequency
created_at
last_updated_at

High-level design

In order to return customized suggestions, we store the general words and words by user in different tables. The general service will read from the general words table while the customize table will read from the words by user table from the database. Word service query both the general and customize servers, and combine the two lists together, returns the top k words to the user.

Each user query will go to the aggregation service to update the frequency in database.

Request flows

The client sends a query to LB, the LB distributes the request to a word service. The word service queries the general service to get the top k general words, and queries the customize service to get the top k customized words. It then combines the two lists and returns to the client. Afterwords, this query is sent to the aggregation service, saved to the aggregation database temporarily. A set of workers read from the aggregation database and update the main database/redis nightly.

Detailed component design

Trie

Find the top k words with a prefix from the database directly is inefficient, so we'll make use of the trie data structure.

a-p-p-l-e

-i-c-a-t-i-o-n

This Trie gives three choices: app, apple, application, when a user types the first letter 'a'. It quickly cuts off possibilities of any words starting from other letters.

On service boot up, the general services will read all general words, and build the trie in memory. If the data size is too large, we can put the general service on a consistent hash ring, managed by a zookeeper so each service handles a set of prefix. Same as the customize services, each service can store the trie for multiple users.

Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...

Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.

Future improvements

The general word list probably don't change that often, it's possible to put that on a CDN, and have an offline job to update it nightly instead of creating a set of general services.