System requirements
Functional:
List functional requirements for the system (Ask interviewer if stuck)...
- Support 50 billion links
- Short link generator
- Expiration date
Non-Functional:
List non-functional requirements for the system...
- User Authorization to links
- Checking for duplicates, reduced storage
- User's saved short links
Capacity estimation
Estimate the scale of the system you are going to design...
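On average, 1 link = 32 characters = 32 bytes (assuming one byte per character). If each user converts ~5 links per day with an expiration of 30 days, each user holds at most 5 * 30 = 150 live links, which needs 32 * 150 = 4,800 bytes per month, or ~4.8 KB. To support ~1 billion users, we would need 4.8 billion KB, or ~4.8 TB, per month. This counts only the URL strings, not the ID and timestamp stored with each entry.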
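The arithmetic above can be checked with a short script. This is a minimal sketch that uses only the figures from the estimate; the constant and function names are made up for illustration, and decimal units (1 KB = 1,000 bytes) are assumed.

```python
# Figures taken from the estimate above; decimal units (1 KB = 1000 bytes) assumed.
BYTES_PER_LINK = 32            # 32 characters at one byte each
LINKS_PER_USER_PER_DAY = 5
RETENTION_DAYS = 30
USERS = 1_000_000_000


def storage_estimate(users: int = USERS) -> dict:
    """Return the per-user and total storage implied by the assumptions above."""
    links_per_user = LINKS_PER_USER_PER_DAY * RETENTION_DAYS   # live links per user
    bytes_per_user = BYTES_PER_LINK * links_per_user
    total_bytes = bytes_per_user * users
    return {
        "links_per_user": links_per_user,
        "bytes_per_user": bytes_per_user,
        "total_links": links_per_user * users,
        "total_terabytes": total_bytes / 1e12,
    }


if __name__ == "__main__":
    for name, value in storage_estimate().items():
        print(f"{name}: {value}")
```

Under these assumptions the live link count comes to 150 billion, which is above the 50 billion links listed in the functional requirements, so one of the two figures may need revisiting.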
API design
Define what APIs are expected from the system...
Create TinyURL (URL) --> returns new TinyURL with a 200 response;
updates DB with { tinyURLId: UID, URL reference: string, creationTimeStamp: Date }
Get TinyURL (URL) --> returns TinyURL;
retrieves from DB using URL { tinyURLId: UID, URL reference: string }
Delete TinyURL (URL) --> returns 200 response on successful delete
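The three operations can be sketched against an in-memory dictionary standing in for the DB. Several things here are assumptions rather than part of the design above: the class and method names, generating the ID by base62-encoding a counter, returning the existing entry when the same URL is submitted twice (one way to handle the duplicate check), and the 404 status for a missing entry.

```python
import string
from datetime import datetime, timezone

ALPHABET = string.digits + string.ascii_letters  # 62 symbols: 0-9, a-z, A-Z


def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base62 string (assumed ID scheme)."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class TinyUrlService:
    """Hypothetical in-memory stand-in for the application server plus DB."""

    def __init__(self):
        self._by_url = {}    # long URL -> stored record
        self._next_id = 1

    def create(self, url: str):
        # Duplicate check: reuse the record if this URL was already shortened.
        existing = self._by_url.get(url)
        if existing is not None:
            return 200, existing
        record = {
            "tinyURLId": encode_base62(self._next_id),
            "url": url,
            "creationTimeStamp": datetime.now(timezone.utc),
        }
        self._next_id += 1
        self._by_url[url] = record
        return 200, record

    def get(self, url: str):
        record = self._by_url.get(url)
        return (200, record) if record is not None else (404, None)

    def delete(self, url: str) -> int:
        # 200 on successful delete; 404 (an assumption) if nothing was stored.
        return 200 if self._by_url.pop(url, None) is not None else 404
```

A counter keeps IDs short and collision-free on a single node; a multi-server deployment would need a different allocation scheme, which this sketch does not cover.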
Database design
Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...
We need one database to cover the functional requirements:
TinyURL DB:
tinyURLId: UID
URL reference: string
creationTimeStamp: Date
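The record above maps naturally onto a small typed structure. The following is a sketch only: the Python field names, the `expires_at` and `is_expired` helpers, and deriving expiry from the creation timestamp plus the 30-day limit are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = timedelta(days=30)  # 30-day expiration from the requirements


@dataclass(frozen=True)
class TinyUrlRecord:
    tiny_url_id: str              # tinyURLId: UID
    url: str                      # URL reference: string
    creation_timestamp: datetime  # creationTimeStamp: Date

    def expires_at(self, ttl: timedelta = DEFAULT_TTL) -> datetime:
        """Expiry is derived, not stored: creation time plus the TTL."""
        return self.creation_timestamp + ttl

    def is_expired(self, now: datetime, ttl: timedelta = DEFAULT_TTL) -> bool:
        return now >= self.expires_at(ttl)


if __name__ == "__main__":
    rec = TinyUrlRecord("abc123", "https://example.com",
                        datetime(2024, 1, 1, tzinfo=timezone.utc))
    print(rec.expires_at())
```

Deriving the expiry keeps the stored record at the three fields listed above; storing an explicit expiration date per entry would be the alternative if links need different lifetimes.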
High-level design
You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design...
We need load balancers in front of the application servers, which talk to the DB and return the new tiny URL. We need high availability; because each entry is independent of the others, there is no need for strong consistency.
We can also add a cache that serves as fast access for the most commonly retrieved TinyURLs.
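One way to keep the most commonly retrieved TinyURLs close at hand is a least-recently-used (LRU) cache. The design above does not name an eviction policy, so LRU is an assumption here, as are the class name and the capacity parameter.

```python
from collections import OrderedDict


class LruCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = OrderedDict()   # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None               # cache miss: caller falls back to the DB
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used


if __name__ == "__main__":
    cache = LruCache(capacity=2)
    cache.put("1", "https://example.com/a")
    cache.put("2", "https://example.com/b")
    cache.get("1")                           # "1" is now the most recent
    cache.put("3", "https://example.com/c")  # evicts "2"
    print(cache.get("2"), cache.get("1"))
```

On a miss the application server would read from the DB and call `put`, so popular entries stay in the cache while rarely used ones age out.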
Request flows
Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...
Clients connect to a load balancer, which routes each client to an application server that is available to handle the request. Once connected to an application server, the client can send a request to create a new tinyURL or get an existing tinyURL link.
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
The database will be scaled for high availability, accepting weaker consistency as a trade-off. Given the high volume of incoming requests, most traffic will consist of creating new entries in the database and retrieving existing ones. Keeping the DB highly available is in our best interest so that customers can always generate a TinyURL.
The load balancer is essential for routing each request to an application server that is available and can process it in a reasonable amount of time.
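The routing step can be sketched as a round-robin balancer that skips servers marked unavailable. Round-robin is an assumption, since the design above does not specify a balancing strategy, and the class, method, and server names are hypothetical.

```python
class RoundRobinBalancer:
    """Cycle through application servers, skipping ones marked as down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._cursor = 0

    def mark_down(self, server: str) -> None:
        self._healthy.discard(server)

    def mark_up(self, server: str) -> None:
        if server in self._servers:
            self._healthy.add(server)

    def pick(self) -> str:
        # Try each server at most once per call, starting at the cursor.
        for _ in range(len(self._servers)):
            server = self._servers[self._cursor]
            self._cursor = (self._cursor + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no application server available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    lb.mark_down("app-2")
    print([lb.pick() for _ in range(4)])
```

A production balancer would learn server health from periodic checks rather than explicit `mark_down` calls; that part is left out of the sketch.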
We can also run a job every day that scans the database for entries past the 30-day limit and deletes them, freeing up storage.
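The daily scrub can be sketched as a single pass over the stored creation timestamps. The dictionary here is a stand-in for the DB, and the function name and signature are assumptions; a real job would more likely issue a delete query filtered on creationTimeStamp.

```python
from datetime import datetime, timedelta, timezone


def purge_expired(store: dict, now: datetime,
                  ttl: timedelta = timedelta(days=30)) -> int:
    """Delete entries created at or before (now - ttl); return how many were removed.

    `store` maps tinyURLId -> creation timestamp (stand-in for the DB table).
    """
    cutoff = now - ttl
    # Collect keys first so the dict is not modified while iterating over it.
    expired = [key for key, created in store.items() if created <= cutoff]
    for key in expired:
        del store[key]
    return len(expired)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    store = {
        "old": now - timedelta(days=45),
        "new": now - timedelta(days=2),
    }
    print(purge_expired(store, now), sorted(store))
```

Running this once a day means an entry can outlive its 30-day limit by up to a day, so reads would still need to check the creation timestamp if strict expiry matters.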
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
Trade-offs include deleting expired entries versus keeping them. Deleting them means a somewhat slower service (we need to scrub data past the 30-day limit), while keeping them means more storage is needed to hold the larger number of TinyURLs.
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Failure scenarios could happen
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?