System requirements
Functional:
List functional requirements for the system (Ask interviewer if stuck)...
a user can generate a shortened URL from a long URL
a shortened URL redirects to its long URL
users can request custom short URLs
a shortened URL is 7-8 characters long
Non-Functional:
List non-functional requirements for the system...
100 million users per day
each user makes on average 5 requests per day
100 million × 5 = 500 million requests per day
Capacity estimation
Estimate the scale of the system you are going to design...
10:1 read/write ratio
5,800 URL shortening requests per second (500 million / 86,400 seconds, treating all 500 million daily requests as writes, which is an upper bound)
58,000 shortened URL reads per second (10 × the write rate)
estimate the shortened URL at 10 bytes and the long URL at 20 bytes
30 bytes per mapping × 500 million writes per day is 15 GB of data per day, about 5.5 TB per year
peak traffic is maybe 5× the average: 58,000 reads per second × 5 = 290,000 reads per second
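The arithmetic behind these estimates can be checked with a short script. This is only a sketch of the numbers already stated here: the 500 million daily requests are treated as writes, and the 5× peak multiplier is the guess from the notes, not a measured value. The function and variable names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_capacity(
    writes_per_day: int = 500_000_000,  # daily requests, treated as writes
    read_write_ratio: int = 10,         # 10:1 reads to writes
    bytes_per_mapping: int = 30,        # 10-byte short URL + 20-byte long URL
    peak_multiplier: int = 5,           # assumed peak-to-average factor
) -> dict:
    """Return back-of-the-envelope traffic and storage figures."""
    writes_per_second = writes_per_day / SECONDS_PER_DAY
    reads_per_second = writes_per_second * read_write_ratio
    storage_per_day_gb = writes_per_day * bytes_per_mapping / 1e9
    return {
        "writes_per_second": writes_per_second,
        "reads_per_second": reads_per_second,
        "peak_reads_per_second": reads_per_second * peak_multiplier,
        "storage_per_day_gb": storage_per_day_gb,
        "storage_per_year_tb": storage_per_day_gb * 365 / 1e3,
    }


if __name__ == "__main__":
    for name, value in estimate_capacity().items():
        print(f"{name}: {value:,.2f}")
```

Running it gives figures that round to the ones listed above: about 5,800 writes per second, 58,000 reads per second, 15 GB per day, and roughly 5.5 TB per year.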
API design
Define what APIs are expected from the system...
POST /api/v1?longurl=longurlstring
GET /api/v1/shorturl
Database design
Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...
Use a NoSQL database: the data model is a simple key-to-value mapping with no complicated join logic, and a NoSQL database would be easier to scale for this use case.
A schema might look like so:
id: uuid
longurl: string
shorturl: string
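As a rough sketch, the schema could be modeled like this before it is written to the store. The `UrlMapping` class and its `to_item` helper are hypothetical names for illustration. Serializing the record to a flat dictionary is an assumption about what a document or key-value store would accept, not a specific database's API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UrlMapping:
    """One record mapping a short URL to its long URL."""

    longurl: str
    shorturl: str
    # A fresh UUID is generated for each new mapping.
    id: uuid.UUID = field(default_factory=uuid.uuid4)

    def to_item(self) -> dict:
        """Flatten the record into plain strings for storage."""
        return {
            "id": str(self.id),
            "longurl": self.longurl,
            "shorturl": self.shorturl,
        }


mapping = UrlMapping(longurl="https://example.com/some/long/path", shorturl="abc1234")
print(mapping.to_item())
```

Since reads look records up by `shorturl`, that field is the natural candidate for the lookup key in whichever store is chosen.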
High-level design
You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design...
We should split reads and writes into separate services, a read service and a write service. This way we can scale reads independently of writes.
We should put these services behind a load balancer to distribute the traffic, and we should also use an API gateway to rate limit requests and validate auth on custom URL requests. We should use a cache for read requests, both because reading from the cache is faster than reading from the database and because it reduces load on the database.
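The notes do not say how the gateway would rate limit. One common option is a token bucket, sketched below as an assumption rather than the chosen design. The `TokenBucket` class, its parameters, and the injectable `clock` are all hypothetical. A real gateway would typically keep one bucket per client or API key.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at a steady rate."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last_refill = clock()

    def allow(self) -> bool:
        """Return True and spend a token if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Add tokens for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(capacity=5, refill_per_second=1.0)
print([bucket.allow() for _ in range(7)])
```

Passing the clock in as a parameter keeps the limiter easy to test without sleeping.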
Request flows
Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...
During a write, we would take the long URL from the query params, run it through a hashing function, and encode the resulting hash in something like base62. We would take a portion of the base62 string and use that as the short URL. We can then save a mapping in the database by generating a new UUID and storing it with the long URL and the newly generated short URL.
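The write path described above can be sketched as follows. SHA-256 and the choice to keep the last 7 characters are assumptions for illustration, since the notes do not name a hash function or say which portion to keep. The `base62_encode` and `shorten` names are hypothetical. Because the hash is truncated, two different long URLs can end up with the same short URL. This sketch does not handle that case, so a real write service would need to check for an existing mapping before saving.

```python
import hashlib

BASE62_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(number: int) -> str:
    """Encode a non-negative integer using the 62-character alphabet."""
    if number == 0:
        return BASE62_ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, 62)
        digits.append(BASE62_ALPHABET[remainder])
    return "".join(reversed(digits))


def shorten(long_url: str, length: int = 7) -> str:
    """Hash the long URL and keep `length` base62 characters of the result."""
    digest = hashlib.sha256(long_url.encode("utf-8")).digest()
    encoded = base62_encode(int.from_bytes(digest, "big"))
    # Pad on the left in the unlikely case the encoding is too short,
    # then keep the trailing characters.
    return encoded.rjust(length, BASE62_ALPHABET[0])[-length:]


print(shorten("https://example.com/some/very/long/path?with=query"))
```

The same long URL always produces the same short URL, so repeated writes of one long URL map to a single key.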
For reading, we would take the short URL and first look in our cache for a mapping between the short URL and a long URL. If it exists, return a 301 response redirecting to the long URL. Otherwise, look in our database for that short URL. If it exists in the database, redirect to the long URL with a 301 response and update the cache to contain this mapping. If the short URL cannot be found in the cache or the database, return a 404 response, as this short URL does not exist.
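The read path is a cache-aside lookup, which can be sketched as below. Plain dictionaries stand in for the cache and the database, and the `resolve` function is a hypothetical name. A real service would call a cache client and a database client instead.

```python
from typing import Dict, Optional, Tuple


def resolve(
    short_url: str,
    cache: Dict[str, str],
    database: Dict[str, str],
) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, redirect target) for a short URL."""
    # 1. Try the cache first.
    long_url = cache.get(short_url)
    if long_url is not None:
        return 301, long_url

    # 2. Fall back to the database and populate the cache on a hit.
    long_url = database.get(short_url)
    if long_url is not None:
        cache[short_url] = long_url
        return 301, long_url

    # 3. Unknown short URL.
    return 404, None


cache: Dict[str, str] = {}
database = {"abc1234": "https://example.com/some/long/path"}
print(resolve("abc1234", cache, database))  # database hit, cache is filled
print(resolve("abc1234", cache, database))  # served from the cache
print(resolve("missing", cache, database))
```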
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
The read service and write service scale well because they are independent of each other, so we can add more instances of either one to scale horizontally as we receive more traffic.
We also want to include telemetry and analytics. We want to be able to see how many users are using our service at any given time, and to know whenever the service fails and for what reason, so we can maintain a good and reliable experience for our users. This is our only real insight into what the system is doing, so it is very important.
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
Creating separate services for reads and writes was a tradeoff to meet growing traffic. It adds complexity, as we could have many different services running and communicating to handle requests, but it is necessary for the scale we are dealing with. We chose a NoSQL database because it is easy to scale horizontally. The downside would show up if we needed more complicated joins or transactions, but for our use case there aren't really tradeoffs with this approach. We also decided to use caching to speed up requests and reduce load on our database. The risk here is cache invalidation: the cache could serve stale data.
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Cache invalidation, the API gateway malfunctioning, redirects to long URLs that no longer exist, system outages, data loss.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?