System requirements
Functional:
- Generate alias for a URL
- Update the original URL without changing the alias
- Get original URL given the alias
- Delete an alias
- Alias can be activated or deactivated
  - on demand
  - at scheduled time
  - indefinitely
- Get original URL can be performed by
  - Any user
  - Authorised user(s) or user group(s)
- Update or delete alias can be performed by
  - Creator of alias
  - Authorised user(s) or user group(s)
- Insights
  - Order aliases by traffic
- User should be able to customise the alias based on availability (in a premium plan)
Non-Functional:
- URL cannot be longer than 1000 characters
- One alias can refer to one URL but one URL can have multiple aliases
- Generating alias for the same URL multiple times should return different aliases
- It should not be possible to reverse engineer the original URL from the alias
- Service should be highly available because downtime would make URLs inaccessible
- Consistency can be compromised in the event of network partition
- A certain URL can be very popular and become hot
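One way to satisfy the requirements above (a different alias on every call, and no way to derive the URL from the alias) is to draw the alias at random instead of hashing or encoding the URL. The following is a minimal sketch under that assumption. The 8-character length and the a-z, 0-9 alphabet come from the capacity estimation, while `generate_alias`, `is_taken`, and `max_attempts` are hypothetical names introduced for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits  # a-z and 0-9, 36 symbols
ALIAS_LENGTH = 8


def generate_alias(is_taken, max_attempts=5):
    """Return a random alias that is not already in use.

    `is_taken` is a hypothetical callable that looks the candidate up in
    the alias store. The alias is independent of the URL, so it cannot be
    reversed, and repeated calls for the same URL give different aliases.
    """
    for _ in range(max_attempts):
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(ALIAS_LENGTH))
        if not is_taken(candidate):
            return candidate
    raise RuntimeError("could not find a free alias")


# Usage with an in-memory set standing in for the alias store.
existing = set()
first = generate_alias(lambda a: a in existing)
existing.add(first)
second = generate_alias(lambda a: a in existing)
```

With roughly 20 billion live aliases in a space of 36^8, a collision on any single draw is unlikely, so a small retry limit should be enough in practice.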
Capacity estimation
- Assume that 50 million users generate one alias/day.
- 10% of aliases take 90% of the traffic i.e. 5M URLs take 90% traffic.
- Each of these hot URLs is hit 1000 times/day, i.e. 5 billion requests/day, i.e. roughly 50 thousand requests/sec.
- On average, an alias is active for 1 year. Hence, at any time, 365 * 50M ≈ 18 billion aliases are required, rounded up to 20 billion. If an alias consists of a-z and 0-9 characters (36 symbols), 7 characters are enough (36^7 ≈ 78 billion). For future scalability, let us take 8 characters (36^8 ≈ 2.8 trillion).
- Write traffic = 500 TPS
- Read traffic = 50,000 TPS
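The estimates above can be checked with a few lines of arithmetic. This sketch uses only the figures stated in the list; the variable names are illustrative. The exact results land slightly above the round numbers quoted, which treat a day as about 100,000 seconds.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

aliases_per_day = 50_000_000             # 50M users, one alias each per day
hot_aliases = aliases_per_day // 10      # 10% of aliases take 90% of traffic
reads_per_day = hot_aliases * 1000       # each hot alias is hit 1000 times/day

write_tps = aliases_per_day / SECONDS_PER_DAY  # about 579 per second
read_tps = reads_per_day / SECONDS_PER_DAY     # about 57,870 per second

live_aliases = 365 * aliases_per_day     # aliases active at any time

# Smallest alias length n such that 36**n covers all live aliases.
min_length = 1
while 36 ** min_length < live_aliases:
    min_length += 1

print(round(write_tps), round(read_tps), live_aliases, min_length)
```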
Space requirements:
Average original URL length = 100 bytes i.e. 0.1 KB
Average alias length = 8 bytes
20 billion aliases require = 0.1 KB * 20B = 2 TB
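As a quick check of the storage figure, the sketch below multiplies the stated averages by the 20 billion live aliases. It also shows the total when the 8-byte alias is counted alongside the URL. Other columns and index overhead are not included, so this is a lower bound rather than a sizing recommendation.

```python
live_aliases = 20_000_000_000  # rounded-up count of live aliases
url_bytes = 100                # average original URL length
alias_bytes = 8                # average alias length

BYTES_PER_TB = 1e12            # decimal terabyte

url_storage_tb = live_aliases * url_bytes / BYTES_PER_TB
total_storage_tb = live_aliases * (url_bytes + alias_bytes) / BYTES_PER_TB

print(url_storage_tb, total_storage_tb)
```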
API design
### Generate alias for a URL
POST /alias
Request:
- URL
- status (active/inactive) (optional)
- schedule
  - activateAt
  - deactivateAt
Response:
- Alias
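The request fields above could be modelled and validated roughly as follows. This is a sketch, not a fixed contract: the `CreateAliasRequest` class, the `validate` helper, and the default status of `active` are assumptions made for illustration. The 1000-character limit comes from the non-functional requirements.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

MAX_URL_LENGTH = 1000  # from the non-functional requirements


@dataclass
class CreateAliasRequest:
    url: str
    status: str = "active"                 # assumed default; the field is optional
    activate_at: Optional[datetime] = None
    deactivate_at: Optional[datetime] = None


def validate(req: CreateAliasRequest) -> List[str]:
    """Return a list of validation errors (empty if the request is acceptable)."""
    errors = []
    if not req.url:
        errors.append("URL is required")
    elif len(req.url) > MAX_URL_LENGTH:
        errors.append("URL is longer than 1000 characters")
    if req.status not in ("active", "inactive"):
        errors.append("status must be 'active' or 'inactive'")
    if (req.activate_at is not None and req.deactivate_at is not None
            and req.activate_at >= req.deactivate_at):
        errors.append("activateAt must be earlier than deactivateAt")
    return errors
```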
### Access control for an alias
POST /alias/permissions
Request:
- alias
- permissions
  - operation
  - actor(s)
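The access rules from the functional requirements (the creator may always update or delete; other actors need an explicit grant, either directly or through a user group) could be evaluated along these lines. The function name, the operation names `read`, `update`, and `delete`, and the `"*"` wildcard for "any user" are assumptions made for this sketch.

```python
def is_allowed(actor_id, actor_groups, operation, created_by, grants):
    """Decide whether an actor may perform an operation on one alias.

    `grants` is a set of (actor_or_group_id, operation) pairs stored for
    that alias. The "*" actor is a hypothetical wildcard meaning any user.
    """
    # The creator can always update or delete their own alias.
    if operation in ("update", "delete") and actor_id == created_by:
        return True
    # A wildcard grant opens the operation to everyone.
    if ("*", operation) in grants:
        return True
    # Otherwise the actor, or one of their groups, needs an explicit grant.
    principals = {actor_id, *actor_groups}
    return any((p, operation) in grants for p in principals)


# Usage: reads are public, updates are limited to the creator and team-x.
grants = {("*", "read"), ("team-x", "update")}
print(is_allowed("carol", ["team-x"], "update", "alice", grants))
```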
DELETE /alias
POST /alias/activate or POST /alias/deactivate
GET /alias
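When GET /alias resolves an alias, the service has to decide whether the alias is currently active, taking both the on-demand status and the schedule into account. A minimal sketch of that check follows. It assumes that an explicit `inactive` status overrides the schedule and that a missing `activateAt` or `deactivateAt` means "no bound"; neither rule is stated in the requirements.

```python
from datetime import datetime, timezone


def is_active(status, activate_at=None, deactivate_at=None, now=None):
    """Return True if the alias should currently redirect."""
    if now is None:
        now = datetime.now(timezone.utc)
    if status == "inactive":          # deactivated on demand (assumed to win)
        return False
    if activate_at is not None and now < activate_at:
        return False                  # scheduled activation not reached yet
    if deactivate_at is not None and now >= deactivate_at:
        return False                  # scheduled deactivation already passed
    return True                       # no bounds set: active indefinitely
```

A request for an inactive alias would then be rejected instead of redirected. Which status code to return in that case is a design choice left open here.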
Database design
ALIASES table:
- id (primary key)
- alias (secondary index)
- URL (secondary index)
- createdBy
- createdAt
PERMISSIONS table:
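- resourceId (references ALIASES.id; composite primary key together with actorId and operation, so that one alias can have several grants)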
- actorId
- operation
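The two tables can be tried out with SQLite from the Python standard library. This is only a sketch of the schema listed above, not a storage recommendation for 20 billion rows. The unique constraint on `alias`, the composite key on PERMISSIONS, the length check, and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE aliases (
    id        INTEGER PRIMARY KEY,
    alias     TEXT NOT NULL UNIQUE,      -- unique index: one alias, one URL
    URL       TEXT NOT NULL CHECK (length(URL) <= 1000),
    createdBy TEXT NOT NULL,
    createdAt TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_aliases_url ON aliases (URL);  -- secondary index on URL

CREATE TABLE permissions (
    resourceId INTEGER NOT NULL REFERENCES aliases (id),
    actorId    TEXT NOT NULL,
    operation  TEXT NOT NULL,
    PRIMARY KEY (resourceId, actorId, operation)  -- assumed composite key
);
""")

# One URL may have several aliases.
insert = "INSERT INTO aliases (alias, URL, createdBy) VALUES (?, ?, ?)"
conn.execute(insert, ("ab12cd34", "https://example.com/a", "alice"))
conn.execute(insert, ("zz99yy88", "https://example.com/a", "bob"))

# Resolve an alias to its original URL.
row = conn.execute(
    "SELECT URL FROM aliases WHERE alias = ?", ("ab12cd34",)
).fetchone()

# Count the aliases that point at one URL.
alias_count = conn.execute(
    "SELECT COUNT(*) FROM aliases WHERE URL = ?", ("https://example.com/a",)
).fetchone()[0]
```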
High-level design
Request flows
Detailed component design
Trade offs/Tech choices
Failure scenarios/bottlenecks
Future improvements