My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach with Score: 8/10

by luminous_sound

System requirements


Functional:

Generate a shortened URL

Redirect a shortened URL to the original URL


Non-Functional:

Availability

Scalability

Fault tolerance


Capacity estimation


Assuming 10 million daily active users (DAU), each generating 1 URL per day, that's 10 million new URLs per day.


Assuming each URL record takes 100 bytes, that is 100 bytes * 10 million = 1,000 million bytes = 1 GB per day.


That's 365 GB per year.
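
The same estimate can be written out as a short calculation. This is only a sketch of the arithmetic above; the average write rate at the end is an extra figure derived from the same assumptions, not something stated in the original estimate.

```python
# Back-of-the-envelope storage estimate using the figures assumed above.
DAILY_ACTIVE_USERS = 10_000_000    # 10 million DAU
URLS_PER_USER_PER_DAY = 1
BYTES_PER_URL = 100                # assumed average record size

urls_per_day = DAILY_ACTIVE_USERS * URLS_PER_USER_PER_DAY
bytes_per_day = urls_per_day * BYTES_PER_URL
gb_per_day = bytes_per_day / 10**9          # decimal gigabytes
gb_per_year = gb_per_day * 365
writes_per_second = urls_per_day / 86_400   # 86,400 seconds in a day

print(f"{urls_per_day:,} URLs/day")
print(f"{gb_per_day:.0f} GB/day, {gb_per_year:.0f} GB/year")
print(f"~{writes_per_second:.0f} writes/second on average")
```

The average write rate is modest, which is consistent with treating reads (redirects) as the dominant load.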


We can use Amazon DynamoDB, a fully managed NoSQL database that can handle a large volume of traffic, which makes it a good fit for scalability and performance.


API design


GenerateShortUrl - returns a shortened URL based on the input URL.


RedirectShortUrl - redirects the shortened URL to the original URL.
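
As a rough illustration, the two operations could have the following shape. This is a minimal in-memory sketch, not the actual service: the function names mirror the API names above, while the `short.example` domain, the dict-backed store, the placeholder code scheme, and the 302 status code are all assumptions made for the example.

```python
from typing import Dict, Tuple

BASE_URL = "https://short.example/"   # hypothetical short domain
_store: Dict[str, str] = {}           # short code -> original URL (stand-in for the database)


def generate_short_url(long_url: str) -> str:
    """GenerateShortUrl: store the mapping and return the shortened URL."""
    code = format(len(_store) + 1, "x")   # placeholder code scheme, for illustration only
    _store[code] = long_url
    return BASE_URL + code


def redirect_short_url(short_url: str) -> Tuple[int, Dict[str, str]]:
    """RedirectShortUrl: return an HTTP-style (status, headers) pair."""
    code = short_url.rsplit("/", 1)[-1]
    long_url = _store.get(code)
    if long_url is None:
        return 404, {}                    # unknown short code
    return 302, {"Location": long_url}    # 302 is an assumption; 301 is also common


short = generate_short_url("https://example.com/a/very/long/path")
print(short)
print(redirect_short_url(short))
```

Whether to answer with a permanent or a temporary redirect is a design choice: a permanent redirect lets browsers cache the mapping, while a temporary one keeps every click visible to the service.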



Database design


Cassandra carries some operational overhead, and we have no need for complex queries.


Since storing URLs doesn't require complex relationships, we can use a NoSQL key-value store like DynamoDB.


Example item, keyed by the short URL so that the redirect service can look it up directly:

{ "short.xyzab.com": "exampleofaveryveryveryverylongurl.com/test/1/2/3/4" }



High-level design


An API gateway provides DDoS protection and TLS termination, and forwards requests to the right service nodes.


We can employ two primary services.


The first is the URL shortening service. It is responsible for shortening the URL using an encoding such as base62.


The second is the redirect service. It is responsible for redirecting a shortened URL to the original URL.


The reason for this separation is that the redirect service is read heavy, so we should be able to optimize it for reads independently of the shortening service.




Request flows


The user creates a shortened URL.

The key-value pair gets stored in DynamoDB.

The user requests the shortened URL.

The redirect service redirects to the original URL.



Detailed component design


As mentioned earlier, we will use base62. First we generate a unique ID using a sequential counter, and then we convert that ID to base62. Since the IDs are sequential, each one is issued only once, so there are no collisions.
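
The counter-to-base62 step can be sketched as follows. This is an illustrative sketch under assumptions: the character ordering (digits, then lowercase, then uppercase) is one common choice rather than something the design fixes, and `itertools.count` stands in for the real sequential counter within a single process.

```python
import itertools
import string

# 0-9, a-z, A-Z: an assumed ordering of the 62 characters; any fixed order works.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(n: int) -> str:
    """Convert a non-negative integer ID to its base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def base62_decode(s: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n


counter = itertools.count(1)   # single-process stand-in for the sequential counter
codes = [base62_encode(next(counter)) for _ in range(3)]
print(codes)                   # ['1', '2', '3']
print(base62_encode(61), base62_encode(62))   # Z 10
```

Because encoding is a one-to-one mapping, distinct counter values always give distinct short codes, which is why this scheme needs no collision check.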


Given that this is a read-heavy application, we can add a cache layer between the redirect service and DynamoDB.


For extremely popular URLs, such as links to facebook.com, we can use a dedicated CDN.
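
A cache layer of this kind might look like the read-through sketch below. It is only an illustration: the `ReadThroughCache` class, the LRU eviction policy, and the dict standing in for the DynamoDB table are assumptions, and a real deployment would more likely use a shared cache than per-process memory.

```python
from collections import OrderedDict
from typing import Callable, Optional


class ReadThroughCache:
    """Small LRU cache that falls back to the database on a miss."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]], capacity: int = 1000):
        self._load = load_from_db
        self._capacity = capacity
        self._entries: "OrderedDict[str, str]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, code: str) -> Optional[str]:
        if code in self._entries:
            self._entries.move_to_end(code)          # mark as most recently used
            self.hits += 1
            return self._entries[code]
        self.misses += 1
        url = self._load(code)                       # go to the database on a miss
        if url is not None:
            self._entries[code] = url
            if len(self._entries) > self._capacity:
                self._entries.popitem(last=False)    # evict the least recently used entry
        return url


fake_db = {"abc": "https://example.com/page"}   # stand-in for the DynamoDB table
cache = ReadThroughCache(fake_db.get, capacity=2)
cache.get("abc")   # miss: loaded from the database
cache.get("abc")   # hit: served from memory
print(cache.hits, cache.misses)   # 1 1
```

Since a short code's target does not change in this design, cached entries never go stale, which keeps the cache simple.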


Trade offs/Tech choices


The sequential counter approach may not scale well. If we deploy multiple instances of our service that all access the same database table to increment the counter, we may end up with race conditions and collisions.
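
One common mitigation, shown here only as a hedged sketch and not as part of the design above, is to make the increment atomic and let each instance reserve a whole block of IDs at a time. The `CounterTable` and `ShortenerInstance` classes are hypothetical, and the lock stands in for whatever atomic increment the real database offers.

```python
import threading
from typing import Iterator, List


class CounterTable:
    """In-memory stand-in for a shared counter row; the lock plays the role of an atomic increment."""

    def __init__(self) -> None:
        self._next = 1
        self._lock = threading.Lock()

    def reserve_block(self, size: int) -> range:
        with self._lock:                 # read and update happen as one step
            start = self._next
            self._next += size
        return range(start, start + size)


class ShortenerInstance:
    """One service instance that hands out IDs from its reserved block."""

    def __init__(self, table: CounterTable, block_size: int = 100) -> None:
        self._table = table
        self._block_size = block_size
        self._ids: Iterator[int] = iter(())

    def next_id(self) -> int:
        try:
            return next(self._ids)
        except StopIteration:            # block used up: reserve a new one
            self._ids = iter(self._table.reserve_block(self._block_size))
            return next(self._ids)


table = CounterTable()
issued: List[int] = []


def worker() -> None:
    instance = ShortenerInstance(table)
    for _ in range(250):
        issued.append(instance.next_id())   # list.append is thread-safe in CPython


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(issued), len(set(issued)))   # 1000 1000
```

The cost of this approach is that IDs are no longer strictly sequential across instances, and any unused part of a block is lost if an instance restarts.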


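We can use Snowflake IDs instead. A Snowflake ID combines a timestamp, a machine ID, and a per-machine sequence number, so collisions are unlikely. There may still be a small chance of collision (for example, if two instances are given the same machine ID), so we would still need a lookup before inserting, or we would need to append some kind of timestamp or sequence number.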
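
A Snowflake-style generator can be sketched as below. This is an illustration under assumptions rather than a reference implementation: it uses the commonly described 41/10/12 bit split (timestamp, machine ID, sequence), a hypothetical custom epoch, and a simple policy of reusing the last timestamp if the clock moves backward.

```python
import threading
import time

EPOCH_MS = 1_600_000_000_000     # hypothetical custom epoch (September 2020)
MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    def __init__(self, machine_id: int) -> None:
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self._machine_id = machine_id
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = int(time.time() * 1000)
            if now < self._last_ms:
                now = self._last_ms              # clock went backward: reuse the last timestamp
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:          # sequence exhausted in this millisecond
                    while now <= self._last_ms:  # wait for the next millisecond
                        now = int(time.time() * 1000)
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (self._machine_id << SEQUENCE_BITS)
                | self._sequence
            )


gen_a = SnowflakeGenerator(machine_id=1)
gen_b = SnowflakeGenerator(machine_id=2)
ids = [gen_a.next_id() for _ in range(5000)] + [gen_b.next_id() for _ in range(5000)]
print(len(set(ids)) == len(ids))   # True
```

Uniqueness here depends on every instance having a distinct machine ID. Also note that a 64-bit ID can take up to 11 base62 characters, so the resulting short codes are longer than counter-based ones.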


I would go with the latter approach.



Failure scenarios/bottlenecks


Collision is one possible failure scenario; it is discussed above.




Future improvements


One future improvement is to explore other techniques for generating shortened URLs, such as random string generation.
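
Random string generation could be explored along the lines of the sketch below. Everything in it is an assumption for illustration: the 7-character code length, the retry limit, and the plain dict standing in for the database. Unlike the counter approach, random codes can collide, so the sketch checks for an existing entry and retries.

```python
import secrets
import string
from typing import Dict

ALPHABET = string.digits + string.ascii_letters   # 62 characters
CODE_LENGTH = 7                                   # assumed code length


def random_code(length: int = CODE_LENGTH) -> str:
    """Return a random base62 string of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def shorten(store: Dict[str, str], long_url: str, max_attempts: int = 5) -> str:
    """Insert under a fresh random code, retrying if the code is already taken."""
    for _ in range(max_attempts):
        code = random_code()
        # With a real database, this check-and-insert should be a single
        # conditional write so that two instances cannot claim the same code.
        if code not in store:
            store[code] = long_url
            return code
    raise RuntimeError("could not find a free code")


store: Dict[str, str] = {}
code = shorten(store, "https://example.com/a/very/long/path")
print(code, "->", store[code])
```

Random codes are harder to guess than sequential ones, at the price of a lookup (or conditional write) on every insert.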