Codemia | Master System Design Interviews Through Active Practice

My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach with Score: 8/10

by drift1945

System requirements

Functional:

The user can get a short Url with a long URL
User accessing the short URL will be directed to long URL
If the URL doesn't exist, it will return an error message.

Non-Functional:

Availability: The request should receive a response without error.
Partition tolerance: the system should still operate even if some messages are dropped due to the network between nodes
Consitency: Support eventual consistency

Capacity estimation

For Data storage:

Each long URL is around 100bytes
Each short URL is 10bytes
Around 100M URLs are generated per day

So the total data storage is around 11 GB per day

For bandwidth estimates, let's assume we have 100 million DAUs and 100 million URLs visited per day. So the bandwidth is around 100M * (110bytes) /86400 = 110kb/s

API design

shortenURL(String longUrl): return the shortened URL
redirectURL(String shortendURL): return the HTTPS 301 with redirect to original URL

Database design

here we use Mongo DB, as it's high performance and scalability.

The database contains one table: which might include fields:

Primary Key: Short URL: varchar(10)

Long URL: varchar(100), indices.

Optional: expiration date

Their relationship is 1:1.

High-level design

Request flows

The shortening request will firstly go to load balancer and then the load balancer will decide which server to go to, and then the URL shortening service will encode the URL and then add the shortened URL to database and cache.

The redirecting request will firstly go to load balancer and then served by retrieving URL service, then it will try to get the corresponding URL in the cache, if not found, then it will do the data base query.

If the requesturl doesn't exist at all, it will return a Not found 404.

Detailed component design

For the shortending service, we can use a-z, 0-9, 36 character in total, so, we use 32-based encoding.

For the Mongo DB database structure, we use key-value store form like, in this way, the shorten URL can find the corresponding long URL very quickly. And we will do sharding by the hash value of the short url, to minimizing the hot-spot problem. To solve the hash-collision, we can do a second-hashing. And we will use a Master-Slave mode to reduce the load and to avoid single point of failure.

For the Cache, we will use Redis, and use LFU as its eviction policy to try to reduce the load to database, besides that, we can add other fields like expiration time to better control its life cycle.

Trade offs/Tech choices

For database,

LSM-based database(e.g. Cassandra): greatly support write heavy system

B-Tree based database(e.g. MySQL): ACID, structured data but might have scalability for huge amount data.

So here we use NoSQL database MongoDB, although it's not strong for reading request but it has high performance especially for write heavy scenario.

And for cache, we use Redis, as it's key-value store and performs well as a cache level,

For scalability, I used load balancer, trying to distribute evenly the request to it service, and we could combine region-based and weighted round-robin for load balancing

Failure scenarios/bottlenecks

The bottle network maybe inside the shortening service, which might takes long time and resources, as we need a good encoding algorithm.

Besides that, due to our great request number, we need consider a good sharding policy.

Future improvements

I will try to optimize the efficing for shortening service and support more functionalities like editing shortend URLs and customized urls