My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach (Score: 8/10)
by realm9991
System requirements
Functional:
- users can create a short url for any url
- users can request a custom short url
- url mappings last up to 1 year
- users can specify an expiration date for their mapping
- we can see how many times a short url has been loaded/used
Non-Functional:
- high availability
- eventual consistency
- url durability
- redirection is fast
Capacity estimation
- estimated 100k new urls made per day
- estimated 100:1 read:write ratio
- therefore 10M reads per day
- we'll say one record of (tiny url, long url) requires roughly 1kB of data
- 1kB * 100k records = 100MB of new data per day, or about 36.5GB/yr. We need to maintain roughly that much data at any given time (see the quick calculation below)
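To keep the arithmetic honest, here's the same back-of-envelope math as a quick Python sketch. The constants are the estimates above, not measured numbers:

```python
# Back-of-envelope capacity math; all inputs are rough estimates.
WRITES_PER_DAY = 100_000
READ_WRITE_RATIO = 100
RECORD_SIZE_BYTES = 1_000                    # ~1kB per (tiny url, long url) record

reads_per_day = WRITES_PER_DAY * READ_WRITE_RATIO            # 10,000,000
writes_per_sec = WRITES_PER_DAY / 86_400                     # ~1.2
reads_per_sec = reads_per_day / 86_400                       # ~116, call it ~120
new_data_per_day_gb = WRITES_PER_DAY * RECORD_SIZE_BYTES / 1e9   # 0.1 GB/day
data_per_year_gb = new_data_per_day_gb * 365                     # ~36.5 GB/yr

print(writes_per_sec, reads_per_sec, data_per_year_gb)
```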
API design
- create(longUrl, ttl?) - this operation assigns a random short url to a long url. optional time to live
- create(longUrl, desiredShortUrl, ttl?) - this operation assigns a specified short url to a long url if it's available. optional time to live
- read(shortUrl) - this operation happens when a user tries to load up a short url. we look up the long url associated with the short url and redirect the user to that page
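As a rough sketch of the request/response shapes (field names are illustrative, not a committed contract), the two create variants can share one payload:

```python
# Illustrative API shapes for the create operations above; names are assumptions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class CreateRequest:
    long_url: str
    desired_short_url: Optional[str] = None   # set only for custom-alias requests
    ttl_seconds: Optional[int] = None         # server defaults this to ~1 year

@dataclass
class CreateResponse:
    short_url: str
    expires_at: str                           # ISO-8601 timestamp

# read(shortUrl) is just an HTTP GET on the short link:
#   GET /{shortUrl}  ->  302 (or 301) redirect to the stored long url
```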
Database design
We have two big options here - SQL or NoSQL. We don't have huge data storage needs at just 36.5GB of data. That means we don't likely need to partition our data, but replication will be important.
If we have 100k new urls per day then that's ~1.2 new urls per second and ~120 reads per second. One instance of a DB should be able to handle that many writes on its own. That means we can support a single leader database model.
A single leader database would be ideal because we have the possibility of two users requesting the same tiny url at the same time. If that were to happen in a multi leader config then it's possible two users would be told they got their desired tiny url when in reality that's impossible.
Our access patterns are basic. We need to be able to write new records, fetch records based on tiny url (primary key), and to be able to update another record with the view count. I'll get to that bit later.
A key-value store like DynamoDB would work, but its replication is eventually consistent by default, and strong consistency is preferred here for the mapping writes. For that reason I'd suggest a SQL database with ACID transactions like MySQL or Postgres.
We'll use a single-leader MySQL setup where the leader handles all writes and our replicas handle all reads. We'll synchronously replicate our data before responding with success on writes, because our users will tolerate a greater latency on writes than on reads.
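A minimal sketch of what the mappings table could look like, assuming MySQL; the column names are my own, not a finalized schema. The primary key on short_url is what lets the single leader reject a duplicate custom alias atomically, and the index on expires_at keeps the cleanup job's scans cheap:

```python
# One possible shape for the url_mappings table (DDL held as a Python constant).
CREATE_URL_MAPPINGS = """
CREATE TABLE IF NOT EXISTS url_mappings (
    short_url   VARCHAR(16) PRIMARY KEY,
    long_url    TEXT        NOT NULL,
    created_at  DATETIME    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at  DATETIME    NOT NULL,
    INDEX idx_expires_at (expires_at)
);
"""
```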
High-level design
Our clients will make a request and first reach our API Gateway. The API Gateway will handle authentication, SSL termination, and routing to either our write or read service.
We'll separate tinyUrl creation and tinyUrl lookups into two services, which I'll refer to as our read and write services for simplicity.
We'll run multiple instances of each service for redundancy. They'll be stateless so as instances fail we'll spin up new ones to replace them.
We'll also use a cache for reading urls. We can assume 20% of our urls will make up 80% of traffic, so we'll need about 7GB of cache space at any given time. Memcached seems ideal: it's fast, and we don't need the rich data structures Redis supports.
As mentioned we'll have a MySQL or Postgres database setup with one leader and several replicas for handling reads. If the leader dies, the replicas will elect a new leader and we'll spin up a replacement replica.
We'll have a batching service that tracks views in memory and, on a predefined interval, writes the batched counts to the view store.
To record views we won't use the same database we use to store url mappings. Views arrive on every read, so that's roughly 120 writes/second on average, and more if we grow or see traffic spikes, which could slow down our MySQL db. Instead we'll write view counts to DynamoDB. Dynamo supports eventual consistency, which is fine for this use case.
We'll have a cron job that runs as our cleanup job. It will remove expired tinyUrl mappings from MySQL and also delete the corresponding view counts from DynamoDB in case we later reuse the tinyUrl.
Request flows
Write flow:
When users request a random tinyUrl they'll hit our API Gateway. The gateway will do path based routing and send the request to our write service. The write service will generate a new url and store the mapping in the db. The leader will replicate it to all the read secondaries and then return success when that's completed.
If a custom tinyUrl is requested then the flow is only slightly different. We'll check the database for the requested tinyUrl and return an error if it's already taken; the unique key on the tinyUrl column also protects us if two users race for the same alias. If it's free, we complete the flow as usual.
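Here's a sketch of how the write service might mint and store a random key, assuming base62 keys of length 7 (62^7 is roughly 3.5 trillion combinations) and the url_mappings table from the schema sketch. The db handle and DuplicateKeyError are placeholders for whichever MySQL driver we pick (e.g. its IntegrityError):

```python
# Random short-key generation plus insert, retrying on the rare collision.
import secrets
import string

ALPHABET = string.ascii_letters + string.digits   # base62
KEY_LENGTH = 7

class DuplicateKeyError(Exception):
    """Stand-in for the driver's unique-constraint violation."""

def random_short_key() -> str:
    return "".join(secrets.choice(ALPHABET) for _ in range(KEY_LENGTH))

def create_mapping(db, long_url: str, expires_at) -> str:
    for _ in range(3):   # collisions are rare; retry a couple of times just in case
        key = random_short_key()
        try:
            db.execute(
                "INSERT INTO url_mappings (short_url, long_url, expires_at) "
                "VALUES (%s, %s, %s)",
                (key, long_url, expires_at),
            )
            return key
        except DuplicateKeyError:
            continue
    raise RuntimeError("could not allocate a unique short url")
```

The custom-alias path is the same INSERT without the retry loop: if the unique key rejects it, we surface the error to the user.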
Read flow:
When users request a tinyUrl they'll hit our API Gateway and be routed to our read service based on the path. We'll use a cache-aside/lazy-loading approach: the service first checks the cache, then checks MySQL on a cache miss. This keeps redirects fast.
Before we return the redirection url we'll send a POST request to our batching service to record the view. The batching service will maintain an in-memory map of { tinyUrl: views }. On a predetermined interval it will write all the in-memory records to DynamoDB. This will help reduce the write load on Dynamo.
It's possible concurrent writes will hit the same counter in DynamoDB, so we'll resolve them by adding the increments together, giving a correct but eventually consistent view count.
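A minimal sketch of the batching idea, assuming boto3 and a DynamoDB table named view_counts with partition key short_url (both names are mine). DynamoDB's ADD update expression is an atomic increment, which is what makes "resolve by adding" safe even when flushes overlap:

```python
# Core of the batching service: in-memory counts, periodic atomic flush.
import threading
from collections import Counter

import boto3

table = boto3.resource("dynamodb").Table("view_counts")
pending: Counter = Counter()
lock = threading.Lock()

def record_view(short_url: str) -> None:
    # Called by the POST handler on every redirect.
    with lock:
        pending[short_url] += 1

def flush() -> None:
    # Run on a timer (e.g. every 30-60s); drains the in-memory counts.
    with lock:
        batch = dict(pending)
        pending.clear()
    for short_url, views in batch.items():
        table.update_item(
            Key={"short_url": short_url},
            UpdateExpression="ADD #v :inc",
            ExpressionAttributeNames={"#v": "views"},
            ExpressionAttributeValues={":inc": views},
        )
```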
Detailed component design
We'll want to use a cache to speed up read latency. In this case Memcached is a great solution because it's basic and performant for key-value lookups like ours. We'll use a cache-aside read strategy for the lookups where our app first checks Memcached and then checks MySQL on cache misses. Missed values will then be loaded into the cache.
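Concretely, a cache-aside lookup might look roughly like this, assuming pymemcache for the Memcached client and a placeholder db handle for the MySQL read replica; hostname and key scheme are illustrative, and the TTL handling ties into the eviction strategy described next:

```python
# Cache-aside read path: check Memcached, fall back to MySQL, backfill the cache.
from pymemcache.client.base import Client

cache = Client(("memcached.internal", 11211))
MAX_CACHE_TTL = 30 * 24 * 3600   # beyond this, Memcached treats expire as a unix timestamp

def lookup_long_url(db, short_url: str):
    # 1. Check Memcached first.
    cached = cache.get(short_url)
    if cached is not None:
        return cached.decode()

    # 2. On a miss, fall back to a MySQL read replica, ignoring expired rows.
    row = db.query_one(
        "SELECT long_url, expires_at FROM url_mappings "
        "WHERE short_url = %s AND expires_at > NOW()",
        (short_url,),
    )
    if row is None:
        return None

    # 3. Lazily load the value into the cache for the next reader,
    #    with a TTL that mirrors the mapping's own expiration.
    from datetime import datetime
    long_url, expires_at = row
    ttl = min(int((expires_at - datetime.now()).total_seconds()), MAX_CACHE_TTL)
    cache.set(short_url, long_url, expire=max(ttl, 1))
    return long_url
```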
We'll use a combination of LRU and TTL for our eviction strategy, and we'll tune it with empirical data. Users can optionally specify a TTL, or we'll use a default of one year, so every record in Memcached will have a TTL. If we fill the cache before any records expire we'll fall back to LRU. This keeps hot records in the cache and lets cold ones fall out.
Our cleanup cron job will run every 30 minutes or so to remove expired tiny urls from MySQL and their view counts from Dynamo.
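One way each cleanup run could work, reusing the table and key names assumed earlier; the db and Dynamo handles are placeholders, and the LIMIT keeps any single run bounded so a backlog gets worked off across runs:

```python
# Sketch of a single cleanup pass over expired mappings.
def cleanup(db, view_counts_table) -> None:
    expired = db.query(
        "SELECT short_url FROM url_mappings WHERE expires_at < NOW() LIMIT 10000"
    )
    for (short_url,) in expired:
        # Drop the view counter first so a reused tinyUrl starts from zero.
        view_counts_table.delete_item(Key={"short_url": short_url})
        db.execute("DELETE FROM url_mappings WHERE short_url = %s", (short_url,))
```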
Trade-offs/Tech choices
Using a single leader with synchronous replication gives us the strong consistency we want, but it costs us write latency. I opted for this because consistency is critical for protecting user data, and a bit of extra latency on writes is acceptable for most users.
Our cache is another noteworthy choice. It costs more, but it should significantly reduce read latency, which is critical to users' perception of our service.
Failure scenarios/bottlenecks
It's possible our cache will fail and leave us without cached records while a replacement cache warms up. For this reason it makes sense to run multiple cache instances and replicate data between them.
If our batching service fails we'll lose the views held in memory. We can build graceful shutdown into the application logic that dumps the in-memory counts to blob storage and reads them back on startup, if our requirements prohibit losing that data. We can also flush to Dynamo more often to shrink the window of loss, at the cost of smaller batches.
If our MySQL leader goes down then we'll have the read replicas elect a new leader automatically. If we need to scale up we can look into partitioning our MySQL database. That way we could still have a single leader for each partition of the data. We could partition on ranges of userIds or ranges of tinyUrls depending on how we generate tinyUrls. We'd want to make sure we choose a partition key with high cardinality to avoid hot spots.
Our API Gateway could fail so we'll want to have a primary-secondary failover setup there where we have a passive secondary ready to go when the primary heartbeat fails.
If our read or write services go down then we'll take them out of the pool of services we route to from our API Gateway. These services are stateless so we can seamlessly redirect traffic. If a request to one of those services 500s then we can retry it once on a healthy node.
If our cleanup job fails to run, subsequent runs should be able to work through the backlog. We can also run extra cron jobs during periods of low traffic in the rare case of an excessive backlog.
Future improvements
I'd possibly use multiple cache servers with replication to reduce the impact of cache servers going down. I'd also consider increasing cache memory.