System requirements
Functional:
User submits a longURL, a shortURL is generated, stored in the DB, and returned to the user
User hits the shortURL, retrieves the longURL from DB, caches the shortURL and longURL in a distributed cache, then redirects to the longURL
User is able to share the shortURL through various communication channels (will need to integrate with other communication channels) - in this case, we will support messenger, discord, and twitter
Non-Functional:
Response time => in the milliseconds, aim for <100ms
Minimize latency in response time
Minimize delay between reading and writing data to ensure real time availability => High availability
Handles multiple concurrent requests => high throughput
Consider caching
---
Fast response time
Caching may be needed
High throughput
High availability
--
Since we need to have high availability, the tradeoff based on the CAP theorem is that we need to decide on consistency vs availability. Since we cannot have both, we will focus on availability over consistency.
Based on this, we will most likely consider a NoSQL database such as Cassandra in the database design - we will consider the tradeoffs first
data consistency
error handling
fault tolerance
Capacity estimation
Estimate the scale of the system you are going to design...
Read queries per second:
100 reads/second (normal traffic) => ~ 9 million reads / day
1 url per user / day
10,000 DAU => 10,000 writes / day
-----
10,000 writes / day
9,00,000 reads / day
900 reads : 1 write ratio
we will simplify this to 10:1 read/write ratio
-------
during spikes, we will see 5x the traffic
50:5 read/write ratio
API design
The system has to convert a longURL to a shortURL
action: convert longURL and return shortURL
input: longURL
/v1/action/convert_to_short_url
params: {string: longURL}
The system has to take a shortURL and redirect to longURL
action: convert shortURL to longURL then redirect
input: shortURL
/v1/action/get_long_url
params: {string: shortURL}
response: return a 301 or 302 http code for browser
301 vs 302
301 will treat every redirect as a brand new redirect where as 302 will treat the redirect as a known one if it already existed and retrieve that info from a cache
301 is good if you are tracking analytics on that specific url and you want info such as how often it is used for it
Database design
Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...
- solution should scale well for both reads and writes
- we don't want to lose data (fault tolerance, data integrity) during network failures or hardware faults
- recovering data in case of an outage
- high availability
based on the data model, we can choose either a nosql or sql database. a sql database would make sense given the simplicity of the object and attributes stored. we also want high availability which would work for a sql database such as nosql because we can set up read replicas since there are more reads than writes. the bottleneck would be the writes but since there are less writes than reads, it is a tradeoff we are making. for fault tolerance, we can also have multiple data centers in different locations and we can set them up based on locale to ensure that users will be hitting datacenters closest to them.
we also need consistent data in which having replication from the write master machine to replicas would ensure that. whereas with a nosql database such as cassadra, we may encounter data that is inconsistent.
considering scaling, we will have to shard the mysql database when we want to add additional machines. this is additonal complexity that we will need to introduce by having a proxying service that will handle the logic for sharding. we will also need to consider an primary id generation in order to ensure that there are no collisions. we will also need to introduce a service discovery service such as zookeeper that will help us determine which machines are up and healthy.
by having multiple datacenters, we also handle the case of hardware failures or network outages because we will be able to reroute traffic to a backup data center if necessary
in terms of leader/followers for data consistency, we can use a master/slave model and if the model goes down, we can promote a slave to a master.
all of this can be done internally by mysql and will not require additional services.
the bottleneck overall is the replication. there may be replication delay due to the data replicating to all the nodes. however, because we need to achieve consistency more than availability (we need to read the correct data), this is important
High-level design
You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design...
data model:
longURL, shortURL, created_at
we want to store the longURL and the shortURL so that they can map to each other and we want a created_at for later down the line when we may want to expire certain urls after a certain time
pros: fast writes
cons: during traffic spikes, there will be a lot of writes/reads which can be costly (as a note, something we can discuss further is during traffic spikes, we can add additional worker machines to process writes/reads)
since we will be doing more reads than writes, we should also consider potentially adding a cache when doing reads because the urls may be accessed more often (the ratio is about 10 reads to 1 write) - we will also discuss more of this later on
since we expect writes to be fast (less than 100ms) and reads to also be fast (100ms), we will have to process on the fly via stream data processing rather than batch data processing
----
will store individual urls via stream data processing
store for several days (or chosen TTL) then purge
will also have a cache that will cache urls after they are redirected for faster access
Request flows
Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...
generating short url from long url:
user -> browser -> request -> api gateway -> short url generator service -> database access service -> database
redirecting short url request to long url:
user -> browser -> request -> api gateway -> database access service -> database -> get longURL -> redirection on server side -> browser -> user
user -> url -> sharing service -> fan out based on service -> discord | twitter | messenger -> send to service
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
components i will focus on:
- processing the request to convert a longURL to shortURL and the response back to the user
- processing the request from shortURL to longURL and redirection
to process the request to convert the longURL to shortURL:
user -> browser -> request -> api gateway (will also handle the validations) -> long url consumer (establishes and maintains tcp connection, infinite loop, deserializes the data) -> hashing queues (which will take in jobs for each request to be hashed, we can scale up by haviing mulitiple partitions in order to allow jobs to be parallelized), hashing service (also can have multiple in order to parallelize and increase throughput) -> aggregator (which will take all of the jobs completed after a threshold and flush to the db) -> in memory store for the aggregator (saves the jobs in case they are not saved to try again if db is down) -> database writer (reads and writes to db as a service) -> database
in case of any errors, we will have a retry logic to try again - in order to not slam the database, we want to do some jittering or exponential backoff, we can also add a timeout limit if it deosn't respond, if it reaches the timeout limit, we can write to the cache from the aggregator to try again later so we dont block resources based on a time, if it was successful, we will then flush the cache. after writing to the database, we will return the shortURL to the browser and back to the user
to process the request from shortURL to longURL and redirection:
flow -
user -> browser -> request -> load balancer (distribute load since we will have high amounts of read traffic especially during spikes)-> API gateway (will handle validation, rate limiting) -> query retrieval job queues (can use something like kafka which will be an event driven, domain based architecture, which will also allow checkpointing which will prevent data loss because it can be replayed)-> retrieval service (takes the short URL hash and hashes it to the right primary id for database retrieval) can have multiple partitions to ensure high throughput - will also allow for easy scaling when traffic is high, can also do this for queues -> retrieve from cache -> if not in cache, retrieve from database, then write to cache -> handle redirect -> browser -> response sent back to the user -> user
because this is a distributed environment and we want to maximize throughput there are many areas where we can add additional resources. we can do this with the distributed cache, load balancer, queues, and database.
because we are using MySQL, we will need to consider sharding when horizontally scaling additional databases. this means that we will need an application that knows about the multiple shards, handles the healthchecks as a discovery service such as zookeeper, and a shard service that will handle proxying to each shard. we will also need a snowflake id generator to ensure that ids are generated uniquely without collisions. one side effect of this is that we will also need to add a coordinator that will be able to handle the coordinator of ids given to each shard to ensure that in the case that the generator goes down, we will still be able to serve ids to ensure that it is highly available to avoid a single point of failure. we also want to ensure that the coordinator doesn't deal with stale ids. this means that we want the ids to be generally k sorted such that if it is passed a deadline, the ids will expire and they will need to be re-fetched from the generator. i suggest that to ensure high availability and be fault tolerant, we will use a UNIX socket between the application, id generator, and id coordinator.
because we are using MySQL, we want to consider replication strategies. we want to avoid replication delay so that means that we need to evenly distribute the load between shards. for load distribution, we can do something naively such as round robin. this isn't an issue for writes because each URL will only be written once. for replication, we want to set a replication strategy of master-slave. writes will only go to the master then it will be replicated to the slaves. the tradeoff of this is that the URLs will need to be written to before it can be retrieved. this is also a good strategy for failover in the case the master goes down, the slave can be promoted as the leader until the master is up again.
in order to achieve load balancing and ensure smooth data distribution across multiple servers in a sharded environment, we want to use consistent hashing. this is good for distributing data load and query loads evenly among the servers in a cluster, each database shard has its own hash value mapped in a circle (this is good for hardware failures where a database goes down, we wont need to remap all the data but only a subset), node removal/addition - also minimizes data redistribution, and in consistent hashing, queries are routed to the appropriate database node based on the hash value of the query key. this enhances fault tolerance by dynamically adapting to node failures and reduces the impact of adding and removing nodes from the cluster
For the design suggested, there are a few technologies to choose from. we will assume that budget is not a constraint and we are open to using a mix of open sourced and paid software services. for queue, we can use apache kafka as the event drive architecture, for distributed caching, redis is a good option. for the database, we will use MySQL.
in order to ensure strong interfaces and to avoid dependencies on each other, we want each of these to be microservices.
to ensure fast look ups along with a cache, we can also implement bloom filters to help make caching and look ups faster.
Trade offs/Tech choices
Explain any trade-offs you have made and why you made certain tech choices...
database tradeoff
MySQL vs cassandra
Mysql has ACID which is good for data integrity in the database. It also ensures consistency inherently whereas Casasndra is an eventually consistent database. Because we want to ensure that the URL returned to the user is accurate at each lookup and generation, we need to ensure high consistency. Although Cassandra is good for high availability, we can also mimic similar high availability in MySQl by using consistent hashing to handle load, fault tolerance, and also implement sharding to handle large data. By having multiple data centers, we can ensure that if one data center is down, the other will be able to serve traffic. We can also have multiple data centers such that users will be using the one closest to them to ensure low latency. Cassandra is a great option for high data volume because it is already sharded and makes adding nodes simple. We will need to shard the MySQL database in order to ensure that we have additional space for more data. This will come with additional complexity bnefcause we will need to know how to proxy requests to the correct shard and also ensure services are available by adding an additional service discover service. Because Casandra uses the gossip protocol, it doesn't need to have a service discovery service.
Apache kafka is a great option as a queue because it allows checkpointing to ensure data doesn't get lost and we can replayt from where we left off when there are network or hardware issues. It is also good for event driven architecture because we are pulling events from kafka and processing them in the processing service and querying service. We can use Redis to cache and because both can be partiitoned, we can easily add additional partitions to scale up when we need to. They can also live as their own microservices.
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
As mentioned with sharding, we will need to ensure that there are no ID collisions. This is really important to ensure data integrity. To do so, we will need to implement a id generator and coordinator to ensure that ids are passed to the service properly.
We also want to ensure there are no single points of failures meaning overall, we need to be able toa dd and remove machines easily. Load distribution is a big issue but by having microservices call each other and being able to add machines, it will be solved
Overall, this is additional complexity
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?
Monitoring and alerting