System requirements


Functional:


  • Users can post either binary or string content to the service and get a unique URL back.
  • Users can share the content with others by using the unique URL.
  • Users can assign tags and a TTL to their content. Other users can only read it.
  • Users can change the saved content.



Non-Functional:


  • The service should be highly available and highly reliable.
  • The service should scale with load and handle peaks in user traffic.
  • Posted content should become visible within about 100 milliseconds.
  • Only registered users may post content, but everyone can read it.





Capacity estimation


Suppose each post is at most 1 MB and the service has 1 million daily active users (DAU).

The service would be read-heavy. Assume every user reads five posts a day and publishes one new post a day. New content then amounts to 1 TB per day, or 365 TB per year. Counting data duplication (each post stored twice), the service needs to store 730 TB per year and about 3.65 PB over 5 years.

Reads amount to about 5 TB of outbound traffic per day (1 million users × 5 reads × 1 MB), or roughly 58 MB/s on average.
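
The arithmetic behind these figures can be checked with a short script. This is a back-of-the-envelope sketch only: the duplication factor of 2 is an assumption inferred from the 730 TB figure, and decimal units (1 TB = 1,000,000 MB) are assumed throughout.

```python
# Back-of-the-envelope capacity estimate; every input is an assumption from the text.
DAU = 1_000_000                # daily active users
POST_SIZE_MB = 1               # upper bound per post
POSTS_PER_USER_PER_DAY = 1
READS_PER_USER_PER_DAY = 5
DUPLICATION_FACTOR = 2         # assumed: each post is stored twice

MB_PER_TB = 1_000_000          # decimal units
SECONDS_PER_DAY = 86_400

daily_writes_tb = DAU * POSTS_PER_USER_PER_DAY * POST_SIZE_MB / MB_PER_TB
yearly_storage_tb = daily_writes_tb * 365 * DUPLICATION_FACTOR
five_year_storage_pb = yearly_storage_tb * 5 / 1_000
daily_reads_tb = DAU * READS_PER_USER_PER_DAY * POST_SIZE_MB / MB_PER_TB
avg_read_mb_per_s = daily_reads_tb * MB_PER_TB / SECONDS_PER_DAY

print(f"new content per day:  {daily_writes_tb:.0f} TB")
print(f"storage per year:     {yearly_storage_tb:.0f} TB")
print(f"storage over 5 years: {five_year_storage_pb:.2f} PB")
print(f"read traffic per day: {daily_reads_tb:.0f} TB")
print(f"average read rate:    {avg_read_mb_per_s:.1f} MB/s")
```

The average read rate is only a baseline; peak traffic would be higher, and the peak-to-average ratio is not estimated here.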



API design



post(apiKey, userId, content, contentLength, tags, TTL) posts the content to the backend and returns its unique key and URL. The caller passes the content and its length, plus optional tags and a TTL. The apiKey identifies the caller, which helps prevent abuse of the service.


getPost(apiKey, URL, tags) requests the content by URL and tags and returns it. The apiKey serves the same purpose as in post.


deletePost(apiKey, userId, URL) deletes the posted content from the service. The apiKey serves the same purpose as in post.
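
To make the three calls concrete, here is a minimal in-memory sketch in Python. It is an illustration under assumptions, not the service itself: the class name InMemoryPostApi and the domain paste.example are hypothetical, a dict stands in for the backend, contentLength is dropped because len(content) provides it, and the tags filter on reads is omitted.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Set


@dataclass
class StoredPost:
    user_id: str
    content: bytes
    tags: List[str]
    expires_at: Optional[float]  # absolute expiry time; None means no TTL


class InMemoryPostApi:
    """Toy stand-in for post/getPost/deletePost; a dict replaces the real backend."""

    BASE_URL = "https://paste.example/"  # hypothetical domain

    def __init__(self, valid_api_keys: Set[str]) -> None:
        self._valid_api_keys = valid_api_keys
        self._posts: Dict[str, StoredPost] = {}

    def _check_key(self, api_key: str) -> None:
        if api_key not in self._valid_api_keys:
            raise PermissionError("unknown apiKey")

    def post(self, api_key: str, user_id: str, content: bytes,
             tags: Sequence[str] = (), ttl_seconds: Optional[float] = None) -> str:
        self._check_key(api_key)
        url = self.BASE_URL + secrets.token_urlsafe(6)  # 8-character random key
        expires_at = None if ttl_seconds is None else time.time() + ttl_seconds
        self._posts[url] = StoredPost(user_id, content, list(tags), expires_at)
        return url

    def get_post(self, api_key: str, url: str) -> Optional[bytes]:
        self._check_key(api_key)
        post = self._posts.get(url)
        if post is None:
            return None
        if post.expires_at is not None and post.expires_at <= time.time():
            del self._posts[url]  # lazily drop expired content
            return None
        return post.content

    def delete_post(self, api_key: str, user_id: str, url: str) -> bool:
        self._check_key(api_key)
        post = self._posts.get(url)
        if post is None or post.user_id != user_id:
            return False  # only the owner may delete
        del self._posts[url]
        return True
```

A caller would create the object with a set of known keys, call post to obtain a URL, and pass that URL to get_post or delete_post.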





Database design



The service would use eventual consistency when posting content. That consistency model lets us use Amazon DynamoDB.

We would have three tables: User, Post, and RateLimit.


The table User:

id varchar (100 bytes)

email varchar (200 bytes)

createdAt Date (8 bytes)

blocked Boolean (1 byte)


The table Post:

id varchar (100 bytes)

userId varchar (100 bytes)

tags varchar (1000 bytes)

URL varchar (1000 bytes)

content array (1 MB)

contentLength integer (8 bytes)

The table RateLimit:

apiKey varchar (1000 bytes)

timestamp date (8 bytes)


The Post table references the User table through userId. The RateLimit table stores request timestamps per apiKey, which is the data needed to implement rate limiting.
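
One way the stored timestamps could drive rate limiting is a sliding-window check, sketched below in Python. The class name, the limit, and the window length are assumptions for illustration; the design above does not fix a particular algorithm, and an in-memory dict stands in for the RateLimit table.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per apiKey within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = defaultdict(deque)  # apiKey -> recent request timestamps

    def allow(self, api_key: str, now: float) -> bool:
        timestamps = self._timestamps[api_key]
        # Drop timestamps that have fallen out of the window.
        while timestamps and timestamps[0] <= now - self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return False  # over the limit: reject without recording
        timestamps.append(now)
        return True
```

The caller passes the current time explicitly, which keeps the check easy to test; a real service would read the clock itself.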





High-level design


The API Gateway provides DDoS protection, terminates TLS, and routes requests to the right service nodes.

The main service is the Post service, which handles requests to publish or read posts. New posts are written to Kafka; the Post service then consumes and processes them and persists each post to DynamoDB. The Post service keeps the most requested posts in a cache.

The URL generation service generates unique URLs for new posts ahead of time and stores them in a cache. When the Post service needs a new unique URL, the URL generation service takes one from that cache.

A CDN lets us keep posts closer to customers.

Caches may use LRU or LFU eviction policies.


Request flows



The user sends a request to publish a new post, and the API Gateway puts it on Kafka. The Post service reads the request, asks the URL generation service for a new URL, and writes the finished post to DynamoDB.

When the user wants to read a post, the API Gateway first checks whether the closest CDN already has it. If not, it puts the request on a Kafka topic. The Post service reads the request and checks the cache: on a hit, the post is returned to the user; on a miss, the Post service fetches it from DynamoDB, stores it in the cache, and returns it to the user.
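
The cache-then-database part of the read flow is the cache-aside pattern. The Python sketch below illustrates it with plain dicts standing in for Redis and DynamoDB; the class name PostReader and the database_reads counter are hypothetical and exist only to make the behaviour observable.

```python
from typing import Dict, Optional


class PostReader:
    """Cache-aside read path: try the cache first, fall back to the database."""

    def __init__(self, cache: Dict[str, bytes], database: Dict[str, bytes]) -> None:
        self.cache = cache            # stands in for Redis
        self.database = database      # stands in for DynamoDB
        self.database_reads = 0       # counts cache misses that reached the database

    def read(self, url: str) -> Optional[bytes]:
        if url in self.cache:
            return self.cache[url]    # cache hit
        self.database_reads += 1
        content = self.database.get(url)
        if content is not None:
            self.cache[url] = content  # populate the cache for later reads
        return content
```

Reading the same URL twice reaches the database only once; the second read is served from the cache.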






Detailed component design

The performance and scalability of the Post service are extremely important for this system, so we employ two levels of caching. Requests naturally have locality of access, so caching will be effective.

The Post service is stateless, so we can run several instances of it. Because it is a critical part of the system, we can use Kubernetes to scale it efficiently and make it fault-tolerant. Kubernetes monitors the service's availability and health by probing a configured service endpoint.

For its cache, the Post service would use Redis to store the most requested posts, with LRU eviction to drop posts that are no longer being requested. The Post service persists posts to DynamoDB, a scalable and fault-tolerant key-value store managed by AWS.

The Post service reads requests from a Kafka cluster. We would replicate Kafka partitions to avoid losing post requests. When a new post request arrives, the Post service asks the URL generation service for a new unique URL.

To make searching posts by tag faster, we would build an index on tags in DynamoDB.

The URL generation service manages the unique URLs. It creates free URLs ahead of time and stores them in a cache, for which we would use Redis. This cache holds a fixed, configured number of URLs, so eviction is not a concern. If Redis goes down and loses its data, that is acceptable: the URL generation service can simply generate new URLs.
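
The pool of pre-generated URLs can be sketched as follows in Python. This is an illustration under assumptions: the class name UrlPool is hypothetical, a deque stands in for Redis, and the sketch does not check new keys against keys already handed out, which a real service would need to do.

```python
import secrets
from collections import deque


class UrlPool:
    """Fixed-capacity pool of pre-generated keys; safe to rebuild from scratch."""

    def __init__(self, capacity: int, key_bytes: int = 6) -> None:
        self.capacity = capacity
        self.key_bytes = key_bytes    # 6 random bytes encode to 8 URL-safe characters
        self._free = deque()          # stands in for the Redis cache
        self.refill()

    def refill(self) -> None:
        """Top the pool back up to its configured capacity."""
        while len(self._free) < self.capacity:
            self._free.append(secrets.token_urlsafe(self.key_bytes))

    def take(self) -> str:
        """Hand out one free key, regenerating the pool if it is empty."""
        if not self._free:
            self.refill()             # e.g. after the cache lost its data
        return self._free.popleft()

    def __len__(self) -> int:
        return len(self._free)
```

Because the keys are random rather than derived from stored state, losing the pool costs nothing but the time to refill it.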


CDN nodes close to the clients will store posts, which reduces latency when a customer reads post content.


Partitioning


The Post service and the URL generation service should be partitioned for improved scalability.

If we choose the URL as the partition key for the Post service, we get an even data distribution.

The URL generation service uses a partitioned cache, with the URL as the partition key.

DynamoDB manages its own data partitions, so no involvement is needed on our side.

The caches may use the URL as the partition key as well.
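
A minimal Python sketch of hash partitioning by URL follows. The function name partition_for and the choice of SHA-256 are assumptions for illustration. A stable hash is used instead of Python's built-in hash(), which is randomized per process for strings and would send the same URL to different partitions on different nodes.

```python
import hashlib


def partition_for(url: str, partition_count: int) -> int:
    """Map a URL to a partition index in [0, partition_count)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer and reduce it.
    return int.from_bytes(digest[:8], "big") % partition_count
```

Plain modulo partitioning moves most keys when partition_count changes; consistent hashing is a common alternative when partitions are added or removed often.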




Trade offs/Tech choices



Posts are sent to Kafka

Before a customer's post reaches the backend, it is sent to Kafka. Kafka is an open-source event streaming platform that is highly available and scalable, which makes it a good building block for a distributed system. Every request is routed by its key to a Kafka partition with a replication factor of 3, which makes the system reliable and protects against losing customer data.

The Post service then consumes the requests. A request is either a new post, which the service persists to DynamoDB, or a request to return a post by the given tags or URL. When a customer saves a new post, the Post service asks the URL generation service for a new unique URL. A returned post is also stored on the CDN closest to the customer, so the next time the customer reads the post it comes directly from the CDN, offloading the Post service. The Post service keeps the hottest posts in Redis, using the unique URL as the key.


The URL generation service


There are two ways to create a short URL:

  1. Hash (e.g. MD5, SHA2) the long URL.
  2. Randomly generate it.


There is a tradeoff:


The advantage of the hash approach is that you don't have to generate random numbers. The disadvantage is that the resulting keys might collide. In particular, because our key (8 characters) is shorter than the output of these hash algorithms (16 bytes or larger), the hash has to be truncated, which increases the risk of collision.


The advantage of random generation is that collisions are less of a problem: if a newly created random string collides with an existing one, we can simply generate another. The disadvantage is that generating random numbers takes computational power. However, since Linux and other operating systems support fast random number generation through /dev/urandom, we assume the cost is manageable.


We will pick random generation in this exercise.
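
Random generation with a retry on collision can be sketched as follows in Python. The 62-character alphabet, the retry limit, and the function names are assumptions for illustration; the secrets module draws from the operating system's random source. With 8 characters from 62 symbols there are 62^8 (about 2.2 × 10^14) possible keys.

```python
import secrets
import string
from typing import Callable, Set

ALPHABET = string.ascii_letters + string.digits  # 62 characters (assumed alphabet)
KEY_LENGTH = 8


def random_key(length: int = KEY_LENGTH) -> str:
    """Return a random key drawn from the OS random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def new_unique_key(existing: Set[str], max_attempts: int = 5,
                   generate: Callable[[], str] = random_key) -> str:
    """Generate keys until one is not in `existing`, then reserve and return it."""
    for _ in range(max_attempts):
        key = generate()
        if key not in existing:
            existing.add(key)
            return key
        # Collision: loop around and generate one more random string.
    raise RuntimeError("no free key found after %d attempts" % max_attempts)
```

The `existing` set stands in for whatever store holds the keys already in use; in a distributed deployment the membership check and the reservation would have to be a single atomic operation.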


The Redis eviction policy

We would use the cache to store posts and the pre-generated free URLs. Common eviction policies include LRU, LFU, and FIFO.

LRU makes sense here, because requests for posts have locality of access.
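
To show what LRU eviction does, here is a minimal Python sketch built on OrderedDict. In Redis itself, eviction is selected through configuration (the maxmemory-policy setting) rather than written by hand, so this class is only an illustration of the policy, and its name is hypothetical.

```python
from collections import OrderedDict


class LruCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items = OrderedDict()   # oldest entry first, most recent last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a read marks the entry as recently used
        return self._items[key]

    def put(self, key, value) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
```

With a capacity of 2, putting a and b, reading a, and then putting c evicts b, because a was touched more recently.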


The persistence of the posts to DB

DynamoDB is a solid NoSQL option. AWS manages it, so its scalability and availability are not something we have to operate ourselves.

The current domain model has only simple relations between entities, so DynamoDB fits it well.




Failure scenarios/bottlenecks



Fault Tolerance


All components (load balancers, web servers, cache, and database) should run multiple instances for improved availability, with robust monitoring and alerting on each of them.


All nodes can fail. Let's look at important failure cases.


Failure in Post service

If the Post service fails (hardware failure, crash, software bug, network partition, slowness, ...), it directly impacts the most time-sensitive operation of this system. To mitigate this, we should always run multiple Post service nodes, which is possible because the service is stateless. We can use a coordination service such as ZooKeeper to track which nodes are alive (sending regular heartbeats to ZooKeeper), which are likely dead (no heartbeats for some time), and which should be taking requests.


Failure in URL generation service

If the URL generation service fails (hardware failure, crash, software bug, network partition, slowness, ...), it directly impacts publishing, because each new post needs a unique URL. To mitigate this, we should always run multiple URL generation service nodes, which is possible because the service is stateless.


Failure in Cache:

Losing the cache would also impact critical functionality, because requests become much slower. The increased load on the database may even have a cascading effect: the database gets slower and slower, the Post service retries, the retries make the database even busier, and ultimately the database crashes.


We have multiple mitigations.


a. Create read replicas for Redis Cache. Let's say for 1 leader, we put 2 read-only replicas. Writes are handled by the leader, and propagated to read replicas by transmitting a write log. Reads can be handled by all three. If the leader goes down for some reason, one of the read replicas can become the leader (after a leader selection process) and take over the responsibility as the leader. This would avoid the aforementioned scenario.


b. The Post service should have a mitigation strategy to avoid overloading the database: for example, exponential backoff before retrying, rate limiting, and circuit breaking.
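
Exponential backoff, the first of the mitigations in (b), can be sketched as follows in Python. The function name, the delay values, and the use of ConnectionError as the retryable failure are assumptions for illustration. Random jitter is added so that many clients do not retry at the same instant.

```python
import random
import time


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise                          # out of attempts: surface the error
            # The delay ceiling doubles on every attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))  # full jitter
```

The sleep function is a parameter so the retry schedule can be observed in tests without actually waiting.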



Scalability - Post service


Caching significantly improves read scalability. As the number of reads grows and puts pressure on the Post service, we can increase cache capacity both on the CDN and in the data center so that more requests are served from cache.


Scalability - URL generation service


As the number of write requests to the Post service increases, it might put too much pressure on the URL generation service, causing slowness, errors, or even crashes.


To avoid this, we can introduce a message queue to buffer the requests. The URL generation service would push a message representing each request onto the queue. A queue worker would pull from the queue, generate the unique URL, and put it in Redis. The system can inform the client through long polling.
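
The buffering idea can be sketched with Python's standard queue and threading modules. This is an illustration under assumptions: the function names are hypothetical, an in-process queue stands in for the message queue, a list stands in for Redis, and a single worker thread is used.

```python
import queue
import secrets
import threading

STOP = None  # sentinel telling the worker to exit


def url_worker(requests: queue.Queue, free_urls: list) -> None:
    """Pull URL requests off the queue and append one fresh key per request."""
    while True:
        request = requests.get()
        if request is STOP:
            break
        free_urls.append(secrets.token_urlsafe(6))  # 8-character key


def generate_urls(count: int) -> list:
    """Buffer `count` requests in a bounded queue and let one worker drain it."""
    requests = queue.Queue(maxsize=100)  # bounded: producers block when it is full
    free_urls = []                       # stands in for Redis
    worker = threading.Thread(target=url_worker, args=(requests, free_urls))
    worker.start()
    for request_id in range(count):
        requests.put(request_id)         # blocks while the queue is full
    requests.put(STOP)
    worker.join()
    return free_urls
```

The bounded queue is the design point: when requests arrive faster than the worker can handle them, producers wait instead of overloading the worker.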





Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?