Designing A Simple Url Shortening Service A TinyURL Approach - System Design

Requirements

Functional Requirements:

Create a short URL for a given long URL.
Return the long URL associated with a given short URL.

Non-Functional Requirements:

The system should have less latency.
The system should scale to handle millions of URLs.
The system should be highly available.

API Design

Define the APIs expected from the system. This is your chance to analyze and define the read and write paths so that you can come up with the high-level design...

POST v1/shorten

{

url:

} -> shortened URL

GET v1/long/ -> long URL

Main estimation is that the number of GET API calls >> number of POST API calls. -> we can tradeoff some latency to POST to account for correctness over the GET endpoint.

We can have a delete endpoint, and an endpoint which encodes a URL for a specified period of time, but keeping it out of scope for now.

We can also have an endpoint for seeing site visit statistics but keeping it out of scope for now.

Statistics will be influenced by whether our redirect is temporary or permanent - we need to factor this and arrive at one after discussion with client.

High-Level Design

Describe the overall system architecture. Identify the main components needed to solve the problem end-to-end. Use the diagramming tool to create a block diagram.

-- In a single geography -

Server - responsible for getting the long URL and encoding them.

DB -> a simple no-sql database containing information in two tables -

Short URL -> {long_url, visit statistics, metadata}
Long URL -> short URL

first collection will provide efficient lookup, while the second collection will lead to search before encoding a long URL. Alternatively, we can use DB specific nuances to reduce the memory footprint, ex - We can have just the first collection with a GSI on the long_url in DynamoDB.

Cache - serves popular URLs quickly. Invalidation occurs through decreasing popularity in a fixed key space. Can use LFU/LRU cache eviction policy.

Encoder - encodes long URLs into short ones. May use a hash based encoding with character truncation. To avoid collision, the short URL can be appended with a hexadecimal timestamp -> leading to unique short URL for every long URL. In case of multiple encoders being used, a unique machine identifier can be appended as well to ensure zero collisions.

-- multiple geographies -

Challenge becomes - ensure that no single long URL gets more than one short URL. Can be solved using multiple ways.

Leader follower model - read from any geography. Writes to one. Will lead to slower short URL generation times.
Asynchronous short URL generation - wherein each server in each geography generate the short URL on its own, but the response to the client is delayed pending eventual reconciliation.

Both ways lead to some latency on the POST endpoint. The DB will be eventually consistent and global replication will lead to a slight replication lag, but since we are not overwriting any entry, we will not get a wrong URL - either correct URL or no URL pending replication.

Detailed Component Design

Deep dive into 2-3 key components. Explain how they work, how they scale, discuss tradeoffs, capacity, and any relevant algorithms or data structures.

Encoder - Use to convert a long URL into a short one. Can have multiple algorithms behind it. One would be to hash the long URL, take the first x characters. To avoid collisions, a hexadecimal representation of time can be appended to the short URL. In high scale scenarios, we can horizontally scale the service since this is stateless. However, we need to ensure zero collisions. In a parallel setup, it is possible for two requests of two long URLs to arrive simultaneously with both having the first x characters of the hash the same along with the same timestamp. In that case, we need to append a sort of machine/instance identifier that ensures zero collision.
Server - Checks cache and on a miss, checks the DB for GET pathways. For POST pathways, checks the DB, if hit -> returns the short URL, if miss -> passes request to encoder service. Can scale horizontally since it is stateless.
Cache - can have a LRU/LFU eviction policy. Serves popular URLs quickly. Can use redis for this.
DB - -> a simple no-sql database containing information in two tables - Short URL -> {long_url, visit statistics, metadata}. Long URL -> short URL. First collection will provide efficient lookup, while the second collection will lead to search before encoding a long URL. Alternatively, we can use DB specific nuances to reduce the memory footprint, ex - We can have just the first collection with a GSI on the long_url in DynamoDB.
In case of huge volume -
1. Can spawn multiple servers - since they can scale horizontally.
2. In case of spiky loads - ex. a new movie has just came out - can add a queue to the POST endpoint, the GET endpoint will be served through cache. In case of a huge demand for popular URLs, CDNs should be used to reduce the load on the server as well as to reduce last mile latency.
3. This depends a lot on the traffic pattern to the service.
Latency is reduced through cache and CDN.
The input layer - the server - should also check whether the given URL is correct or not. A well-formed regex can be used.
The storage layer is used for storing the mapping from the short URL to the long URL and is replicated to avoid data loss.
The TTL functionality can be added by adding it to the DB and the cache. Soft/hard deletion can be used based on the choice of technology used. In case of soft deletion being supported, we can limit the growth of data by running a periodic deletion process which removes stale entries.
To avoid overwhelming the system, we can use a rate limiter.
We have multiple instances of the encoder being run. In case one is down, another can be used. Since the encoder is stateless, an instance going down would not cause much issues. In case all the encoder instances go down - we should have a fallback and lightweight algorithm that can run in the server itself to create short URLs.