Designing a Simple URL Shortening Service: A TinyURL Approach - System Design

Requirements

Functional Requirements:

Create a short URL for a given long URL.
Return the long URL associated with a given short URL.
Redirect to long URL when requested by short URL.

Non-Functional Requirements:

Users reside in the same country/region.
Low latency for redirects.
URL creation may be slower.
High availability of the API.

API Design

POST /url?long=

Response:

Status Created

Content-Type string

Body

GET /:short

Response:

Status Redirect

should redirect to long URL

Status NotFound

if long URL not found for provided "short"

High-Level Design

Components

Load balancer.

Manages upstreams to API service instances.

Requests are routed using a consistent hash calculated from the request. This makes effective use of the app-layer cache, since identical requests land on the same API service instance.
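
The routing rule above can be sketched as a small hash ring. This is only an illustration, not the load balancer's actual configuration: the class name `HashRing`, the instance names, the choice of SHA-256, and the number of virtual nodes are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring mapping request keys to API service instances."""

    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring several times (virtual nodes)
        # so that keys spread more evenly across instances.
        ring = []
        for node in nodes:
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(key):
        # Stable 64-bit hash; Python's built-in hash() is randomized per process.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def node_for(self, key):
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["api-1", "api-2", "api-3"])
instance = ring.node_for("/abc12345")  # the same path always maps to the same instance
```

When an instance is removed, only the keys that were mapped to it move to other instances, so the caches on the remaining instances stay warm.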

API service.

Should apply the cache-aside approach and store the most recently used short URLs. This keeps latency low for reads, and therefore for redirects.

On create, when a new short URL is required, the service should take a free one from the database or request one from the Generator service. Since each short URL is allocated only once, this eliminates collisions.

Generator service.

It keeps a pool (for example, 10,000) of "free" records in the database.

It can free existing records regularly if they have not been accessed for too long. (I won't describe this mechanism in detail for now.)

If there are not enough free records in the database, it should create new short URLs in ascending order, increasing the length of short_url when the capacity of the current length is exhausted. The starting length is 8. The number of free short URLs the system starts with depends on the alphabet: 2^64 holds only if each of the 8 symbols is a full byte (256 values); with a URL-safe alphabet of 62 symbols (digits plus upper- and lowercase letters) it is 62^8, about 2.18 x 10^14.

This ensures that, almost always, a new short URL can be found with a single database query.
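
Generating short URLs in ascending order, with the length growing once a length is exhausted, might look like the sketch below. The 62-symbol alphabet and the function names `encode` and `short_urls` are assumptions; the original does not specify an alphabet.

```python
import string

# Assumed alphabet: digits, lowercase, uppercase (62 symbols).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode(n, length):
    """Encode counter n as a fixed-length code, left-padded with the first symbol."""
    chars = []
    for _ in range(length):
        n, rem = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[rem])
    if n:
        raise ValueError("counter does not fit in the requested length")
    return "".join(reversed(chars))


def short_urls(start_length=8):
    """Yield codes in ascending order, moving to the next length when one is exhausted."""
    length = start_length
    while True:
        for n in range(len(ALPHABET) ** length):
            yield encode(n, length)
        length += 1


capacity_at_length_8 = len(ALPHABET) ** 8  # number of distinct 8-symbol codes
```

Under this assumption, the Generator only needs to persist the current length and counter to resume where it left off.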

Database.

Should be optimized for reading.

Sharding for scalability by short_url prefix (4 symbols)

Replication for high-availability (2 secondary replicas)
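
As an illustration of the prefix-based sharding above, the shard for a record can be derived from the first 4 symbols of short_url. The function name `shard_for`, the use of MD5 as a stable hash, and the simple modulo placement are assumptions made for brevity; modulo placement reshuffles most prefixes when the shard count changes, which is why a consistent hash is preferable when shards are added.

```python
import hashlib

PREFIX_LEN = 4  # number of leading symbols used as the shard key


def shard_for(short_url, num_shards):
    """Return the shard index for a short URL, based only on its prefix."""
    prefix = short_url[:PREFIX_LEN]
    # Stable hash of the prefix; built-in hash() differs between processes.
    digest = hashlib.md5(prefix.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_shards
```

All short URLs that share a prefix land on the same shard, so a lookup by short_url touches exactly one shard.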

Key Non-Functional Requirements

Redirect Low Latency

Since users reside in the same region, we expect network latency to be acceptable.

Since the main use case for our system is reading, low latency is achieved with the cache-aside approach in the API server. All active URLs should stay in the app-level cache.
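
A minimal sketch of the cache-aside read path with an LRU cache follows. The names `LRUCache`, `resolve`, and `load_from_db` are hypothetical, and the capacity is an arbitrary example value.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


def resolve(short_url, cache, load_from_db):
    """Cache-aside: try the cache first, fall back to the database, then populate the cache."""
    long_url = cache.get(short_url)
    if long_url is None:
        long_url = load_from_db(short_url)
        if long_url is not None:
            cache.put(short_url, long_url)
    return long_url
```

Only successful lookups are cached here; whether to also cache misses (to absorb repeated requests for unknown short URLs) is a separate design choice.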

High availability

Downtime of one API server instance is not a problem thanks to the load balancer configuration: it simply routes requests to the instances that are still alive.

Downtime of individual database nodes should also not be a problem, since we set up replication and re-election of the primary.

Scalability

We can increase the number of API servers, which increases the total cache capacity as well as the number of simultaneous requests the system can process.

Adding more database shards should be possible if we use a technique such as consistent hashing for sharding, so that only a small part of the data has to be migrated.

Detailed Component Design

# Use Cases

## Create

The API service receives long_url.

If a record for this long_url is found in the database, it is returned in the response.

If no record is found, the server retrieves any free record. To eliminate races, this should be performed as an atomic database operation.

If no free record is found, the API service requests a new record from the Generator service. This should not happen often, so using a single instance of the Generator seems reasonable.
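
One way to make the "retrieve any free record" step atomic is a guarded update that only succeeds while the record is still free. The sketch below uses SQLite purely for illustration; the table name `urls`, the status values 'free' and 'used', and the function name `claim_free_record` are assumptions, not part of the original design.

```python
import sqlite3


def claim_free_record(conn, long_url):
    """Bind long_url to one free short_url; return it, or None if the pool is empty."""
    while True:
        row = conn.execute(
            "SELECT short_url FROM urls WHERE status = 'free' "
            "ORDER BY short_url LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # pool is empty: the caller asks the Generator service
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "UPDATE urls SET long_url = ?, status = 'used' "
                "WHERE short_url = ? AND status = 'free'",
                (long_url, row[0]),
            )
        if cur.rowcount == 1:
            return row[0]
        # Another writer claimed this record first; try the next free one.
```

The `AND status = 'free'` guard is what prevents two concurrent creators from receiving the same short URL: only one of the competing updates changes a row, and the other retries.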

## Read

The API server receives short_url and applies cache-aside:

It checks the LRU cache.

If the record is found, the redirect happens immediately. We should tune the cache to achieve the desired low latency.
If the record is not found, the API server queries the database (for performance, and to avoid overloading primaries, it may query replicas). In this case latency is higher, but this should happen mainly during API server cache warm-up.

One of the responses "Found", "Not found", or "Internal error" is sent.
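
The three outcomes can be mapped to HTTP responses roughly as follows. This sketch assumes a 302 Found redirect (a 301 would be the permanent alternative); `handle_redirect` and the `lookup` callable are hypothetical names.

```python
from http import HTTPStatus


def handle_redirect(short_url, lookup):
    """Return (status, headers) for GET /:short given a lookup function."""
    try:
        long_url = lookup(short_url)
    except Exception:
        # Any storage or cache failure is reported as an internal error.
        return HTTPStatus.INTERNAL_SERVER_ERROR, {}
    if long_url is None:
        return HTTPStatus.NOT_FOUND, {}
    return HTTPStatus.FOUND, {"Location": long_url}
```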

# Database.

Record format: short_url, long_url, status, last_accessed

Sharding should be configured by the prefix of the short_url field.

An index on [short_url_prefix, short_url] allows fast queries that are routed to a single shard.
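
The record format and index could be expressed as the following schema, shown here with SQLite only as an example. The table and index names, the column types, the default status value, and the helper functions are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE urls (
    short_url_prefix TEXT NOT NULL,          -- first 4 symbols of short_url (shard key)
    short_url        TEXT NOT NULL UNIQUE,
    long_url         TEXT,                   -- NULL while the record is free
    status           TEXT NOT NULL DEFAULT 'free',
    last_accessed    INTEGER                 -- Unix timestamp, NULL if never accessed
);
CREATE INDEX idx_prefix_short ON urls (short_url_prefix, short_url);
"""


def open_db(path=":memory:"):
    """Open a database and create the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_free(conn, short_url):
    """Insert a free record, deriving the prefix from the short URL."""
    with conn:
        conn.execute(
            "INSERT INTO urls (short_url_prefix, short_url) VALUES (?, ?)",
            (short_url[:4], short_url),
        )


def find(conn, short_url):
    """Look up a record by prefix and short URL; returns (long_url, status) or None."""
    return conn.execute(
        "SELECT long_url, status FROM urls "
        "WHERE short_url_prefix = ? AND short_url = ?",
        (short_url[:4], short_url),
    ).fetchone()
```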
