Designing A Simple Url Shortening Service A TinyURL Approach - System Design

My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach

by flare3451

Requirements

Functional Requirements:

Create a short URL for a given long URL.
Return the long URL associated with a given short URL.

Non-Functional Requirements:

Scalability: horizontal scaling
- Load
  - 500 million short url per day = 6000 write rps
  - 1 billion long url per day = 12000 rps
- Storage
  - Retention 1 week
  - 100 bytes per url
  - 350 GB
Availability - highly available - 99.99 uptime
Consistency - NA since we have immutable data.
Reliable - Durable data, cannot afford losing data
Latency - low latency for good user experience. < 100 ms

API Design

APIs

create short url
- POST /shorten {longUrl: xyz.com} returns {shortUrl: xyz}
redirect
- GET /{shortUrl} returns HTTP 1.1 302 Found Location: xyz.com

Database

MongoDb
- shortUrl, longUrl, createdDate
SQL to keep 3.5 trillion unique short urls

High-Level Design

Shorten:

shorten requests from the clients go through Edge Servers so that they cache the response locally
the request goes through load balancer for load distribution and rate limiting and land on Shortening Service
Shortening service gets unused short url form the sql, saves it to Database and responds to the client
It asynchronously caches the short url.
It asynchronously sends data to analytics.

Redirect:

redirect requests are handled by edge servers on a cache hit
else it forwards the request to a redirect service
redirect service fetches the response from a cache, if missed gets it from db.
asynchronously sends data to analytics

Short url generation:

with 7 words length we can have 3.5 Trillion unique short urls
we will store them in a SQL store with a boolean to indicate if the url is used or not
retrieve unused url and assign it to the long url

Detailed Component Design

Low latency: geographically distributed Edge Servers that also act as a read through cache
Rate Limiting: Use Token Bucket Algorithm with UserId as the key identifier. This handles burst traffic.
- Example configuration - 5 tokens per user with Retry After one minute.
Load Distribution: Distribute based on least connections active.
Auto Scaling: Application servers configured to auto scale (scale up when >60% cpu, scale down when <30% cpu)
HotSpot mitigation:
- data skew: mitigated by using high cardinality key like shortUrl
- access skew: mitigated by using cache with LFU eviction
ID generation:
- collision: avoided by using a SQL database which consists of all unique urls using 7 alphanumeric letters. All access to this data is transactional which avoids ID collisions. We can have a one master multiple replica setup where writes go through master. When master goes down one of the replicas will be elected as the master based on quorum mechanism to avoid any split brain scenarios.
- predictable: shortUrls are not predictable since they are randomly assigned from a pool of unique urls.
Caching:
- Invalidation: When a long url is updated or deleted, we refresh that data from the cache.
- Eviction: LFU with TTL
- Cache Miss scenario: Use a cache aside strategy to write to cache. On cache miss, application reads the data from DB and updates the cache.
- Hot Keys: In order to handle popular short links, we can follow below strategies
  - Edge Servers: will be the first line of defence for handling hot keys.
  - Local Caching: The redirect service will have a scheduler to get most popular short links and hold it in their RAM.
  - Request Coalescing: Allow one request to rebuild cache, others wait

Markdown supported