Designing a Simple URL Shortening Service: A TinyURL Approach - System Design

Requirements

Functional Requirements:

Long URL --> short URL
Short URL --> long URL
Optional expiration time for a short URL
Analytics tracking for short URL usage

Non-Functional Requirements:

Low latency: the service is simple, so a budget of 200 ms per request is more than enough
High scalability: capacity grows by adding nodes
High availability: the service stays up even when some nodes fail
Data durability: a stored mapping must not be lost

API Design

Create(longUrl string) string: accepts the long URL and returns a short URL.
FindOriginal(shortUrl string) string: accepts the short URL and returns the long URL.
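
The two calls above could be written down as a Go interface. This is a minimal sketch, not a definitive design: the error return values, ErrNotFound, the short.example domain, and the in-memory memShortener are all assumptions added for illustration, and the counter-based key is only a placeholder for the ID generation discussed later. The toy implementation is not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical error for unknown short URLs.
var ErrNotFound = errors.New("short url not found")

// Shortener mirrors the two API calls; the error returns are an assumption.
type Shortener interface {
	Create(longUrl string) (string, error)
	FindOriginal(shortUrl string) (string, error)
}

// memShortener is a toy in-memory implementation, for illustration only.
type memShortener struct {
	next    int
	byShort map[string]string
}

func newMemShortener() *memShortener {
	return &memShortener{byShort: make(map[string]string)}
}

// Create stores the mapping under a placeholder counter-based key.
func (m *memShortener) Create(longUrl string) (string, error) {
	if longUrl == "" {
		return "", errors.New("empty url")
	}
	m.next++
	short := fmt.Sprintf("https://short.example/%d", m.next)
	m.byShort[short] = longUrl
	return short, nil
}

// FindOriginal returns the long URL, or ErrNotFound for an unknown key.
func (m *memShortener) FindOriginal(shortUrl string) (string, error) {
	long, ok := m.byShort[shortUrl]
	if !ok {
		return "", ErrNotFound
	}
	return long, nil
}

func main() {
	var s Shortener = newMemShortener()
	short, _ := s.Create("https://example.com/a/very/long/path")
	long, err := s.FindOriginal(short)
	fmt.Println(short, long, err)
}
```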

High-Level Design

Components needed:

CDN
Load balancer
API gateway

Services:

Shortening service (long to short)
Finding service (short to long)
ID generation service

Storage:

RDBMS
Cache
Analytics store

Detailed Component Design

The load balancer can be scaled vertically by adding more hardware or horizontally by adding more nodes. All load balancer nodes need to read the same server list, which can be kept in etcd or a database. The load balancer also runs health checks so that failed nodes stop receiving traffic. Several routing algorithms can be used here, such as round robin or least connections. Round robin works well in simple scenarios, but it does not know whether a node is actually busy, and some requests take much longer than others. Least connections sends each request to the node with the fewest open connections, but a connection count is only a proxy for load: a node with few connections can still be overloaded. Weighted least connections, which divides the connection count by a per-node capacity weight, is a reasonable option when nodes differ in size.
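
The weighted least connections rule can be sketched in a few lines. The Backend struct and its fields are hypothetical names chosen for this example, and the sketch assumes the health check result and connection counts are already available to the picker.

```go
package main

import "fmt"

// Backend is a hypothetical view of one server as the load balancer sees it.
type Backend struct {
	Name    string
	Weight  int  // relative capacity, must be > 0
	Active  int  // connections currently open
	Healthy bool // result of the latest health check
}

// pickWeightedLeastConn returns the index of the healthy backend with the
// lowest Active/Weight ratio, or -1 if no backend is usable.
func pickWeightedLeastConn(backends []Backend) int {
	best := -1
	for i, b := range backends {
		if !b.Healthy || b.Weight <= 0 {
			continue
		}
		if best == -1 {
			best = i
			continue
		}
		// Compare Active/Weight ratios by cross-multiplying,
		// which avoids floating point.
		if b.Active*backends[best].Weight < backends[best].Active*b.Weight {
			best = i
		}
	}
	return best
}

func main() {
	backends := []Backend{
		{Name: "a", Weight: 1, Active: 3, Healthy: true},  // ratio 3
		{Name: "b", Weight: 4, Active: 8, Healthy: true},  // ratio 2
		{Name: "c", Weight: 2, Active: 1, Healthy: false}, // skipped
	}
	fmt.Println(backends[pickWeightedLeastConn(backends)].Name) // prints "b"
}
```

Node b has the most open connections but also four times the capacity of node a, so its load ratio is lower and it is picked. Plain least connections would have picked a.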

The database can be scaled vertically or horizontally: vertically by adding more hardware to a node, horizontally by adding read and write replicas. Running multiple nodes also supports high availability. A cache sits in front of the database and holds recently used mappings. The access pattern is simple (one short key maps to one long URL), so each mapping can be cached as-is with an expiration time.

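ID generation is important here: uniqueness and unpredictability are crucial. We could simply generate a random string of length 10 over a 64-character alphabet. The alphabet should be the URL-safe base64 variant, because standard base64 contains "+" and "/", which have special meaning in URLs. The key space is 64^10 = 2^60, about 1.15 x 10^18, which is extremely large. If there is a collision, we just retry; with one billion keys already stored, a fresh key collides with probability 10^9 / 2^60, roughly one in a billion. Generation is extremely fast and, most importantly, the key is not derived from the original long URL, so it reveals nothing about it. Keys are also hard to guess, provided the random source is cryptographically secure.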
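
A minimal sketch of this scheme, assuming the URL-safe base64 alphabet and Go's crypto/rand as the random source. The taken map is a hypothetical stand-in for a uniqueness check against the database (for example a unique index on the key column), and maxTries is an illustrative parameter.

```go
package main

import (
	"crypto/rand"
	"errors"
	"fmt"
)

// alphabet is the 64-character URL-safe base64 alphabet.
const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

const keyLen = 10

// randomKey returns keyLen random characters from alphabet.
// Masking each byte with 63 keeps 6 bits; since 256 is a multiple of 64,
// every character is equally likely.
func randomKey() (string, error) {
	buf := make([]byte, keyLen)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	for i, b := range buf {
		buf[i] = alphabet[b&63]
	}
	return string(buf), nil
}

// newUniqueKey retries on collision, up to maxTries attempts.
// taken stands in for the database's uniqueness check.
func newUniqueKey(taken map[string]bool, maxTries int) (string, error) {
	for i := 0; i < maxTries; i++ {
		k, err := randomKey()
		if err != nil {
			return "", err
		}
		if !taken[k] {
			taken[k] = true
			return k, nil
		}
	}
	return "", errors.New("could not find a free key")
}

func main() {
	taken := map[string]bool{}
	k, err := newUniqueKey(taken, 3)
	fmt.Println(k, err)
}
```

In a real deployment the check and the insert would have to be one atomic step, so that two services generating the same key at the same time cannot both succeed.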