Designing A Simple Url Shortening Service A TinyURL Approach - System Design

Requirements

Functional Requirements:

Create a short URL for a given long URL.
Return the long URL associated with a given short URL.
Support custom URL

Non-Functional Requirements:

High availability
Low redirect latency
Horizontal scalability

API Design

Define the APIs expected from the system. This is your chance to analyze and define the read and write paths so that you can come up with the high-level design...

1) Endpoints

URL shortening endpoint
URL redirect endpoint
custom URL endpoint

2) HTTP methods

POST for creating a short URL
GET for retrieving the original URL from a short URL

3) Request and Response Formats

Expected input JSON for creating a short URL
Output format that includes shortened URL

4) Error Handling

5) Authentication and Security

High-Level Design

Describe the overall system architecture. Identify the main components needed to solve the problem end-to-end. Use the diagramming tool to create a block diagram.

We need microservices where each service (Like URL creation and redirection) operates independent. To ensure unique identifiers to avoid collisions the service can create a hash. We need an API gateway to manage requests. To improve performance and reduce latency we can have a dedicated service for URL shortening that interacts with a caching layer. If a user requests that same URL we can respond back with the cache and have a TTL of 24 hours so that it doesn't become stale. In order to handle data storage, we can use a NoSQL database since we don't need a relational database and we can scale horizontal if needed.

Detailed Component Design

Deep dive into 2-3 key components. Explain how they work, how they scale, discuss tradeoffs, capacity, and any relevant algorithms or data structures.

Shortner and Redirect Service
- The shortner service shortens the URL via a hash function and when done stores it into the database. It makes IDs harder to guess since hash functions help insure uniqueness through randomness.
- When a link is updated or disabled we update the status in the database as well as make sure it is updated in the cache
- When we need to recall the longer URL from the shortened URL we can just look in the database mapping
- Rate limiting can help keep the server from overloading
- We can load balance to prevent overload on any single shard
- For the fallback mechanism during generator failures, you can design a distributed ID generation strategy using a combination of a central generator and a local fallback mechanism. For example, if the central generator is unavailable, each node can generate IDs using a local algorithm that combines a unique node identifier with a timestamp or a random component. This ensures that even in the event of a central generator failure, the system can continue to create unique IDs without duplicates.
- Regarding cache invalidation, specify a strategy for purging or updating the cache when links are modified or disabled. For instance, you could implement a publish-subscribe model where the shortener service publishes an event when a link is updated or deleted, and the caching layer subscribes to these events to invalidate or refresh the relevant cache entries. This ensures that users always receive the most accurate data.
- For hotspot mitigation, consider implementing sharding based on the hash of the short URL. This way, even if certain IDs become extremely popular, the load can be distributed across multiple shards. Additionally, you could use a caching layer to store frequently accessed links, which would help alleviate pressure on the database during traffic spikes.
- Finally, for rate limiting, you can enhance your design by specifying a token bucket or sliding window algorithm. These algorithms can help manage burst traffic more effectively by allowing short bursts of requests while still enforcing an overall limit. Additionally, consider implementing a Retry-After header in your responses to inform clients when they should retry their requests after being rate-limited.

To enhance your design regarding cache management, consider specifying an eviction policy that balances memory usage and data freshness effectively. One common approach is to implement a Least Recently Used (LRU) eviction policy. This policy removes the least recently accessed entries from the cache when it reaches its memory limit, ensuring that frequently accessed data remains available while older, less relevant data is purged.

You can also define a maximum cache size in terms of memory or entry count, which would trigger the eviction process. This way, you can maintain a balance between memory consumption and the freshness of the data. Additionally, consider implementing a hybrid approach where entries that are nearing their TTL are prioritized for eviction, ensuring that stale data is removed before it expires.

NoSQL Database
- Unlike a SQL Database we only need a one to one mapping
- We can scale horizontally if needed, something that SQL Databases struggle with