Design Pastebin - System Design

Requirements

Functional Requirements:

Allow users to upload and store text or code snippets.
Generate a unique shareable URL for each paste.
Enable retrieval of paste content by URL.
Support a configurable expiration time (TTL) for pastes.
Allow paste owners or the system to delete a paste before its natural expiration.

Non-Functional Requirements:

Availability: The service must be highly available; users should always be able to read existing pastes.
Performance: Read operations must have very low latency.
Scalability: Must handle a read-heavy workload (e.g., a 100:1 read-to-write ratio) and a large number of concurrent requests.
Reliability: Uploaded pastes must not be lost before their expiration time.
Resilience & Fault Tolerance: The system must continue to operate smoothly even if individual nodes or datacenters fail.

API Design

To ensure robust data handling and address critical operational edge cases, the RESTful APIs are structured as follows:

1.CreatePaste(api_dev_key, paste_content, expiration_date, idempotency_key)
Edge Cases Handled (Rate Limiting & Idempotency): The API Gateway enforces a Token Bucket rate limit on this creation endpoint (tracked per IP or API key) to limit spam, abuse, and malicious bulk write attacks. Additionally, the idempotency_key ensures that if a client retries after a network timeout, the server returns the original paste instead of creating a duplicate.
2.GetPaste(api_paste_key)
Returns the raw paste text. Edge Case Handled (Rate Limiting): Read operations are also rate limited at the API Gateway to deter targeted scraping.
3.DeletePaste(api_dev_key, api_paste_key)
Executes early manual deletion from storage and invalidates the cache immediately.
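
The rate limiting and idempotency behavior described for CreatePaste can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a definitive implementation: the class names (TokenBucket, CreatePasteHandler, RateLimited), the capacity and refill values, and the counter-based paste keys are all assumptions made for the example, and expiration_date is omitted for brevity. In a real deployment the bucket and idempotency state would live in a shared store rather than in process memory.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens and refills at `refill_per_sec` tokens per second."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity
        self.updated_at = now

    def allow(self, now: float) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimited(Exception):
    """Raised when a caller has no tokens left."""


class CreatePasteHandler:
    """Hypothetical handler combining a per-key rate limit with idempotent creation."""

    def __init__(self, capacity: float = 2, refill_per_sec: float = 1.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.buckets = {}  # api_dev_key -> TokenBucket
        self.seen = {}     # (api_dev_key, idempotency_key) -> paste key
        self.pastes = {}   # paste key -> content

    def create_paste(self, api_dev_key, paste_content, idempotency_key, now=None):
        now = time.monotonic() if now is None else now
        bucket = self.buckets.setdefault(
            api_dev_key, TokenBucket(self.capacity, self.refill_per_sec, now)
        )
        if not bucket.allow(now):
            raise RateLimited(api_dev_key)

        # A retry with the same idempotency key returns the original paste.
        dedup = (api_dev_key, idempotency_key)
        if dedup in self.seen:
            return self.seen[dedup]

        # Assumption: a simple counter stands in for a key from the KGS.
        paste_key = f"paste{len(self.pastes) + 1}"
        self.pastes[paste_key] = paste_content
        self.seen[dedup] = paste_key
        return paste_key
```

In this sketch the rate limit is checked before the idempotency lookup, mirroring a gateway that sits in front of the application servers, so retries also consume tokens. Whether retries should be exempt is a policy decision the design does not specify.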

High-Level Design

The architecture relies on decoupled microservices to optimize both the read and write paths:

1.API_Gateway: Centralized entry point handling SSL termination, rate limiting, and request routing.

2.App_Servers: Horizontally scaling, stateless instances executing the core paste business logic.

3.Key_Generation_Service(KGS): A standalone, isolated service that continuously pre-generates unique 7-character Base62 strings (62^7, roughly 3.5 trillion possible keys). It stores them in its own dedicated KGS database, so assigning a URL key at write time is an O(1) operation with no collision check on the request path.

4.Caching_Layer: A distributed Redis or Memcached cluster sitting in front of the database to absorb read traffic, which accounts for roughly 99% of requests at a 100:1 read-to-write ratio.

5.Metadata_Database: A NoSQL datastore (e.g., DynamoDB or Cassandra) housing paste metadata (URL hash, TTL, author, and storage pointers).

6.Object_Storage: Cloud-native blob storage (e.g., Amazon S3) containing the raw text or code snippets.
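
To make the Key Generation Service more concrete, here is a minimal Python sketch of pre-generating 7-character Base62 keys and handing them out in O(1). The class name KeyGenerationService, the method names, and the batch size are assumptions for illustration; an in-memory set and deque stand in for the KGS database. The key space is 62^7 = 3,521,614,606,208 keys.

```python
import secrets
import string
from collections import deque

BASE62 = string.digits + string.ascii_letters  # 10 digits + 52 letters = 62 symbols
KEY_LENGTH = 7


class KeyGenerationService:
    """Hypothetical KGS: uniqueness is checked when keys are generated, not when issued."""

    def __init__(self, batch_size: int = 1000):
        self.batch_size = batch_size
        self.known = set()     # every key ever generated (stand-in for the KGS database)
        self.unused = deque()  # keys ready to be handed out

    def _random_key(self) -> str:
        return "".join(secrets.choice(BASE62) for _ in range(KEY_LENGTH))

    def refill(self, count: int) -> None:
        # Runs off the request path, so the duplicate check here costs writers nothing.
        added = 0
        while added < count:
            key = self._random_key()
            if key in self.known:
                continue
            self.known.add(key)
            self.unused.append(key)
            added += 1

    def next_key(self) -> str:
        # O(1) pop; refill only if the pool has run dry.
        if not self.unused:
            self.refill(self.batch_size)
        return self.unused.popleft()
```

A production KGS would refill in the background before the pool empties and would persist issued keys so that a restart cannot hand out the same key twice; both are left out of this sketch.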

Detailed Component Design

To meet the non-functional requirements and guard against extreme operational conditions, the components are designed as follows:

1.Performance_&_Scalability:

We implement a global CDN to push popular, static pastes directly to the network edge, drastically reducing origin server load.
Because the KGS generates keys ahead of time, write latency stays low: the application server simply pops a ready-made key from an in-memory pool instead of hashing the content and checking for collisions during the request.
2.Reliability_&_Fault_Tolerance:
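The NoSQL database is replicated across at least three geographically separate availability zones, so the loss of a single zone does not make pastes unavailable.
Application servers sit behind a Load Balancer deployed as an active-passive pair, so a standby takes over if the active instance fails. Auto-scaling policies add or remove application servers based on CPU utilization.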
3.Edge_Case_Handling(Thundering_Herd_Scenario):
The Threat: When the cache entry for a highly viral paste expires, a very large number of concurrent read requests can miss the cache and hit the NoSQL database at the same moment, overloading or even crashing it (a Cache Stampede).
The Solution: We deploy a Mutex Lock (Cache Lock). Upon a cache miss, only the first thread to request the paste is permitted to query the NoSQL database. All other concurrent threads are blocked and wait briefly (typically a few milliseconds). Once the first thread repopulates the Redis cache, the lock is released, and the waiting threads read from the newly warmed cache. The lock should carry a timeout so that a crashed lock holder cannot block readers indefinitely.
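
The mutex approach can be illustrated with a small single-process Python sketch. It is an illustration under assumptions, not the production design: StampedeProtectedCache and load_from_db are hypothetical names, a dict stands in for Redis, and threading.Lock stands in for a distributed lock. Across multiple application servers the lock would need to live in a shared store, for example a Redis key set with the NX option and an expiry.

```python
import threading


class StampedeProtectedCache:
    """On a miss, only one thread per key loads from the database; the rest wait."""

    def __init__(self, load_from_db):
        self.load_from_db = load_from_db  # hypothetical callable: key -> value
        self.cache = {}                   # stand-in for the Redis cache
        self.key_locks = {}               # one lock per paste key
        self.guard = threading.Lock()     # protects key_locks itself

    def _lock_for(self, key):
        with self.guard:
            return self.key_locks.setdefault(key, threading.Lock())

    def get(self, key):
        value = self.cache.get(key)
        if value is not None:
            return value  # fast path: cache hit, no locking

        with self._lock_for(key):
            # Re-check after acquiring the lock: another thread may have
            # repopulated the cache while this one was waiting.
            value = self.cache.get(key)
            if value is None:
                value = self.load_from_db(key)
                self.cache[key] = value
            return value
```

The second cache check inside the lock is what keeps the waiting threads off the database: each of them acquires the lock in turn, finds the value already cached, and returns without issuing a query.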
4.Edge_Case_Handling(Resilient_TTL_Cleanup):
Expired pastes are not deleted synchronously during user requests, since doing so would add latency to the read path.
A background worker (Cron service) asynchronously scans the NoSQL database for pastes whose TTL has passed, purges their metadata, and marks the corresponding Object Storage blobs for deletion.
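
One pass of such a cleanup worker might look like the following Python sketch. The PasteMetadata fields, the purge_expired function, and the batch size are assumptions for illustration, and plain dicts stand in for the NoSQL database and Object Storage. Note that the sketch deletes the blob before the metadata record, so that a crash mid-run leaves a record the next run can retry rather than an orphaned blob; that ordering is a choice made here, not something the design above prescribes.

```python
from dataclasses import dataclass


@dataclass
class PasteMetadata:
    paste_key: str
    expires_at: float   # expiry as a Unix timestamp (assumed representation)
    blob_pointer: str   # key of the raw content in object storage


def purge_expired(metadata_db: dict, object_store: dict, now: float,
                  batch_size: int = 1000) -> list:
    """Delete up to batch_size expired pastes and return their keys."""
    expired = [m for m in metadata_db.values() if m.expires_at <= now][:batch_size]
    for meta in expired:
        # Blob first, metadata second: a failure in between is retried next run.
        object_store.pop(meta.blob_pointer, None)
        del metadata_db[meta.paste_key]
    return [m.paste_key for m in expired]
```

Capping each run at a batch size keeps a large backlog of expired pastes from monopolizing the database; the worker simply picks up the remainder on its next scheduled run.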