Design Pastebin - System Design

Requirements

Allow users to upload and store text or code snippets.
Generate a unique shareable URL for each paste.
Enable retrieval of paste content by URL.
Support a configurable expiration time (TTL) for each paste.
Allow paste owners or the system to delete a paste before its natural expiration.

Availability: The service must be highly available; users should always be able to read existing pastes.

Performance: Read operations must have extremely low latency.
Scalability: Must handle a high read-to-write ratio (e.g., 100:1) and a large volume of concurrent requests.
Reliability: Uploaded pastes must not be lost before their expiration time.
Resilience & Fault Tolerance: The system must continue to operate smoothly even if individual nodes or datacenters fail.

We expose a simple RESTful API for our read and write paths.

- POST /api/v1/paste
  Payload: { "text": "string", "expiration_minutes": int, "custom_alias": "string" (optional) }
  Response: 201 Created with { "url": "https://paste.bin/xyz123" }
- GET /api/v1/paste/{url_hash}
  Response: 200 OK with { "text": "string", "created_at": timestamp, "expires_at": timestamp }
- DELETE /api/v1/paste/{url_hash}
  Response: 200 OK or 204 No Content
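
As a rough illustration of the write endpoint's contract, the sketch below validates a request payload carrying the three fields listed above. The names `PasteRequest` and `validate_paste_request`, the size limit, and the alias rules are assumptions made for the example; the design does not specify them.

```python
import re
from dataclasses import dataclass
from typing import Optional

MAX_TEXT_BYTES = 1_000_000  # assumed upper bound on paste size; not given in the design
ALIAS_PATTERN = re.compile(r"[A-Za-z0-9_-]{3,32}")  # assumed alias rules


@dataclass(frozen=True)
class PasteRequest:
    text: str
    expiration_minutes: int
    custom_alias: Optional[str] = None


def validate_paste_request(payload: dict) -> PasteRequest:
    """Return a PasteRequest, or raise ValueError (a handler would map this to 400)."""
    text = payload.get("text")
    if not isinstance(text, str) or not text:
        raise ValueError("text must be a non-empty string")
    if len(text.encode("utf-8")) > MAX_TEXT_BYTES:
        raise ValueError("text is too large")

    minutes = payload.get("expiration_minutes")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(minutes, int) or isinstance(minutes, bool) or minutes <= 0:
        raise ValueError("expiration_minutes must be a positive integer")

    alias = payload.get("custom_alias")
    if alias is not None and not (isinstance(alias, str) and ALIAS_PATTERN.fullmatch(alias)):
        raise ValueError("custom_alias must be 3-32 URL-safe characters")

    return PasteRequest(text=text, expiration_minutes=minutes, custom_alias=alias)
```

Rejecting malformed input before any key is allocated keeps bad requests from consuming keys or storage.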

The system is divided into a read path and a write path, both fronted by an API Gateway that routes and load-balances incoming requests.

Write Path: When a user submits text, the application server requests a unique hash from the Key Generation Service (KGS). It then stores the hash and its metadata in the primary Database; the text itself is stored either alongside the metadata or, for large pastes, in Object Storage.
Read Path: When a user requests a URL, the application server first queries the Cache. If a cache miss occurs, it queries the Database, updates the Cache, and returns the content to the user.
Cleanup: A background worker periodically sweeps the database for pastes whose TTL has expired and purges those records to free up space.
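
The write and read paths described above can be sketched as follows. This is a minimal in-memory model, not a production implementation: plain dicts stand in for the Cache and the Database, an iterator of strings stands in for the KGS, and the class name `PasteService` is hypothetical.

```python
import time
from typing import Optional


class PasteService:
    """Toy read/write path; dicts stand in for the cache and the database."""

    def __init__(self, keys):
        self._keys = iter(keys)   # stand-in for the KGS
        self.db = {}              # hash -> (text, expires_at)
        self.cache = {}           # hash -> (text, expires_at)
        self.db_reads = 0         # counter used only to observe cache behavior

    def create(self, text: str, ttl_seconds: float, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        key = next(self._keys)                    # write path: take a pre-generated key
        self.db[key] = (text, now + ttl_seconds)  # persist text with its expiry time
        return key

    def read(self, key: str, now: Optional[float] = None) -> Optional[str]:
        now = time.time() if now is None else now
        entry = self.cache.get(key)               # 1. try the cache
        if entry is None:
            self.db_reads += 1
            entry = self.db.get(key)              # 2. cache miss: query the database
            if entry is None:
                return None
            if entry[1] > now:
                self.cache[key] = entry           # 3. populate the cache for later reads
        text, expires_at = entry
        return text if now < expires_at else None
```

With a 100:1 read-to-write ratio, most calls to `read` for a popular paste would take the cache branch and never reach the database.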

To meet our strict system requirements, the architecture utilizes the following components and strategies:

Function: The Key Generation Service (KGS) pre-generates random 6-character base62 strings (62^6, about 56.8 billion possible keys) and stores them in a dedicated KGS database.
Performance: By pre-generating keys, we eliminate the latency of checking for collisions during a write request.
Scalability: KGS can be distributed. Application servers can cache a chunk of keys locally to avoid network calls for every single paste.
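
A minimal sketch of this idea follows, assuming an in-memory set in place of the KGS database. The class names `KeyGenerationService` and `LocalKeyCache` and the chunk size are illustrative choices, not part of the design.

```python
import secrets
import string
import threading

BASE62 = string.digits + string.ascii_letters  # 62 symbols
KEY_LENGTH = 6
KEYSPACE = len(BASE62) ** KEY_LENGTH  # 62**6 = 56,800,235,584 possible keys


class KeyGenerationService:
    """In-memory stand-in for the KGS and its key database."""

    def __init__(self):
        self._issued = set()           # every key ever handed out
        self._lock = threading.Lock()  # serialize allocation across callers

    def allocate_chunk(self, size: int) -> list:
        """Return `size` keys that have never been issued before."""
        chunk = []
        with self._lock:
            while len(chunk) < size:
                key = "".join(secrets.choice(BASE62) for _ in range(KEY_LENGTH))
                if key not in self._issued:  # collision check happens here, off the write path
                    self._issued.add(key)
                    chunk.append(key)
        return chunk


class LocalKeyCache:
    """Per-application-server buffer of keys, refilled one chunk at a time."""

    def __init__(self, kgs: KeyGenerationService, chunk_size: int = 1000):
        self._kgs = kgs
        self._chunk_size = chunk_size
        self._keys = []

    def next_key(self) -> str:
        if not self._keys:  # only this refill costs a call to the KGS
            self._keys = self._kgs.allocate_chunk(self._chunk_size)
        return self._keys.pop()
```

One trade-off of local chunks: if an application server crashes, the unused keys in its buffer are never handed out. With a keyspace this large, that loss is usually considered acceptable.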

Metadata DB (NoSQL): We use a NoSQL database like Cassandra or DynamoDB to store the metadata (hash, expiration, user_id).
- Scalability: These stores partition data by key, so capacity for key-value lookups grows horizontally as nodes are added.
- Reliability & Fault Tolerance: Data is replicated across multiple nodes and Availability Zones. If one node goes down, another serves the data.
Object Storage (S3): If pastes exceed a certain size (e.g., > 10KB), the raw text is pushed to Object Storage, and only the S3 link is stored in the NoSQL DB. This keeps the database lean and performant.
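
The size-based split can be sketched as below. Two dicts stand in for the NoSQL table and the object-storage bucket, and the `pastes/<hash>` key layout is a hypothetical choice; a real deployment would call the database and object-store clients instead.

```python
from dataclasses import dataclass
from typing import Optional

INLINE_LIMIT_BYTES = 10 * 1024  # the "> 10KB" example threshold from the text


@dataclass
class PasteRecord:
    url_hash: str
    inline_text: Optional[str] = None  # set for small pastes
    object_key: Optional[str] = None   # set for large pastes


class PasteStore:
    def __init__(self):
        self.metadata = {}  # stand-in for the NoSQL table
        self.objects = {}   # stand-in for the object-storage bucket

    def put(self, url_hash: str, text: str) -> PasteRecord:
        data = text.encode("utf-8")
        if len(data) > INLINE_LIMIT_BYTES:
            object_key = f"pastes/{url_hash}"  # hypothetical key layout
            self.objects[object_key] = data    # large body goes to object storage
            record = PasteRecord(url_hash, object_key=object_key)
        else:
            record = PasteRecord(url_hash, inline_text=text)  # small body stays inline
        self.metadata[url_hash] = record
        return record

    def get(self, url_hash: str) -> Optional[str]:
        record = self.metadata.get(url_hash)
        if record is None:
            return None
        if record.object_key is not None:
            return self.objects[record.object_key].decode("utf-8")
        return record.inline_text
```

The threshold is compared against the encoded byte length rather than the character count, since storage cost depends on bytes.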

Function: A Redis or Memcached cluster sits in front of the database as a read cache.
Performance: Caching the most frequently accessed pastes (with an LRU eviction policy) lets most reads be served from memory, typically in about a millisecond or less, without touching the database.
Resilience: If the cache goes down, the system falls back to the NoSQL database. While latency might spike, the system remains available.
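
To make the LRU policy and the fallback behavior concrete, here is a small sketch. `LRUCache` is a toy in-process cache rather than a Redis or Memcached client, and the built-in `ConnectionError` is used as a stand-in for whatever exception a real cache client raises when the cluster is unreachable.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache: the least recently used entry is evicted first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


def read_paste(key, cache, db):
    """Try the cache; on a miss or a cache error, serve from the database."""
    try:
        value = cache.get(key)
        if value is not None:
            return value
    except ConnectionError:
        cache = None  # cache is down: skip it for the rest of this request
    value = db.get(key)
    if value is not None and cache is not None:
        try:
            cache.put(key, value)  # warm the cache for later reads
        except ConnectionError:
            pass  # a failed cache write must not fail the read
    return value
```

The key point is that cache errors are caught and treated like misses, so a cache outage degrades latency but not availability.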

Function: The cleanup service is a distributed cron job that periodically scans for expired pastes and deletes them.
Resilience: Correctness does not depend on the sweep having run. Deleting synchronously on the read path would slow reads down, so we also use lazy deletion: if a user requests an expired paste, the server checks the TTL, returns a 404, and deletes the paste asynchronously.
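
The two cleanup mechanisms can be sketched together as follows. The class `ExpiringStore`, its in-memory dict, and the list used as a deletion queue are assumptions for illustration; a real system would use the database and a message queue or background task runner.

```python
import time


class ExpiringStore:
    def __init__(self):
        self.rows = {}          # url_hash -> (text, expires_at)
        self.delete_queue = []  # hashes awaiting asynchronous deletion

    def put(self, url_hash, text, expires_at):
        self.rows[url_hash] = (text, expires_at)

    def read(self, url_hash, now=None):
        """Return (status_code, text). Expired pastes look like missing ones."""
        now = time.time() if now is None else now
        row = self.rows.get(url_hash)
        if row is None:
            return 404, None
        text, expires_at = row
        if expires_at <= now:
            self.delete_queue.append(url_hash)  # lazy deletion: defer the actual delete
            return 404, None
        return 200, text

    def drain_delete_queue(self):
        """Run by a background worker, off the request path."""
        while self.delete_queue:
            self.rows.pop(self.delete_queue.pop(), None)

    def sweep(self, now=None):
        """Periodic job: purge expired pastes that nobody has requested."""
        now = time.time() if now is None else now
        expired = [h for h, (_, exp) in self.rows.items() if exp <= now]
        for h in expired:
            del self.rows[h]
        return len(expired)
```

Because `read` checks the TTL itself, an expired paste is never served even if neither the queue worker nor the sweep has run yet.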