Requirements


Functional Requirements:


  • Allow users to upload and store text or code snippets.
  • Generate a unique shareable URL for each paste.
  • Enable retrieval of paste content by URL.
  • Support paste expiration through a configurable time-to-live (TTL).
  • Allow paste owners or the system to delete a paste before its natural expiration.



Non-Functional Requirements:


  • Availability: The service must be highly available; users should always be able to read existing pastes.
  • Performance: Read operations must have extremely low latency.
  • Scalability: Must handle a high read-to-write ratio (e.g., 100:1) and a large volume of concurrent requests.
  • Reliability: Uploaded pastes must not be lost before their expiration time.
  • Resilience & Fault Tolerance: The system must continue to operate smoothly even if individual nodes or datacenters fail.


API Design

We expose a simple RESTful API for our read and write paths.

    • POST /api/v1/paste
      Payload: { "text": "string", "expiration_minutes": int, "custom_alias": "string" (optional) }
      Response: 201 Created with { "url": "https://paste.bin/xyz123" }
    • GET /api/v1/paste/{url_hash}
      Response: 200 OK with { "text": "string", "created_at": timestamp, "expires_at": timestamp }
    • DELETE /api/v1/paste/{url_hash}
      Response: 200 OK or 204 No Content
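
The endpoint contract can be illustrated with a minimal in-memory sketch. It is not tied to any web framework; the class name `PasteApi`, the `generated_key` parameter, and the 400, 404, and 409 error responses are assumptions added for illustration and are not part of the API listed here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Paste:
    text: str
    created_at: datetime
    expires_at: datetime


class PasteApi:
    """In-memory stand-in for the three endpoints. Each method returns (status, body)."""

    BASE_URL = "https://paste.bin/"  # host taken from the example response

    def __init__(self):
        self._pastes = {}  # url_hash -> Paste

    def create(self, generated_key, text, expiration_minutes, custom_alias=None, now=None):
        # POST /api/v1/paste
        now = now or datetime.now(timezone.utc)
        if not text or expiration_minutes <= 0:
            return 400, {"error": "invalid payload"}  # assumed error shape
        url_hash = custom_alias or generated_key
        if url_hash in self._pastes:
            return 409, {"error": "alias already in use"}  # assumed error shape
        expires_at = now + timedelta(minutes=expiration_minutes)
        self._pastes[url_hash] = Paste(text, now, expires_at)
        return 201, {"url": self.BASE_URL + url_hash}

    def get(self, url_hash, now=None):
        # GET /api/v1/paste/{url_hash}
        now = now or datetime.now(timezone.utc)
        paste = self._pastes.get(url_hash)
        if paste is None or paste.expires_at <= now:
            return 404, {"error": "not found"}  # expired pastes look missing
        return 200, {
            "text": paste.text,
            "created_at": paste.created_at.isoformat(),
            "expires_at": paste.expires_at.isoformat(),
        }

    def delete(self, url_hash):
        # DELETE /api/v1/paste/{url_hash}
        if self._pastes.pop(url_hash, None) is None:
            return 404, {"error": "not found"}
        return 204, None
```

For example, `PasteApi().create("xyz123", "hello", 10)` returns `201` together with a body containing the shareable URL, and a later `get("xyz123")` returns the stored text until the expiration time passes.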



High-Level Design

The architecture relies on decoupled microservices to optimize both the read and write paths:

1. API Gateway: Centralized entry point handling SSL termination, rate limiting, and request routing.

2. App Servers: Horizontally scalable, stateless instances executing the core paste business logic.

3. Key Generation Service (KGS): A standalone, isolated service that continuously pre-generates unique 7-character Base62 strings. It stores them in a dedicated KGS database, so URL assignment runs in O(1) time without collision checks at request time.

4. Caching Layer: A distributed Redis or Memcached cluster sitting in front of the database to absorb the read-heavy traffic (about 99% of requests at a 100:1 read-to-write ratio).

5. Metadata Database: A NoSQL datastore (e.g., DynamoDB or Cassandra) housing paste metadata (URL hash, TTL, author, and storage pointers).

6. Object Storage: Cloud-native blob storage (e.g., Amazon S3) containing the raw text or code snippets.
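
How a request moves through the cache, the metadata database, and object storage can be sketched with plain dictionaries standing in for each component. The class name `PasteStore`, the `pastes/<hash>` blob key layout, the numeric timestamps, and the choice to write the blob before the metadata are all assumptions made for illustration.

```python
class PasteStore:
    """Write and read paths across the three storage components (in-memory stand-ins)."""

    def __init__(self):
        self.cache = {}           # stands in for Redis or Memcached
        self.metadata_db = {}     # stands in for DynamoDB or Cassandra
        self.object_storage = {}  # stands in for S3
        self.db_reads = 0         # counts metadata lookups, for demonstration only

    def write(self, url_hash, text, expires_at):
        # Assumed ordering: store the blob first, then the metadata that points
        # to it, so a reader never finds metadata referencing a missing blob.
        blob_key = f"pastes/{url_hash}"  # hypothetical key layout
        self.object_storage[blob_key] = text
        self.metadata_db[url_hash] = {"blob_key": blob_key, "expires_at": expires_at}

    def read(self, url_hash, now):
        # 1. Try the cache first; most reads should end here.
        cached = self.cache.get(url_hash)
        if cached is not None:
            text, expires_at = cached
            return text if expires_at > now else None

        # 2. Cache miss: look up the metadata, treating expired pastes as missing.
        self.db_reads += 1
        meta = self.metadata_db.get(url_hash)
        if meta is None or meta["expires_at"] <= now:
            return None

        # 3. Follow the storage pointer, then warm the cache for later readers.
        text = self.object_storage[meta["blob_key"]]
        self.cache[url_hash] = (text, meta["expires_at"])
        return text
```

In this sketch, two consecutive reads of the same paste touch the metadata database only once; the second read is served from the cache.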






Detailed Component Design

To meet the non-functional requirements and handle extreme load conditions, the components are designed as follows:

1. Performance & Scalability:

  • We use a global CDN to serve popular, static pastes from the network edge, which substantially reduces origin server load.
  • Because the KGS generates keys asynchronously and ahead of time, write latency stays low: the system simply pops an available key from memory instead of computing a hash and checking it for collisions while the request is in flight.
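
The key pool idea can be sketched as follows. This is a single-process illustration, not the actual KGS: the class name `KeyGenerationService`, the `pool_size` parameter, and the in-memory `_seen` set (standing in for the KGS database) are assumptions. Uniqueness is checked when keys are generated, so handing one out at request time is a constant-time pop.

```python
import secrets
import string
from collections import deque

BASE62 = string.digits + string.ascii_letters  # 0-9, a-z, A-Z: 62 symbols
KEY_LENGTH = 7
KEYSPACE = 62 ** KEY_LENGTH  # 3,521,614,606,208 possible keys


class KeyGenerationService:
    def __init__(self, pool_size=1000):
        self._seen = set()    # every key ever generated (stand-in for the KGS database)
        self._pool = deque()  # keys generated but not yet handed out
        self._pool_size = pool_size
        self.refill()

    def refill(self):
        # Runs ahead of demand; this is the only place collisions are checked.
        while len(self._pool) < self._pool_size:
            key = "".join(secrets.choice(BASE62) for _ in range(KEY_LENGTH))
            if key not in self._seen:
                self._seen.add(key)
                self._pool.append(key)

    def next_key(self):
        # O(1) on the request path. Refilling here is only a fallback; the
        # design assumes the pool is normally topped up in the background.
        if not self._pool:
            self.refill()
        return self._pool.popleft()
```

With 7 Base62 characters the keyspace holds roughly 3.5 trillion keys, so random generation rarely collides while the number of issued keys remains small relative to that total.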
2. Reliability & Fault Tolerance:

  • The NoSQL database is replicated across at least three geographically separate availability zones.
  • Application servers sit behind a load balancer deployed as an active-passive pair, with auto-scaling policies driven by CPU utilization metrics.
3. Edge Case Handling (Thundering Herd Scenario):

  • The Threat: When the cache entry for a highly viral paste expires, millions of concurrent read requests can miss the empty cache and hit the NoSQL database almost simultaneously, overloading or even crashing it (a cache stampede).
  • The Solution: We use a mutex lock (cache lock). On a cache miss, only the first requesting thread is allowed to query the NoSQL database. All other concurrent threads block for a few milliseconds. Once the first thread repopulates the Redis cache, it releases the lock, and the waiting threads read from the freshly warmed cache.
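
The locking pattern can be sketched within a single process using Python threads. The class name `StampedeProtectedCache` and the `load_from_db` callback are assumptions; a multi-server deployment would need a lock shared across servers (for example one held in Redis), which this sketch does not cover.

```python
import threading
import time


class StampedeProtectedCache:
    def __init__(self, load_from_db):
        self._load_from_db = load_from_db      # hypothetical database loader
        self._cache = {}                       # stands in for Redis
        self._locks = {}                       # one lock per paste key
        self._locks_guard = threading.Lock()   # protects the lock table itself

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key):
        value = self._cache.get(key)
        if value is not None:
            return value                       # fast path: cache hit, no locking

        with self._lock_for(key):              # only one thread per key gets past here
            value = self._cache.get(key)       # re-check: another thread may have filled it
            if value is None:
                value = self._load_from_db(key)
                self._cache[key] = value
            return value


if __name__ == "__main__":
    db_calls = []

    def slow_load(key):
        time.sleep(0.05)                       # simulate database latency
        db_calls.append(key)
        return "content of " + key

    cache = StampedeProtectedCache(slow_load)
    threads = [threading.Thread(target=cache.get, args=("xyz123",)) for _ in range(20)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(len(db_calls))                       # number of database queries issued
```

The second cache check inside the lock is what lets the waiting threads skip the database: by the time each one acquires the lock, the first thread has already stored the value.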
4. Edge Case Handling (Resilient TTL Cleanup):

  • Expired pastes are not deleted synchronously during user requests, since doing so would add latency to the request path.
  • A background worker (cron service) asynchronously scans the NoSQL database for pastes whose TTL has elapsed, purges their metadata, and marks the corresponding Object Storage blobs for lifecycle deletion.
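
One pass of such a cleanup worker might look like the sketch below, with dictionaries standing in for the metadata database and object storage. The function name `purge_expired`, the record fields `blob_key` and `expires_at`, and the choice to remove the blob before its metadata are assumptions; removing the blob first means an interrupted run leaves metadata behind for the next pass to retry rather than an orphaned blob with nothing pointing to it.

```python
def purge_expired(metadata_db, object_storage, now):
    """Delete every paste whose expiration time has passed; return the purged hashes."""
    # Collect first so the dictionary is not modified while iterating over it.
    expired = [
        url_hash
        for url_hash, meta in metadata_db.items()
        if meta["expires_at"] <= now
    ]
    for url_hash in expired:
        blob_key = metadata_db[url_hash]["blob_key"]
        object_storage.pop(blob_key, None)  # idempotent: safe if already removed
        del metadata_db[url_hash]           # metadata goes last (assumed ordering)
    return expired
```

A scheduler would call this function periodically. Because the worker runs on a delay, the read path still needs to treat a paste as missing once its expiration time has passed, even if the worker has not yet removed it.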