1. Requirements

Functional Requirements

  • Users can:
    • Create paste (text content)
    • Get unique URL for paste
    • Retrieve paste via URL
  • Optional:
    • Expiration time (TTL)
    • Public/private pastes
    • Edit/delete pastes

Non-Functional Requirements

  • Read-heavy system
  • Latency: < 100ms for retrieval
  • Scalability: Billions of pastes
  • Availability: High (reads should not fail)
  • Durability: Pastes must not be lost


2. Estimations

Traffic

  • Writes: ~10K/sec
  • Reads: ~500K–1M/sec

Storage

Assume:

  • Avg paste size = 5KB
  • 1B pastes × 5KB → ~5TB of raw content (before replication)

👉 Requires distributed storage
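
The figures above can be sanity-checked with a few lines of arithmetic. The sketch below uses decimal units and the lower bound of the read estimate. The derived quantities (ingest bandwidth, read egress before caching, time to reach 1B pastes) are not part of the original estimates and are shown only as consequences of them.

```python
# Back-of-envelope check of the estimates above (decimal units: 1 KB = 1,000 bytes).
AVG_PASTE_BYTES = 5_000          # 5 KB average paste, from the assumption above
TOTAL_PASTES = 1_000_000_000     # 1B pastes
WRITES_PER_SEC = 10_000          # ~10K writes/sec
READS_PER_SEC = 500_000          # lower bound of the read estimate

raw_storage_tb = AVG_PASTE_BYTES * TOTAL_PASTES / 1e12
write_ingest_mb_s = AVG_PASTE_BYTES * WRITES_PER_SEC / 1e6
read_egress_gb_s = AVG_PASTE_BYTES * READS_PER_SEC / 1e9   # before any CDN/cache hits
read_write_ratio = READS_PER_SEC / WRITES_PER_SEC
# Days needed to accumulate 1B pastes if writes were sustained at the stated rate.
days_to_1b = TOTAL_PASTES / WRITES_PER_SEC / 86_400

print(f"raw storage:      {raw_storage_tb:.1f} TB")
print(f"write ingest:     {write_ingest_mb_s:.0f} MB/s")
print(f"read egress:      {read_egress_gb_s:.1f} GB/s")
print(f"read/write ratio: {read_write_ratio:.0f}:1")
print(f"days to 1B:       {days_to_1b:.2f}")
```

If 10K writes/sec were sustained, 1B pastes would accumulate in a little over a day, so the ~5TB figure is better read as a unit of growth than as a lifetime total at that write rate.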

Read/Write Pattern

  • Read-heavy → heavy caching needed

3. API Design

3.1 Create Paste

POST /paste

Request:

{ "content": "text...", "expiry": "optional", "visibility": "public/private" }

Response:

{ "url": "https://pastebin.com/abc123" }

3.2 Get Paste

GET /{paste_id}

3.3 Delete Paste

DELETE /{paste_id}
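
As a rough illustration of the create endpoint's contract, the sketch below models the request body and the URL returned in the response. The names `CreatePasteRequest`, `validate`, `paste_url`, the size cap, and the choice to express expiry in seconds are assumptions made for the example; the design above does not fix them.

```python
from dataclasses import dataclass
from typing import Optional

MAX_CONTENT_BYTES = 1_000_000          # assumed size cap, not specified in the design
BASE_URL = "https://pastebin.com"      # host taken from the example response above


@dataclass(frozen=True)
class CreatePasteRequest:
    content: str
    expiry_seconds: Optional[int] = None   # assumption: TTL in seconds; None = never expires
    visibility: str = "public"             # "public" or "private"


def validate(req: CreatePasteRequest) -> list[str]:
    """Return a list of validation errors; an empty list means the request is valid."""
    errors = []
    if not req.content:
        errors.append("content must not be empty")
    elif len(req.content.encode("utf-8")) > MAX_CONTENT_BYTES:
        errors.append("content too large")
    if req.expiry_seconds is not None and req.expiry_seconds <= 0:
        errors.append("expiry must be positive")
    if req.visibility not in ("public", "private"):
        errors.append("visibility must be 'public' or 'private'")
    return errors


def paste_url(paste_id: str) -> str:
    """Build the URL returned in the create response."""
    return f"{BASE_URL}/{paste_id}"
```

A server would typically answer an invalid request with 400 and the error list, and a valid one with `{"url": paste_url(paste_id)}`.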



4. Data Storage & Design

4.1 Metadata Store

Use:

  • Cassandra / DynamoDB

Schema:

  • paste_id (PK)
  • created_at
  • expiry
  • visibility
  • user_id

4.2 Content Storage (Important Separation)

Use:

  • Object storage (e.g., Amazon S3)

Key:

paste_id → content blob

👉 Reason:

  • Large content not suitable for DB
  • Cheap, scalable storage

4.3 Cache

Use:

  • Redis

Key:

paste_id → content




5. High-Level Architecture (HLD)

System consists of:

  • Entry Layer
  • Read Path
  • Write Path

5.1 Entry Layer

  • CDN + Load Balancer

Responsibilities:

  • Handle read spikes
  • Serve cached pastes
  • Protect backend

5.2 Read Path

  1. Client requests paste
  2. CDN:
    • Hit → return content
    • Miss → forward
  3. Load Balancer → App Server
  4. App Server:
    • Check Redis
    • If miss → fetch metadata + content, then populate Redis
  5. Return paste
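
The app-server part of the read path is a cache-aside lookup. A minimal sketch follows, with plain dictionaries standing in for Redis, the metadata DB, and object storage; `get_paste` and the three store names are hypothetical.

```python
from typing import Optional

cache: dict[str, str] = {}           # stands in for Redis
metadata_db: dict[str, dict] = {}    # stands in for Cassandra / DynamoDB
object_store: dict[str, str] = {}    # stands in for object storage


def get_paste(paste_id: str) -> Optional[str]:
    """Cache-aside read: try the cache, fall back to the origin stores, then fill the cache."""
    content = cache.get(paste_id)
    if content is not None:
        return content                       # cache hit: no DB or object-store access

    if paste_id not in metadata_db:          # metadata decides whether the paste exists
        return None
    content = object_store.get(paste_id)
    if content is None:
        return None

    cache[paste_id] = content                # populate the cache for later readers
    return content
```

Returning `None` here corresponds to a 404 at the HTTP layer.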

5.3 Write Path

  1. Client submits paste
  2. App server:
    • Generate paste_id
    • Store metadata in DB
    • Store content in object storage
  3. Cache metadata/content
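
The write steps above can be sketched as follows. Dictionaries stand in for the real stores, and the random ID is only a placeholder for the generator described in section 6.1; `create_paste` and the field names are assumptions.

```python
import secrets
import time
from typing import Optional

metadata_db: dict[str, dict] = {}    # stands in for Cassandra / DynamoDB
object_store: dict[str, str] = {}    # stands in for object storage
cache: dict[str, str] = {}           # stands in for Redis


def create_paste(content: str, expiry_seconds: Optional[int] = None,
                 visibility: str = "public") -> str:
    """Follow the write path: generate an ID, store metadata, store content, warm the cache."""
    paste_id = secrets.token_hex(4)          # placeholder ID, see section 6.1
    now = int(time.time())

    # Small metadata row goes to the database.
    metadata_db[paste_id] = {
        "created_at": now,
        "expiry": now + expiry_seconds if expiry_seconds is not None else None,
        "visibility": visibility,
    }
    # Content blob goes to object storage under the same key.
    # If this step fails after the metadata write, an orphan row is left behind;
    # the reconciliation job in section 7.6 is what would clean it up.
    object_store[paste_id] = content

    # Warm the cache so the first read does not miss.
    cache[paste_id] = content
    return paste_id
```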




6. Detailed Breakdown

6.1 Paste ID Generation

Approach: Distributed ID + Base62

Use Snowflake-style:

[timestamp | machine_id | sequence]

Convert to Base62 → short ID (a 64-bit ID encodes to at most 11 characters)
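
A minimal sketch of this scheme is shown below. The bit widths (10 machine bits, 12 sequence bits), the custom epoch, and the alphabet order are assumptions chosen for illustration; the design above does not specify them.

```python
import threading
import time

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
EPOCH_MS = 1_577_836_800_000   # 2020-01-01T00:00:00Z, an arbitrary custom epoch
MACHINE_BITS = 10              # assumed: up to 1024 generators
SEQUENCE_BITS = 12             # assumed: up to 4096 IDs per millisecond per generator


class SnowflakeGenerator:
    """Produces IDs laid out as [timestamp | machine_id | sequence]."""

    def __init__(self, machine_id: int):
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.machine_id = machine_id
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = int(time.time() * 1000)
            if now < self._last_ms:
                now = self._last_ms            # clock moved backwards: reuse last timestamp
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & ((1 << SEQUENCE_BITS) - 1)
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = int(time.time() * 1000)
            else:
                self._sequence = 0
            self._last_ms = now
            return (((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                    | (self.machine_id << SEQUENCE_BITS)
                    | self._sequence)


def base62_encode(n: int) -> str:
    """Encode a non-negative integer using the 62-character alphabet."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def base62_decode(s: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Typical use would be `base62_encode(SnowflakeGenerator(machine_id=1).next_id())`. Because such IDs are ordered by time, they are easier to guess than random ones, which may matter for private pastes.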

6.2 Read Optimization

  • CDN caches popular pastes
  • Redis caches hot data
  • Most reads avoid DB

6.3 Write Optimization

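  • Async writes to object storage (acknowledge the request only once the content is durable; otherwise a crash can lose an accepted paste)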
  • Metadata stored separately

6.4 Stateless Scaling

  • App servers stateless
  • Scale horizontally

6.5 Hot Key Handling

Problem:

  • Viral paste → heavy reads

Solution:

  • CDN absorbs traffic
  • Redis replication
  • Local cache on app servers
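
The local-cache layer can be as simple as a short-lived in-process map in front of Redis. The sketch below is one possible shape; `LocalTTLCache`, `get_hot`, and the injected clock are hypothetical, and a production version would also bound the number of entries.

```python
import time
from typing import Callable, Optional


class LocalTTLCache:
    """In-process cache whose entries expire after a short, fixed TTL."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, str]] = {}   # key -> (expires_at, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]        # drop stale entry
            return None
        return value

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def get_hot(paste_id: str, local: LocalTTLCache,
            shared_get: Callable[[str], Optional[str]]) -> Optional[str]:
    """Read through the local cache; only misses reach the shared cache (e.g. Redis)."""
    value = local.get(paste_id)
    if value is None:
        value = shared_get(paste_id)
        if value is not None:
            local.put(paste_id, value)
    return value
```

With a TTL of a few seconds, each app server sends at most one request per TTL window to Redis for a viral paste, at the cost of serving content that may be a few seconds stale after an edit or delete.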

6.6 Idempotent Paste Creation

Problem:

  • Client retries (network failure, timeout)
  • Same paste may be created multiple times → duplicates

Solution: Idempotency Key

Approach:

Client sends:

Idempotency-Key: <unique_key>

Flow:

  1. On paste creation request:
    • Check Redis:
idempotency:{key} → paste_id
  2. If exists:
    • Return existing paste URL (no duplicate)
  3. If not:
    • Create paste
    • Store mapping:
idempotency:{key} → paste_id (with TTL)

TTL:

  • Store key for ~24 hours

Benefits:

  • Prevents duplicate pastes
  • Safe retries
  • Ensures write correctness
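
The flow above can be sketched with an in-memory map standing in for Redis. `create_paste_idempotent` and the injected `create_fn` are hypothetical names. This single-process version does not address concurrent retries; with Redis, one option is to claim the key atomically using `SET` with the `NX` and `EX` options before creating the paste.

```python
import time
from typing import Callable, Optional

IDEMPOTENCY_TTL_SECONDS = 24 * 60 * 60            # ~24 hours, as stated above
_idempotency: dict[str, tuple[str, float]] = {}   # key -> (paste_id, expires_at)


def create_paste_idempotent(key: str, content: str,
                            create_fn: Callable[[str], str],
                            now: Optional[float] = None) -> str:
    """Return the existing paste_id for a repeated key, otherwise create a paste once."""
    now = time.time() if now is None else now

    entry = _idempotency.get(f"idempotency:{key}")
    if entry is not None and entry[1] > now:
        return entry[0]                            # retry: reuse the earlier result

    paste_id = create_fn(content)                  # first time this key is seen
    _idempotency[f"idempotency:{key}"] = (paste_id, now + IDEMPOTENCY_TTL_SECONDS)
    return paste_id
```

A retry with the same `Idempotency-Key` inside the TTL window returns the original paste; after the window a new paste is created, which is the accepted trade-off of bounding the key store.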


7. Additional Considerations

7.1 Large Paste Handling

  • Store large pastes in chunks (optional)
  • Stream content instead of loading fully
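
Streaming can be illustrated with a generator that reads a fixed-size chunk at a time, so memory use stays bounded regardless of paste size. The 64 KiB chunk size and the name `stream_chunks` are assumptions.

```python
import io
from typing import BinaryIO, Iterator

CHUNK_SIZE = 64 * 1024   # assumed chunk size (64 KiB)


def stream_chunks(source: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the source in fixed-size chunks instead of loading it fully into memory."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:          # empty bytes means end of stream
            break
        yield chunk


# Usage: iterate over a 150,000-byte paste without holding more than one chunk at a time.
if __name__ == "__main__":
    paste = io.BytesIO(b"x" * 150_000)
    for part in stream_chunks(paste):
        print(len(part))
```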

7.2 Expiration & TTL

  • Store expiry in DB
  • Object storage lifecycle rules auto-delete expired content

7.3 Cache Eviction

  • Redis uses LRU/LFU

7.4 Degraded Mode

Redis Down:

  • Fetch from DB + object storage

DB Down:

  • Serve cached pastes

Object Storage Down:

  • Serve cached data if available
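
These fallbacks amount to treating a cache failure as a miss and letting an origin failure surface only when the cache cannot answer. A small sketch, with the hypothetical names `BackendUnavailable` and `read_with_fallback`:

```python
from typing import Callable, Optional


class BackendUnavailable(Exception):
    """Raised by a store client when its backend cannot be reached."""


def read_with_fallback(paste_id: str,
                       cache_get: Callable[[str], Optional[str]],
                       origin_get: Callable[[str], Optional[str]]) -> Optional[str]:
    """Serve from cache when possible; tolerate a cache outage by going to the origin."""
    try:
        cached = cache_get(paste_id)
    except BackendUnavailable:
        cached = None                 # Redis down: behave as if it were a miss

    if cached is not None:
        return cached                 # DB / object storage down: cached pastes still served

    # Cache miss (or cache outage): the origin must answer.
    # If it is down too, BackendUnavailable propagates and maps to a 503.
    return origin_get(paste_id)
```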

7.5 Security

  • Rate limiting (prevent spam)
  • Content moderation
  • Private pastes require auth


7.6 Expiry & Automatic Cleanup (End-to-End Flow)

Goal:

Ensure expired pastes are:

  • Not served
  • Automatically deleted
  • Not a source of storage bloat

Cleanup Pipeline

Step 1: Expiry Defined at Creation

  • Each paste has:
expiry_timestamp

Step 2: Read-Time Enforcement (Immediate Consistency)

On every read:

  • App server checks expiry
  • If expired:
    • Return 404
    • Trigger async cleanup event

👉 Ensures expired data is never served
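
The read-time check can be sketched as below. A list stands in for the async cleanup queue, and `read_paste`, `is_expired`, and the `(status, content)` return shape are assumptions made for the example.

```python
import time
from typing import Callable, Optional

cleanup_queue: list[str] = []    # stands in for an async event queue


def is_expired(expiry_timestamp: Optional[float], now: float) -> bool:
    """A paste with no expiry never expires; otherwise compare against the clock."""
    return expiry_timestamp is not None and now >= expiry_timestamp


def read_paste(paste_id: str, metadata_db: dict[str, dict],
               load_content: Callable[[str], str],
               now: Optional[float] = None) -> tuple[int, Optional[str]]:
    """Return (HTTP status, content). Expired pastes look identical to missing ones."""
    now = time.time() if now is None else now
    meta = metadata_db.get(paste_id)
    if meta is None:
        return 404, None
    if is_expired(meta.get("expiry_timestamp"), now):
        cleanup_queue.append(paste_id)     # trigger async cleanup, do not block the read
        return 404, None
    return 200, load_content(paste_id)
```

Note that this check only protects reads that reach the app server; content already cached at the CDN or in Redis needs its own TTL to stop being served.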

Step 3: Storage-Level Cleanup (Primary Deletion)

Using Amazon S3:

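  • Configure lifecycle rule:
    • Auto-delete objects once they exceed a configured age (rules select objects by prefix or tag, so pastes are grouped into TTL classes)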

👉 Guarantees cleanup even if app layer fails
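
S3 lifecycle rules expire objects by age in days (or at a fixed date) and select objects by prefix or tag, so they cannot express an arbitrary per-paste timestamp directly. One way to approximate it is to offer a few TTL classes and tag each object with its class. The sketch below builds such a rule set as a plain dictionary that follows the general shape of an S3 lifecycle configuration; the tag name `ttl_days`, the class values, and both function names are hypothetical, and the exact schema should be checked against the AWS documentation before use.

```python
from typing import Optional

TTL_CLASSES_DAYS = [1, 7, 30]    # hypothetical TTL classes offered to users


def lifecycle_rules(ttl_classes_days: list[int]) -> dict:
    """Build one expiration rule per TTL class, selected by an object tag."""
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}d",
                "Filter": {"Tag": {"Key": "ttl_days", "Value": str(days)}},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
            for days in ttl_classes_days
        ]
    }


def ttl_class_for(requested_seconds: int, ttl_classes_days: list[int]) -> Optional[int]:
    """Pick the smallest class that lasts at least as long as the requested lifetime.

    Returns None when no class is long enough; such objects get no lifecycle tag
    and rely on the other cleanup layers.
    """
    for days in sorted(ttl_classes_days):
        if days * 86_400 >= requested_seconds:
            return days
    return None
```

Because the chosen class is never shorter than the requested lifetime, the object can outlive the paste's expiry for a while; the read-time check is what keeps it from being served in the meantime.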

Step 4: Metadata Cleanup (DB)

Use TTL (preferred):

  • In Cassandra:
    • Set TTL on records
    • Automatic deletion after expiry

👉 No manual jobs required

Step 5: Background Reconciliation Job (Safety Net)

  • Periodic worker:
    • Scans for orphan records
    • Deletes inconsistencies
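
At its core the reconciliation job compares the set of IDs in the metadata store with the set of keys in object storage. The sketch below shows that comparison on in-memory sets; `find_orphans` is a hypothetical name, and at billions of pastes a real job would work over partitions or inventory listings rather than loading everything at once.

```python
def find_orphans(metadata_ids: set[str], object_keys: set[str]) -> tuple[set[str], set[str]]:
    """Return (content without metadata, metadata without content)."""
    content_without_metadata = object_keys - metadata_ids   # blobs nothing points to
    metadata_without_content = metadata_ids - object_keys   # rows whose content is gone
    return content_without_metadata, metadata_without_content


# Usage: "c" exists only in object storage, "a" only in the metadata store.
if __name__ == "__main__":
    blobs_to_delete, rows_to_delete = find_orphans({"a", "b"}, {"b", "c"})
    print(blobs_to_delete, rows_to_delete)
```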

Step 6: Cache Cleanup

  • Redis uses TTL:
paste_id → content (TTL)
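
For the cache TTL to be safe, it should not exceed the paste's remaining lifetime. A small sketch of that calculation follows; the one-hour default and the name `cache_ttl_seconds` are assumptions.

```python
from typing import Optional

DEFAULT_CACHE_TTL_SECONDS = 3600    # assumed default cache lifetime


def cache_ttl_seconds(expiry_timestamp: Optional[float], now: float,
                      default_ttl: int = DEFAULT_CACHE_TTL_SECONDS) -> int:
    """TTL to use when caching a paste; 0 means the paste should not be cached."""
    if expiry_timestamp is None:
        return default_ttl                       # never-expiring paste: use the default
    remaining = int(expiry_timestamp - now)
    return max(0, min(default_ttl, remaining))   # never cache past the paste's expiry
```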

Why This Works

  • Multi-layer cleanup (read + storage + DB)
  • No single point of failure
  • Fully automated lifecycle


7.7 Rate Limiting for Paste Creation (Abuse Prevention)

Goal:

Prevent:

  • Spam paste creation
  • Resource exhaustion

Rate Limiting Flow

Step 1: Request Hits API Gateway

  • Apply rate limit using:
    • Token Bucket algorithm

Step 2: Check Limit in Redis

Key:

rate_limit:{user/ip}

Step 3: Decision

  • If within limit:
    • Forward request to app server
  • Else:
    • Reject with:
429 Too Many Requests

Step 4: Burst Handling

  • Token bucket allows short bursts up to the bucket capacity
  • The refill rate caps sustained throughput, so spikes beyond the burst size are rejected

Step 5: Adaptive Limits

  • Anonymous users → stricter limits
  • Authenticated users → relaxed limits
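
Steps 1 to 5 can be sketched with an in-memory token bucket per identity. The per-tier limits, the `check_rate_limit` function, and the dictionary standing in for Redis are assumptions; in a shared deployment the refill-and-consume step would need to run atomically on the Redis side.

```python
class TokenBucket:
    """Token bucket: `capacity` bounds burst size, `refill_rate` bounds sustained rate."""

    def __init__(self, capacity: float, refill_rate: float, now: float):
        self.capacity = capacity
        self.refill_rate = refill_rate      # tokens added per second
        self.tokens = capacity              # start full so an initial burst is allowed
        self.updated_at = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Hypothetical per-tier limits: (burst capacity, tokens per second).
LIMITS = {"anonymous": (5, 0.5), "authenticated": (20, 2.0)}
buckets: dict[str, TokenBucket] = {}        # stands in for Redis


def check_rate_limit(identity: str, tier: str, now: float) -> int:
    """Return 200 if the request may be forwarded, 429 if it should be rejected."""
    key = f"rate_limit:{identity}"          # identity is a user ID or an IP address
    bucket = buckets.get(key)
    if bucket is None:
        capacity, rate = LIMITS[tier]
        bucket = buckets[key] = TokenBucket(capacity, rate, now)
    return 200 if bucket.allow(now) else 429
```

With these example limits, an anonymous client can create 5 pastes back to back and then one more every 2 seconds.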

Why Redis?

  • In-memory → low latency
  • Atomic ops → accurate counters
  • Scales horizontally

Integration Point

  • Implemented at API Gateway layer
  • Ensures:
    • Bad traffic never reaches backend


WHY Object Storage

Object storage is chosen for content because it provides cheap, scalable storage for large blobs and supports lifecycle policies for automatic cleanup.

WHY Redis

Redis is used for caching due to its low latency and ability to handle high read throughput.

WHY Cassandra / DynamoDB

A distributed NoSQL database is used to support horizontal scaling and high write throughput without a single point of failure.




🏁 Final Summary

  • Separation of metadata & content improves scalability
  • CDN + Redis caching ensures low latency
  • Object storage handles large data efficiently
  • Stateless services ensure horizontal scaling
  • TTL + lifecycle policies prevent storage bloat