1. Requirements

Functional Requirements

Users can:
- Create paste (text content)
- Get unique URL for paste
- Retrieve paste via URL
Optional:
- Expiration time (TTL)
- Public/private pastes
- Edit/delete pastes

Non-Functional Requirements

Read-heavy system
Latency: < 100ms for retrieval
Scalability: Billions of pastes
Availability: High (reads should not fail)
Durability: Pastes must not be lost

2. Estimations

Traffic

Writes: ~10K/sec
Reads: ~500K–1M/sec

Storage

Assume:

Avg paste size = 5KB
1B pastes → ~5TB

👉 Requires distributed storage

Read/Write Pattern

Read:write ratio ≈ 50–100:1 → heavy caching needed
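
As a rough check on the figures above, the arithmetic can be sketched in a few lines. The constants are the estimates already stated in this section; decimal units (1KB = 1,000 bytes) and the omission of replication and metadata overhead are assumptions made only for this sketch.

```python
# Back-of-envelope check using the estimates stated above.
# Assumptions: decimal units (1KB = 1,000 bytes); raw content only,
# no replication factor or metadata overhead.
WRITES_PER_SEC = 10_000
READS_PER_SEC = (500_000, 1_000_000)   # low and high estimate
AVG_PASTE_BYTES = 5_000                # 5KB
TOTAL_PASTES = 1_000_000_000           # 1B
SECONDS_PER_DAY = 86_400

total_storage_tb = TOTAL_PASTES * AVG_PASTE_BYTES / 1e12
ingest_mb_per_sec = WRITES_PER_SEC * AVG_PASTE_BYTES / 1e6
ingest_tb_per_day = ingest_mb_per_sec * SECONDS_PER_DAY / 1e6
read_write_ratio = tuple(r // WRITES_PER_SEC for r in READS_PER_SEC)

print(f"Total storage for 1B pastes: {total_storage_tb:.1f} TB")
print(f"Write ingest: {ingest_mb_per_sec:.0f} MB/s, {ingest_tb_per_day:.2f} TB/day")
print(f"Read:write ratio: {read_write_ratio[0]}:1 to {read_write_ratio[1]}:1")
```

At the stated write rate, 1B pastes accumulate in a little over a day (1,000,000,000 / 10,000 = 100,000 seconds), so long-term capacity depends mostly on how aggressively pastes expire.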

3. API Design

3.1 Create Paste

POST /paste

Request:

{
  "content": "text...",
  "expiry": "optional",
  "visibility": "public/private"
}

Response:

{
  "url": "https://pastebin.com/abc123"
}

3.2 Get Paste

GET /{paste_id}

3.3 Delete Paste

DELETE /{paste_id}

4. Data Storage & Design

4.1 Metadata Store

Use:

Cassandra / DynamoDB

Schema:

paste_id (PK)
created_at
expiry
visibility
user_id
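
The schema above could be modeled in application code roughly as follows. This is a sketch only: the field types, the use of None for "never expires" and for anonymous users, and the `is_expired` helper are assumptions, not part of the schema itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class PasteMetadata:
    """One metadata row; content itself lives in object storage."""
    paste_id: str                  # primary (partition) key
    created_at: datetime
    expiry: Optional[datetime]     # assumption: None means "never expires"
    visibility: str                # "public" or "private"
    user_id: Optional[str]         # assumption: None for anonymous pastes

    def __post_init__(self) -> None:
        if self.visibility not in ("public", "private"):
            raise ValueError(f"unknown visibility: {self.visibility!r}")

    def is_expired(self, now: datetime) -> bool:
        """True once the expiry time has been reached."""
        return self.expiry is not None and now >= self.expiry


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
row = PasteMetadata("abc123", created, created + timedelta(hours=1), "public", None)
print(row.is_expired(created + timedelta(minutes=30)))
```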

4.2 Content Storage (Important Separation)

Use:

Object storage (e.g., Amazon S3)

Key:

paste_id → content blob

👉 Reason:

Large content not suitable for DB
Cheap, scalable storage

4.3 Cache

Use:

Redis

Key:

paste_id → content

5. High-Level Architecture (HLD)

System consists of:

Entry Layer
Read Path
Write Path

5.1 Entry Layer

CDN + Load Balancer

Responsibilities:

Handle read spikes
Serve cached pastes
Protect backend

5.2 Read Path

Client requests paste
CDN:
- Hit → return content
- Miss → forward
Load Balancer → App Server
App Server:
- Check Redis
- If miss → fetch metadata + content, then populate Redis
Return paste
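
The app-server part of the read path above is a cache-aside lookup, which can be sketched as below. Plain dicts stand in for Redis, the metadata DB, and object storage; the `PasteReader` class and the `backend_reads` counter are hypothetical names used only for illustration.

```python
from typing import Dict, Optional


class PasteReader:
    """Cache-aside read path; dicts stand in for Redis, the DB, and object storage."""

    def __init__(self, cache: Dict[str, str], metadata_db: Dict[str, dict],
                 object_store: Dict[str, str]) -> None:
        self.cache = cache
        self.metadata_db = metadata_db
        self.object_store = object_store
        self.backend_reads = 0  # counts lookups that went past the cache

    def get_paste(self, paste_id: str) -> Optional[str]:
        # 1. Check the cache first.
        cached = self.cache.get(paste_id)
        if cached is not None:
            return cached
        # 2. Miss: fetch metadata, then content.
        self.backend_reads += 1
        if paste_id not in self.metadata_db:
            return None
        content = self.object_store.get(paste_id)
        if content is None:
            return None
        # 3. Populate the cache so the next read is a hit.
        self.cache[paste_id] = content
        return content


reader = PasteReader({}, {"abc123": {"visibility": "public"}}, {"abc123": "hello"})
print(reader.get_paste("abc123"), reader.backend_reads)
```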

5.3 Write Path

Client submits paste
App server:
- Generate paste_id
- Store metadata in DB
- Store content in object storage
Cache metadata/content

6. Detailed Breakdown

6.1 Paste ID Generation

Approach: Distributed ID + Base62

Use Snowflake-style:

[timestamp | machine_id | sequence]

Convert to Base62 → short ID (at most 11 characters for a 64-bit ID)
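
A minimal sketch of this scheme follows. The bit widths (10 machine bits, 12 sequence bits), the custom epoch, and the alphabet ordering are assumptions borrowed from the common Snowflake layout, not values fixed by this design.

```python
import threading
import time

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
EPOCH_MS = 1_700_000_000_000   # hypothetical custom epoch (milliseconds)
MACHINE_BITS = 10              # assumption: up to 1024 generators
SEQUENCE_BITS = 12             # assumption: up to 4096 IDs per ms per machine
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


def base62_encode(n: int) -> str:
    """Encode a non-negative integer using the 62-character alphabet."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class SnowflakeGenerator:
    """Produces [timestamp | machine_id | sequence] integers, increasing per machine."""

    def __init__(self, machine_id: int) -> None:
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.machine_id = machine_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                now = self.last_ms  # clock went backwards: keep using the last timestamp
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & MAX_SEQUENCE
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return (((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                    | (self.machine_id << SEQUENCE_BITS)
                    | self.sequence)


generator = SnowflakeGenerator(machine_id=1)
print(base62_encode(generator.next_id()))
```

Because the timestamp occupies the high bits, IDs from one generator sort by creation time; uniqueness across servers relies on each server being given a distinct machine_id.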

6.2 Read Optimization

CDN caches popular pastes
Redis caches hot data
Most reads avoid DB

6.3 Write Optimization

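Async writes to object storage (report the paste as created only once the content is durably stored, to meet the durability requirement)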
Metadata stored separately

6.4 Stateless Scaling

App servers stateless
Scale horizontally

6.5 Hot Key Handling

Problem:

Viral paste → heavy reads

Solution:

CDN absorbs traffic
Redis replication (read replicas spread load for the hot key)
Local cache on app servers
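
The local cache on app servers can be as simple as a small in-process map with a short TTL, so that a viral paste is served from memory instead of hitting Redis on every request. The sketch below is one hypothetical shape for it; the `LocalTTLCache` name, the injectable clock, and the lack of a size bound are simplifications for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class LocalTTLCache:
    """In-process cache with a per-entry TTL (no size bound in this sketch)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[str]:
        entry = self.entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self.entries[key]  # drop stale entry; caller falls back to Redis
            return None
        return value

    def put(self, key: str, value: str) -> None:
        self.entries[key] = (self.clock() + self.ttl, value)


local = LocalTTLCache(ttl_seconds=5.0)
local.put("abc123", "hello")
print(local.get("abc123"))
```

A short TTL (seconds) keeps each server's copy from drifting far from Redis while still absorbing most of the reads for a hot key.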

6.6 Idempotent Paste Creation

Problem:

Client retries (network failure, timeout)
Same paste may be created multiple times → duplicates

Solution: Idempotency Key

Approach:

Client sends:

Idempotency-Key: <unique_key>

Flow:

On paste creation request:
- Check Redis:

idempotency:{key} → paste_id

If exists:
- Return existing paste URL (no duplicate)
If not:
- Create paste
- Store mapping:

idempotency:{key} → paste_id (with TTL)

TTL:

Store key for ~24 hours

Benefits:

Prevents duplicate pastes
Safe retries
Ensures write correctness
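
The flow above can be sketched as follows. An in-memory dict stands in for Redis and a lock makes the check-and-create step atomic within one process; in a real deployment the same effect would need an atomic set-if-absent in Redis (for example the SET command with its NX and EX options). The `IdempotentPasteService` class and the `p1`, `p2` style IDs are hypothetical.

```python
import itertools
import threading
import time
from typing import Callable, Dict, Tuple


class IdempotentPasteService:
    """Maps idempotency:{key} -> paste_id with a TTL so retries do not duplicate pastes."""

    def __init__(self, ttl_seconds: float = 24 * 3600,
                 clock: Callable[[], float] = time.time) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self.keys: Dict[str, Tuple[str, float]] = {}  # key -> (paste_id, expires_at)
        self.pastes: Dict[str, str] = {}              # paste_id -> content
        self.counter = itertools.count(1)
        self.lock = threading.Lock()

    def create_paste(self, idempotency_key: str, content: str) -> str:
        with self.lock:
            now = self.clock()
            entry = self.keys.get(idempotency_key)
            if entry is not None and entry[1] > now:
                return entry[0]  # retry: return the existing paste, no duplicate
            paste_id = f"p{next(self.counter)}"
            self.pastes[paste_id] = content
            self.keys[idempotency_key] = (paste_id, now + self.ttl)
            return paste_id


service = IdempotentPasteService()
first = service.create_paste("key-1", "hello")
retry = service.create_paste("key-1", "hello")
print(first == retry, len(service.pastes))
```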

7. Additional Considerations

7.1 Large Paste Handling

Store large pastes in chunks (optional)
Stream content instead of loading fully

7.2 Expiration & TTL

Store expiry in DB
Object storage lifecycle rules auto-delete expired content

7.3 Cache Eviction

Configure Redis maxmemory-policy for LRU/LFU eviction (e.g., allkeys-lru or allkeys-lfu); the default is noeviction

7.4 Degraded Mode

Redis Down:

Fetch from DB + object storage

DB Down:

Serve cached pastes

Object Storage Down:

Serve cached data if available

7.5 Security

Rate limiting (prevent spam)
Content moderation
Private pastes require auth

7.6 Expiry & Automatic Cleanup (End-to-End Flow)

Goal:

Ensure expired pastes are:

Not served
Automatically deleted
Do not cause storage bloat

Cleanup Pipeline

Step 1: Expiry Defined at Creation

Each paste has:

expiry_timestamp

Step 2: Read-Time Enforcement (Immediate Consistency)

On every read:

App server checks expiry
If expired:
- Return 404
- Trigger async cleanup event

👉 Ensures the app server never serves expired data (CDN cache lifetime must not exceed the paste's remaining TTL)
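
A minimal sketch of the read-time check, assuming metadata rows are dicts carrying `expiry_timestamp` and that a simple queue stands in for the async cleanup event channel. The `read_paste` function and its return shape are hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta, timezone
from typing import Deque, Dict, Optional, Tuple


def read_paste(paste_id: str, metadata: Dict[str, dict], contents: Dict[str, str],
               cleanup_queue: Deque[str], now: datetime) -> Tuple[int, Optional[str]]:
    """Return (status, body); expired pastes yield 404 and enqueue a cleanup event."""
    meta = metadata.get(paste_id)
    if meta is None:
        return 404, None
    expiry = meta.get("expiry_timestamp")  # assumption: None means no expiry
    if expiry is not None and now >= expiry:
        cleanup_queue.append(paste_id)     # consumed later by an async worker
        return 404, None
    body = contents.get(paste_id)
    if body is None:
        return 404, None
    return 200, body


now = datetime(2024, 1, 1, 12, tzinfo=timezone.utc)
metadata = {"old": {"expiry_timestamp": now - timedelta(seconds=1)}}
queue: Deque[str] = deque()
print(read_paste("old", metadata, {"old": "stale"}, queue, now), list(queue))
```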

Step 3: Storage-Level Cleanup (Primary Deletion)

Using Amazon S3:

Configure lifecycle rules:
- Auto-delete objects by age (days since creation), scoped by prefix or tag per expiry class

👉 Guarantees cleanup even if app layer fails
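
S3 lifecycle rules expire objects by age in whole days and are scoped by prefix or tag, not by a per-object timestamp. One way to fit per-paste expiry into that model is to write each paste under a prefix for its TTL class and define one rule per prefix. The sketch below only builds the configuration dict; the prefix names and TTL classes are hypothetical, and expiries shorter than a day would still rely on the read-time check.

```python
from typing import Dict, List


def build_lifecycle_configuration(ttl_classes_days: Dict[str, int]) -> Dict[str, List[dict]]:
    """Build one expiration rule per key prefix (prefix -> age in days)."""
    rules = []
    for prefix, days in sorted(ttl_classes_days.items()):
        if days < 1:
            raise ValueError("lifecycle expiration is expressed in whole days (>= 1)")
        rules.append({
            "ID": f"expire-{prefix.strip('/')}",
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            "Expiration": {"Days": days},
        })
    return {"Rules": rules}


# Hypothetical TTL classes: pastes are stored under the prefix matching their expiry.
config = build_lifecycle_configuration({"ttl-1d/": 1, "ttl-7d/": 7, "ttl-30d/": 30})
print(config)

# With boto3, a dict of this shape could be applied roughly as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="<bucket-name>", LifecycleConfiguration=config)
```

Lifecycle deletion runs asynchronously, so objects may linger for some time after they become eligible; this layer bounds storage growth rather than enforcing exact expiry times.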

Step 4: Metadata Cleanup (DB)

Use TTL (preferred):

In Cassandra:
- Set TTL on records
- Automatic deletion after expiry

👉 No manual jobs required

Step 5: Background Reconciliation Job (Safety Net)

Periodic worker:
- Scans for orphans (metadata without content, or content without metadata)
- Deletes the orphaned records or objects

Step 6: Cache Cleanup

Redis uses TTL:

paste_id → content (TTL)

Why This Works

Multi-layer cleanup (read + storage + DB)
No single point of failure
Fully automated lifecycle

7.7 Rate Limiting for Paste Creation (Abuse Prevention)

Goal:

Prevent:

Spam paste creation
Resource exhaustion

Rate Limiting Flow

Step 1: Request Hits API Gateway

Apply rate limit using:
- Token Bucket algorithm

Step 2: Check Limit in Redis

Key:

rate_limit:{user/ip}

Step 3: Decision

If within limit:
- Forward request to app server
Else:
- Reject with:

429 Too Many Requests

Step 4: Burst Handling

Token bucket allows short bursts up to the bucket capacity
Refill rate caps sustained throughput
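
The token bucket in Steps 1–4 can be sketched as below. Bucket state is kept in a local dict keyed like `rate_limit:{user/ip}`; in the design above that state would live in Redis and be updated atomically. The class names, the capacity of 3, and the refill rate of 1 token/sec are illustrative assumptions.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Holds up to `capacity` tokens, refilled continuously at `refill_per_sec`."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float]) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = capacity          # start full so an initial burst is allowed
        self.updated_at = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class RateLimiter:
    """One bucket per identity; a False result maps to 429 Too Many Requests."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def is_allowed(self, identity: str) -> bool:
        key = f"rate_limit:{identity}"
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_per_sec, self.clock)
            self.buckets[key] = bucket
        return bucket.allow()


limiter = RateLimiter(capacity=3, refill_per_sec=1.0)
print([limiter.is_allowed("203.0.113.7") for _ in range(4)])
```

Stricter limits for anonymous users could be expressed by constructing their buckets with a smaller capacity or refill rate.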

Step 5: Tiered Limits

Anonymous users → stricter limits
Authenticated users → relaxed limits

Why Redis?

In-memory → low latency
Atomic ops → accurate counters
Scales horizontally

Integration Point

Implemented at API Gateway layer
Ensures:
- Bad traffic never reaches backend

Why Object Storage

Object storage is chosen for content because it provides cheap, scalable storage for large blobs and supports lifecycle policies for automatic cleanup.

Why Redis

Redis is used for caching due to its low latency and ability to handle high read throughput.

Why Cassandra / DynamoDB

A distributed NoSQL database is used to support horizontal scaling and high write throughput without a single point of failure.

🏁 Final Summary

Separation of metadata & content improves scalability
CDN + Redis caching ensures low latency
Object storage handles large data efficiently
Stateless services ensure horizontal scaling
TTL + lifecycle policies prevent storage bloat