1. Requirements

Functional Requirements

Users can:
- Create paste (text content)
- Get unique URL for paste
- Retrieve paste via URL
Optional:
- Expiration time (TTL)
- Public/private pastes
- Edit/delete pastes

Non-Functional Requirements

Read-heavy system
Latency: < 100ms for retrieval
Scalability: Billions of pastes
Availability: High (reads should not fail)
Durability: Pastes must not be lost

2. Estimations

Traffic

Writes: ~10K/sec
Reads: ~500K–1M/sec

Storage

Assume:

Avg paste size = 5KB
1B pastes → ~5TB

👉 Requires distributed storage
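
A quick back-of-envelope check of these figures, as a sketch. The 5KB average and ~10K writes/sec come from this section; decimal units (1KB = 1000 bytes) and a single stored copy with no replication are assumptions.

```python
# Back-of-envelope storage estimate (decimal units, single copy, no replication).
AVG_PASTE_BYTES = 5_000      # 5KB average paste size
WRITES_PER_SEC = 10_000      # ~10K writes/sec
SECONDS_PER_DAY = 86_400


def storage_tb(paste_count: int, avg_bytes: int = AVG_PASTE_BYTES) -> float:
    """Raw content size in terabytes for a given number of pastes."""
    return paste_count * avg_bytes / 1e12


pastes_per_day = WRITES_PER_SEC * SECONDS_PER_DAY

print(storage_tb(1_000_000_000))    # size of 1B pastes, in TB
print(pastes_per_day)               # new pastes per day at the stated write rate
print(storage_tb(pastes_per_day))   # new content per day, in TB
```

At a sustained 10K writes/sec the 1B-paste mark is reached in a little over a day, so retention and expiry policy drive the real storage footprint.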

Read/Write Pattern

Read-heavy (~50–100 reads per write at the rates above) → heavy caching needed

3. API Design

3.1 Create Paste

POST /paste

Request:

{
"content": "text...",
"expiry": "optional",
"visibility": "public/private"
}

Response:

{
"url": "https://pastebin.com/abc123"
}

3.2 Get Paste

GET /{paste_id}

3.3 Delete Paste

DELETE /{paste_id}

4. Data Storage & Design

4.1 Metadata Store

Use:

Cassandra / DynamoDB

Schema:

paste_id (PK)
created_at
expiry
visibility
user_id

4.2 Content Storage (Important Separation)

Use:

Object storage (e.g., Amazon S3)

Key:

paste_id → content blob

👉 Reason:

Large content not suitable for DB
Cheap, scalable storage

4.3 Cache

Use:

Redis

Key:

paste_id → content

5. High-Level Architecture (HLD)

System consists of:

Entry Layer
Read Path
Write Path

5.1 Entry Layer

CDN + Load Balancer

Responsibilities:

Handle read spikes
Serve cached public pastes (private pastes must bypass the CDN cache)
Protect backend

5.2 Read Path

Client requests paste
CDN:
- Hit → return content
- Miss → forward
Load Balancer → App Server
App Server:
- Check Redis
- If miss → fetch metadata + content
Return paste
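
The app-server part of the read path can be sketched as a cache-aside lookup. This is a minimal illustration, not a definitive implementation: plain dicts stand in for Redis, the metadata DB, and object storage, and the PasteReader class and its origin_reads counter are hypothetical names.

```python
from typing import Dict, Optional


class PasteReader:
    """Cache-aside read: check the cache first, then metadata DB + object storage."""

    def __init__(self, cache: Dict[str, str], metadata_db: Dict[str, dict],
                 object_store: Dict[str, str]) -> None:
        self.cache = cache                # stand-in for Redis
        self.metadata_db = metadata_db    # stand-in for Cassandra / DynamoDB
        self.object_store = object_store  # stand-in for object storage
        self.origin_reads = 0             # how many reads went past the cache

    def get(self, paste_id: str) -> Optional[str]:
        cached = self.cache.get(paste_id)
        if cached is not None:
            return cached                 # cache hit: no DB or storage access
        if paste_id not in self.metadata_db:
            return None                   # unknown paste
        content = self.object_store.get(paste_id)
        if content is None:
            return None                   # metadata exists but content is missing
        self.origin_reads += 1
        self.cache[paste_id] = content    # populate the cache for later reads
        return content
```

A second read of the same paste_id is served from the cache and leaves origin_reads unchanged.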

5.3 Write Path

Client submits paste
App server:
- Generate paste_id
- Store metadata in DB
- Store content in object storage
Cache metadata/content

6. Detailed Breakdown

6.1 Paste ID Generation

Approach: Distributed ID + Base62

Use Snowflake-style:

[timestamp | machine_id | sequence]

Convert to Base62 → short ID
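
A minimal sketch of this scheme. The bit widths (10 machine bits, 12 sequence bits), the custom epoch, the alphabet order, and the IdGenerator name are all assumptions chosen for illustration; the notes do not fix them.

```python
import threading
import time

BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
EPOCH_MS = 1_700_000_000_000   # assumed custom epoch, in Unix milliseconds
MACHINE_BITS = 10              # assumed: up to 1024 machines
SEQ_BITS = 12                  # assumed: up to 4096 IDs per machine per ms


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return BASE62[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(BASE62[rem])
    return "".join(reversed(digits))


class IdGenerator:
    """Snowflake-style IDs laid out as [timestamp | machine_id | sequence]."""

    def __init__(self, machine_id: int) -> None:
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.machine_id = machine_id
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                now = self.last_ms            # clock moved backwards: hold the last timestamp
            if now == self.last_ms:
                self.seq = (self.seq + 1) & ((1 << SEQ_BITS) - 1)
                if self.seq == 0:             # sequence exhausted: wait for the next millisecond
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.seq = 0
            self.last_ms = now
            return (((now - EPOCH_MS) << (MACHINE_BITS + SEQ_BITS))
                    | (self.machine_id << SEQ_BITS)
                    | self.seq)

    def next_paste_id(self) -> str:
        return base62_encode(self.next_id())
```

A 63-bit value encodes to at most 11 Base62 characters. IDs built this way are roughly time-ordered and therefore guessable, so they should not be the only protection for private pastes.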

6.2 Read Optimization

CDN caches popular pastes
Redis caches hot data
Most reads avoid DB

6.3 Write Optimization

Async writes to object storage (safe only if the paste is durably queued before the client is acknowledged)
Metadata stored separately

6.4 Stateless Scaling

App servers stateless
Scale horizontally

6.5 Hot Key Handling

Problem:

Viral paste → heavy reads

Solution:

CDN absorbs traffic
Redis replication
Local cache on app servers
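
The local cache on each app server can be as small as a short-TTL map in process memory. A sketch under stated assumptions: the LocalTTLCache name is hypothetical, the injectable clock exists only to make the behavior testable, and there is no size bound, which a real cache would need.

```python
import time
from typing import Callable, Optional


class LocalTTLCache:
    """Tiny in-process cache; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}   # key -> (expires_at, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]   # expired: drop it and report a miss
            return None
        return value

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self.clock() + self.ttl_seconds, value)
```

A TTL of a few seconds keeps a viral paste off Redis almost entirely while bounding how long a deleted or edited paste can still be served from a server's memory.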

6.6 Idempotent Paste Creation

Problem:

Client retries (network failure, timeout)
Same paste may be created multiple times → duplicates

Solution: Idempotency Key

Approach:

Client sends:

Idempotency-Key: <unique_key>

Flow:

On paste creation request:
- Check Redis:

idempotency:{key} → paste_id

If exists:
- Return existing paste URL (no duplicate)
If not:
- Create paste
- Store mapping:

idempotency:{key} → paste_id (with TTL)

TTL:

Store key for ~24 hours

Benefits:

Prevents duplicate pastes
Safe retries
Ensures write correctness
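
A sketch of the flow above with one change, named plainly: instead of check-then-store, the key is reserved with an atomic set-if-absent before the paste is created, so two concurrent retries cannot both pass the check. With Redis this would typically be a single SET with the NX and EX options. The IdempotencyStore class is an in-memory stand-in, and create_paste_idempotent, new_paste_id, and persist are hypothetical names.

```python
import threading
import time
from typing import Callable, Dict, Optional, Tuple

IDEMPOTENCY_TTL_SECONDS = 24 * 60 * 60   # keep keys for ~24 hours


class IdempotencyStore:
    """In-memory stand-in for Redis; set_if_absent mimics SET key value NX EX ttl."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._lock = threading.Lock()
        self._data: Dict[str, Tuple[float, str]] = {}
        self._clock = clock

    def set_if_absent(self, key: str, value: str, ttl_seconds: float) -> Optional[str]:
        """Store value if key is absent or expired. Return None if stored, else the existing value."""
        with self._lock:
            now = self._clock()
            entry = self._data.get(key)
            if entry is not None and entry[0] > now:
                return entry[1]
            self._data[key] = (now + ttl_seconds, value)
            return None

    def delete(self, key: str) -> None:
        with self._lock:
            self._data.pop(key, None)


def create_paste_idempotent(store: IdempotencyStore, idempotency_key: str,
                            new_paste_id: Callable[[], str],
                            persist: Callable[[str], None]) -> str:
    """Return the paste_id for this key, creating the paste at most once."""
    redis_key = f"idempotency:{idempotency_key}"
    paste_id = new_paste_id()
    existing = store.set_if_absent(redis_key, paste_id, IDEMPOTENCY_TTL_SECONDS)
    if existing is not None:
        return existing              # a retry: reuse the original paste_id
    try:
        persist(paste_id)            # write metadata + content
    except Exception:
        store.delete(redis_key)      # release the key so a later retry can succeed
        raise
    return paste_id
```

One remaining gap in this sketch: a retry that arrives while persist is still running gets the paste_id back before the paste is readable.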

7. Additional Considerations

7.1 Large Paste Handling

Store large pastes in chunks (optional)
Stream content instead of loading fully

7.2 Expiration & TTL

Store expiry in DB
Object storage lifecycle rules auto-delete

7.3 Cache Eviction

Redis evicts with LRU or LFU (set via maxmemory-policy, e.g. allkeys-lru or allkeys-lfu)

7.4 Degraded Mode

Redis Down:

Fetch from DB + object storage

DB Down:

Serve cached pastes

Object Storage Down:

Serve cached data if available
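
These fallbacks can be sketched as a read that treats a cache failure as a miss and only fails when the origin is down and nothing is cached. BackendUnavailable, read_with_fallback, cache_get, and origin_get are hypothetical names; real client libraries raise their own exception types.

```python
from typing import Callable, Optional


class BackendUnavailable(Exception):
    """Raised by a backend client when its store cannot be reached."""


def read_with_fallback(paste_id: str,
                       cache_get: Callable[[str], Optional[str]],
                       origin_get: Callable[[str], Optional[str]]) -> Optional[str]:
    """Read through the cache, degrading gracefully when one backend is down."""
    try:
        cached = cache_get(paste_id)
    except BackendUnavailable:
        cached = None                # Redis down: behave as a cache miss
    if cached is not None:
        return cached                # served even if DB / object storage are down
    # Cache miss: origin_get reads DB + object storage and may raise
    # BackendUnavailable, which the caller can map to a 503 response.
    return origin_get(paste_id)
```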

7.5 Security

Rate limiting (prevent spam)
Content moderation
Private pastes require auth

7.6 Expiry & Automatic Cleanup

1. Expiry at Creation

Each paste has optional expiry_timestamp

2. Object Storage Lifecycle Rules

Using Amazon S3:

Configure lifecycle policy:
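- Auto-delete objects a fixed number of days after creation (rules match by prefix or tag, not by a per-object timestamp, so group pastes by TTL class)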
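
One way to map per-paste expiry onto lifecycle rules is a small set of TTL classes, each with its own key prefix and rule. The sketch below only builds the configuration as a dict, in the shape S3 lifecycle configurations use (Rules, Filter, Status, Expiration); the prefixes, the class sizes, and both function names are assumptions.

```python
from typing import Dict

# Assumed TTL classes: key prefix -> days until the object expires.
TTL_CLASSES_DAYS: Dict[str, int] = {"ttl-1d/": 1, "ttl-7d/": 7, "ttl-30d/": 30}


def build_lifecycle_rules(ttl_classes: Dict[str, int]) -> dict:
    """Build one expiration rule per TTL class, shortest TTL first."""
    rules = [
        {
            "ID": f"expire-{prefix.rstrip('/')}",
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            "Expiration": {"Days": days},
        }
        for prefix, days in sorted(ttl_classes.items(), key=lambda kv: kv[1])
    ]
    return {"Rules": rules}


def prefix_for_ttl(ttl_days: int, ttl_classes: Dict[str, int]) -> str:
    """Pick the shortest TTL class that still covers the requested TTL."""
    for prefix, days in sorted(ttl_classes.items(), key=lambda kv: kv[1]):
        if days >= ttl_days:
            return prefix
    raise ValueError("requested TTL exceeds the longest TTL class")
```

Because a paste is rounded up to the next class, its object can outlive its expiry; the expiry check on read is what hides it in the meantime.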

3. Metadata Cleanup (DB)

Two approaches:

Option A: TTL-Based Expiry (Preferred)

Use DB TTL (e.g., Cassandra TTL or DynamoDB TTL)
Records auto-expire without manual cleanup

Option B: Background Cleanup Job

Periodic worker scans expired records
Deletes:
- Metadata from DB
- Content from storage

4. Lazy Deletion (Safety Layer)

On read:
- If expired → return 404
- Trigger async cleanup
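
The read-time check can be sketched as below. The function name and the enqueue_cleanup callback are hypothetical; the callback stands in for whatever queue or worker performs the actual deletion, and a missing expiry_timestamp is assumed to mean the paste never expires.

```python
from typing import Callable, Dict, Optional, Tuple


def read_with_lazy_expiry(paste_id: str, metadata_db: Dict[str, dict], now: float,
                          enqueue_cleanup: Callable[[str], None]
                          ) -> Tuple[int, Optional[dict]]:
    """Return (HTTP status, metadata); expired pastes look deleted to the reader."""
    meta = metadata_db.get(paste_id)
    if meta is None:
        return 404, None
    expiry = meta.get("expiry_timestamp")   # None: the paste never expires
    if expiry is not None and now >= expiry:
        enqueue_cleanup(paste_id)           # deletion happens asynchronously
        return 404, None
    return 200, meta
```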

5. Cache Cleanup

Redis key TTLs expire cached pastes automatically (set each TTL no later than the paste's expiry)

👉 Ensures:

No storage bloat
Fully automated cleanup

7.7 Rate Limiting & Abuse Prevention

Problem:

Malicious users can:
- Spam paste creation
- Exhaust storage/resources

Solution: Multi-Layer Rate Limiting

1. API Gateway Rate Limiting

Use:

Token Bucket / Sliding Window

Apply limits:

Per IP: X requests/min
Per User: Y requests/min

2. Distributed Rate Limiter

Use:

Redis

Key:

rate_limit:{user/ip}

3. Adaptive Limits

New users → stricter limits
Trusted users → relaxed limits

4. Burst Handling

Token bucket allows short bursts up to the bucket capacity
Refill rate caps sustained throughput, limiting abuse spikes
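
A minimal single-process token bucket, to make the burst behavior concrete. It is a sketch only: the distributed limiter described above would keep this state in Redis under rate_limit:{user/ip}, and the class name, the status_for helper, and the injectable clock are assumptions.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, then `refill_per_sec` requests per second."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock
        self._tokens = capacity      # start full so an initial burst is allowed
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_sec)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


def status_for(bucket: TokenBucket) -> int:
    """HTTP status for a paste-creation attempt against this bucket."""
    return 201 if bucket.allow() else 429
```

With capacity 3 and a refill of 1 token/sec, a client can send 3 requests at once, is rejected on the 4th, and regains 2 requests after waiting 2 seconds.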

5. Abuse Detection

Detect:
- High frequency paste creation
- Duplicate content spam

6. Response

Reject with:

429 Too Many Requests

👉 Protects system from:

Spam
Resource exhaustion

🏁 Final Summary

Separation of metadata & content improves scalability
CDN + Redis caching ensures low latency
Object storage handles large data efficiently
Stateless services ensure horizontal scaling
TTL + lifecycle policies prevent storage bloat