Requirements
Functional Requirements:
- Users can create a paste containing plain text or code.
- Each paste has:
- A unique short URL (e.g., paste.io/abc123)
- Optional expiration time (e.g., 10 mins, 1 hour, never)
- Optional visibility: public, unlisted, private
- Users can retrieve a paste via its short URL.
- Optional: Syntax highlighting for code pastes.
Non-Functional Requirements:
- High Availability (99.9% uptime)
- Low Latency for paste retrieval (<100ms P95)
- Scalability to support millions of pastes/day
- Data Expiry support (auto-deletion via TTL)
- Abuse Protection (spam, link flooding)
- Security (unlisted/private visibility, DoS protection)
API Design
- POST /paste
- Create a new paste.
Request
{
"content": "print('Hello')",
"expires_in": 3600,
"syntax": "python",
"visibility": "unlisted"
}
Response
{
"paste_id": "abc123",
"url": "https://paste.io/abc123",
"expires_at": "2025-11-30T10:00:00Z"
}
- GET /paste/{paste_id}
- Retrieve a paste by ID.
Response
{
"content": "print('Hello')",
"syntax": "python",
"created_at": "2025-11-30T09:00:00Z",
"expires_at": "2025-11-30T10:00:00Z"
}
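As an illustration of how the create response relates to the request, here is a minimal Python sketch. It assumes `expires_in` is given in seconds (consistent with the 3600 value and the one-hour gap in the examples above) and that a paste that never expires is reported with a null `expires_at`. The helper name `build_create_response` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

BASE_URL = "https://paste.io"  # host taken from the example response


def build_create_response(paste_id: str, expires_in: Optional[int], now: datetime) -> dict:
    """Build the POST /paste response body.

    expires_in is assumed to be in seconds; None is assumed to mean "never".
    """
    expires_at = None
    if expires_in is not None:
        expires_at = (now + timedelta(seconds=expires_in)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "paste_id": paste_id,
        "url": f"{BASE_URL}/{paste_id}",
        "expires_at": expires_at,
    }


created = datetime(2025, 11, 30, 9, 0, 0, tzinfo=timezone.utc)
response = build_create_response("abc123", 3600, created)
```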
High-Level Architecture
[Client]
↓
[API Gateway / Load Balancer]
↓
[Paste Service (Create / Retrieve)]
↓ ↘
[Redis Cache] [Database]
↑ ↓
[TTL Cleaner Job / Expiry Processor]
Data Model
Table: pastes
- paste_id (PK, string)
- content (text)
- created_at (timestamp)
- expires_at (timestamp)
- syntax (string)
- visibility (enum: public, unlisted, private)
- owner_id (nullable, for logged-in users)
Indexes:
- paste_id (for lookup)
- expires_at (for TTL cleanup)
- owner_id (for user-based queries)
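The table and indexes above could be sketched as follows. SQLite (via Python's standard library) is used here only so the example runs anywhere; the design itself names PostgreSQL or DynamoDB. The index names and the choice to store timestamps as ISO-8601 text are assumptions. The primary key already provides the lookup index on paste_id, so no separate index is created for it.

```python
import sqlite3

SCHEMA = """
CREATE TABLE pastes (
    paste_id   TEXT PRIMARY KEY,          -- PK doubles as the lookup index
    content    TEXT NOT NULL,
    created_at TEXT NOT NULL,             -- ISO-8601 UTC timestamp (assumed encoding)
    expires_at TEXT,                      -- NULL means the paste never expires
    syntax     TEXT,
    visibility TEXT NOT NULL
               CHECK (visibility IN ('public', 'unlisted', 'private')),
    owner_id   TEXT                       -- NULL for anonymous pastes
);
CREATE INDEX idx_pastes_expires_at ON pastes (expires_at);  -- for TTL cleanup
CREATE INDEX idx_pastes_owner_id ON pastes (owner_id);      -- for user-based queries
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO pastes (paste_id, content, created_at, expires_at, syntax, visibility)"
    " VALUES (?, ?, ?, ?, ?, ?)",
    ("abc123", "print('Hello')", "2025-11-30T09:00:00Z",
     "2025-11-30T10:00:00Z", "python", "unlisted"),
)
row = conn.execute(
    "SELECT content, visibility FROM pastes WHERE paste_id = ?", ("abc123",)
).fetchone()
```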
Detailed Component Design
Paste ID Generation
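- Use NanoID or a random Base62 string to generate short, unique IDs (e.g., 6–8 characters). A full Base62-encoded UUID is about 22 characters, so it would need truncating to reach this length.
- Check for collisions on insert and regenerate if the ID is already taken.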
- Ensures non-guessable, non-sequential URLs.
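One way to produce such IDs is to draw characters from a Base62 alphabet with a cryptographically secure random source. The sketch below uses Python's `secrets` module rather than NanoID itself. The function names, the default length of 8, and the retry limit are illustrative assumptions, and `exists` stands in for whatever uniqueness check the database offers.

```python
import secrets
import string

BASE62_ALPHABET = string.digits + string.ascii_letters  # 0-9, a-z, A-Z: 62 symbols


def generate_paste_id(length: int = 8) -> str:
    """Return a random, non-sequential Base62 string of the given length."""
    return "".join(secrets.choice(BASE62_ALPHABET) for _ in range(length))


def generate_unique_paste_id(exists, length: int = 8, max_attempts: int = 5) -> str:
    """Retry until an unused ID is found; `exists` is a caller-supplied lookup."""
    for _ in range(max_attempts):
        candidate = generate_paste_id(length)
        if not exists(candidate):
            return candidate
    raise RuntimeError("could not find a free paste ID")


taken = set()
paste_id = generate_unique_paste_id(taken.__contains__)
taken.add(paste_id)
```

With 8 characters there are 62^8 (about 2.2 × 10^14) possible IDs, so collisions are rare, but the check is kept because random IDs do not guarantee uniqueness on their own.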
Paste Creation Flow
- User submits POST /paste
- Service generates paste_id, normalizes TTL
- Store paste in database (PostgreSQL or DynamoDB)
- Add entry to Redis (TTL = expires_in)
- Return short URL
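The creation steps can be sketched end to end with in-memory stand-ins. Here `db` and `cache` are plain dictionaries standing in for the database and Redis, the TTL bounds (10 minutes to 30 days) and the default visibility of "public" are assumptions, and timestamps are epoch seconds for brevity rather than the ISO strings used by the API.

```python
import secrets
import string
import time
from typing import Optional

ALPHABET = string.digits + string.ascii_letters
MIN_TTL = 600            # assumed floor: 10 minutes
MAX_TTL = 30 * 86_400    # assumed ceiling: 30 days

db = {}     # stand-in for the pastes table: paste_id -> record
cache = {}  # stand-in for Redis: paste_id -> (record, expiry time or None)


def normalize_ttl(expires_in: Optional[int]) -> Optional[int]:
    """Clamp a requested TTL (seconds) to [MIN_TTL, MAX_TTL]; None means never."""
    if expires_in is None:
        return None
    return max(MIN_TTL, min(int(expires_in), MAX_TTL))


def new_id(length: int = 8) -> str:
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def create_paste(content, expires_in=None, syntax=None, visibility="public", now=None):
    now = time.time() if now is None else now
    ttl = normalize_ttl(expires_in)
    paste_id = new_id()
    while paste_id in db:            # regenerate on the rare collision
        paste_id = new_id()
    expires_at = None if ttl is None else now + ttl
    record = {"content": content, "syntax": syntax, "visibility": visibility,
              "created_at": now, "expires_at": expires_at}
    db[paste_id] = record                     # durable write first
    cache[paste_id] = (record, expires_at)    # then warm the cache with the same expiry
    return {"paste_id": paste_id, "url": f"https://paste.io/{paste_id}",
            "expires_at": expires_at}


result = create_paste("print('Hello')", expires_in=3600, syntax="python",
                      visibility="unlisted", now=1000.0)
```

Writing to the database before the cache means a failed cache write only costs a later cache miss, not a lost paste.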
Paste Retrieval Flow
- Client hits GET /paste/{paste_id}
- Check Redis cache
- If miss → read from DB → populate Redis
- Return paste content
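The retrieval steps follow the cache-aside pattern. A minimal sketch, again with dictionaries standing in for Redis and the database: the one-hour cap on cache entries (`CACHE_TTL`) and the rule that an expired but not yet deleted row is treated as missing are assumptions, not part of the design above.

```python
import time
from typing import Optional

CACHE_TTL = 3600  # assumed cap (seconds) on how long any entry stays cached

db = {"abc123": {"content": "print('Hello')", "syntax": "python", "expires_at": None}}
cache = {}  # stand-in for Redis: paste_id -> (record, cache deadline)
stats = {"hits": 0, "misses": 0}


def get_paste(paste_id: str, now: Optional[float] = None) -> Optional[dict]:
    now = time.time() if now is None else now
    entry = cache.get(paste_id)
    if entry is not None:
        record, deadline = entry
        if now < deadline:
            stats["hits"] += 1
            return record
        del cache[paste_id]  # emulate Redis dropping a key whose TTL elapsed
    stats["misses"] += 1
    record = db.get(paste_id)
    if record is None:
        return None
    expires_at = record["expires_at"]
    if expires_at is not None and expires_at <= now:
        return None  # expired but not yet cleaned up: do not serve or cache it
    deadline = now + CACHE_TTL if expires_at is None else min(expires_at, now + CACHE_TTL)
    cache[paste_id] = (record, deadline)  # populate the cache on a miss
    return record


first = get_paste("abc123", now=0.0)   # miss: read from db, populate cache
second = get_paste("abc123", now=1.0)  # hit: served from cache
```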
Expiry Management
- Redis expires cached entries automatically once their TTL elapses, which covers short-lived pastes in the cache.
- Periodic background job runs to:
- Query DB for expires_at < now()
- Delete expired records
- Optional: archive expired pastes to cold storage (S3)
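The background job could look roughly like the sketch below, shown against SQLite so it runs as-is. Deleting in batches (the `batch_size` parameter) is an assumption added to keep each transaction short, and the function name `delete_expired` is hypothetical. Timestamps are assumed to be stored as fixed-format ISO-8601 UTC strings, which compare correctly as text.

```python
import sqlite3


def delete_expired(conn: sqlite3.Connection, now_iso: str, batch_size: int = 1000) -> int:
    """Delete rows with expires_at < now in batches; return the number removed."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM pastes WHERE paste_id IN ("
            " SELECT paste_id FROM pastes"
            " WHERE expires_at IS NOT NULL AND expires_at < ?"
            " LIMIT ?)",
            (now_iso, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pastes (paste_id TEXT PRIMARY KEY, content TEXT, expires_at TEXT)")
conn.execute("CREATE INDEX idx_pastes_expires_at ON pastes (expires_at)")
conn.executemany(
    "INSERT INTO pastes VALUES (?, ?, ?)",
    [
        ("old1", "a", "2025-11-30T09:00:00Z"),
        ("old2", "b", "2025-11-30T09:30:00Z"),
        ("live", "c", "2025-11-30T11:00:00Z"),
        ("forever", "d", None),  # NULL expires_at: never expires
    ],
)
deleted = delete_expired(conn, "2025-11-30T10:00:00Z", batch_size=1)
remaining = sorted(r[0] for r in conn.execute("SELECT paste_id FROM pastes"))
```

If archiving to cold storage is enabled, the copy would have to happen before the delete; that step is left out here.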
Abuse Protection
- Rate limit: max 10 pastes/min/IP (via API Gateway or Redis bucket)
- Paste size limit: 1MB max
- Visibility control: private/unlisted pastes not indexable
- Optional: add CAPTCHA or email verification for anonymous users
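The rate limit and size limit can be illustrated with a small in-memory sketch. It uses a fixed-window counter, which is one simple option and not necessarily the bucket algorithm intended above. The class and function names are hypothetical, and 1MB is read as 1,000,000 bytes, which is an assumption.

```python
import time
from collections import defaultdict
from typing import Optional

MAX_PASTE_BYTES = 1_000_000  # "1MB" read as decimal megabytes (assumption)


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client within each fixed time window."""

    def __init__(self, limit: int = 10, window_seconds: int = 60):
        self.limit = limit
        self.window_seconds = window_seconds
        self._counts = defaultdict(int)  # (client_ip, window index) -> request count

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        key = (client_ip, int(now // self.window_seconds))
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True


def within_size_limit(content: str) -> bool:
    """Measure the UTF-8 encoded size, since that is what gets stored."""
    return len(content.encode("utf-8")) <= MAX_PASTE_BYTES


limiter = FixedWindowRateLimiter(limit=10, window_seconds=60)
results = [limiter.allow("203.0.113.7", now=0.0) for _ in range(11)]
```

This sketch never purges old windows, so it is not suitable as-is for a long-running process. In a Redis-backed version, a counter key per IP and window with an expiry would give the same behavior across service instances.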
Capacity Estimation
Assumptions:
- 10M pastes/day
- Avg paste size = 1 KB
- Paste TTL = 7 days average
Storage:
- 10M × 1 KB = ~10 GB/day
- Retention = 7 days → 70 GB active data
- Replication overhead (×3) → ~210 GB
Traffic:
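- Paste creation: 10M / 86,400 s ≈ 116 QPS (rounded to ~120 QPS)
- Paste access: ~1,200 QPS (assuming a 10:1 read-to-write ratio)
- Redis hit ratio: 90%
- DB reads = ~120 QPS (the 10% of reads that miss the cache), DB writes = ~120 QPS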
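The estimates above can be reproduced with a few lines of arithmetic. The 10:1 read-to-write ratio is inferred from the 1,200 vs 120 QPS figures rather than stated as an assumption in the list, and 1 KB is taken as 1,000 bytes.

```python
PASTES_PER_DAY = 10_000_000
AVG_PASTE_BYTES = 1_000      # 1 KB, decimal
RETENTION_DAYS = 7
REPLICATION_FACTOR = 3
READ_WRITE_RATIO = 10        # inferred from ~1,200 read QPS vs ~120 write QPS
CACHE_HIT_RATIO = 0.90
SECONDS_PER_DAY = 86_400

# Storage
daily_gb = PASTES_PER_DAY * AVG_PASTE_BYTES / 1e9   # new data per day
active_gb = daily_gb * RETENTION_DAYS               # data live at any time
replicated_gb = active_gb * REPLICATION_FACTOR      # including replicas

# Traffic (daily averages, not peaks)
write_qps = PASTES_PER_DAY / SECONDS_PER_DAY        # about 116, rounded to ~120 above
read_qps = write_qps * READ_WRITE_RATIO             # about 1,157, rounded to ~1,200
db_read_qps = read_qps * (1 - CACHE_HIT_RATIO)      # only cache misses reach the DB

print(f"storage: {daily_gb:.0f} GB/day, {active_gb:.0f} GB active, {replicated_gb:.0f} GB replicated")
print(f"traffic: {write_qps:.0f} write QPS, {read_qps:.0f} read QPS, {db_read_qps:.0f} DB read QPS")
```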
Scalability and Fault Tolerance
- Horizontal scaling of Paste API service behind a load balancer
- Redis Cluster for sharded caching
- PostgreSQL read replicas (or DynamoDB with Global Tables)
- API stateless → scales via ECS/Lambda/K8s
- CDN in front of API to cache popular pastes
- Health checks and failover for multi-AZ resilience
Trade-Offs and Alternatives
| Decision | Alternatives Considered |
| Use Base62 short ID | vs sequential ID (less secure, guessable) |
| Redis TTL caching | vs edge CDN (CDN better for static public pastes) |
| PostgreSQL for DB | vs DynamoDB (less query flexibility) |
| Soft deletes | vs hard deletes for expiry cleanup |
Optional Features
- Syntax highlighting (Prism.js on frontend)
- User login / paste history
- Paste search (if public)
- Password-protected pastes
- GDPR-compliant deletion / export