My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach with Score: 9/10

by dewdrop_pinnacle261

System requirements


Functional Requirements:

  • Shorten a given long URL into a short, unique URL
  • Redirect to the original long URL when the short URL is accessed
  • Allow optional custom short URLs (like tinyurl.com/myalias)
  • Track basic analytics: number of clicks per short URL
  • Support expiration time on URLs (optional)
  • Provide API for shortening and expanding URLs

Non-Functional Requirements:

  • High availability
  • Low latency (~10-50ms for redirect)
  • Horizontal scalability to handle growth
  • Consistency: strong for creation, eventual for analytics
  • Durability: no loss of mappings on failures
  • Security: prevent abuse (rate limiting, spam detection)




Capacity estimation

Let’s assume:

  • 100M new URLs per month → ~40 URLs/sec
  • 1B total URLs stored after a few years
  • 10B redirection requests per month → ~4K requests/sec

Storage:

  • Each record ~500 bytes → 1B × 500B = ~500GB total
  • Write throughput ~40 ops/sec, Read throughput ~4K ops/sec
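The numbers above can be reproduced with a quick back-of-envelope script; the monthly volumes and per-record size are the assumptions stated in this section.

```python
# Back-of-envelope check of the capacity estimates above.
SECONDS_PER_MONTH = 30 * 24 * 3600            # ~2.6M seconds

new_urls_per_month = 100_000_000              # assumption: 100M creates/month
redirects_per_month = 10_000_000_000          # assumption: 10B redirects/month
total_urls_stored = 1_000_000_000             # assumption: 1B rows after a few years
bytes_per_record = 500                        # assumption: ~500 bytes/row

write_qps = new_urls_per_month / SECONDS_PER_MONTH        # ≈ 39 ops/sec
read_qps = redirects_per_month / SECONDS_PER_MONTH        # ≈ 3,858 ops/sec
storage_gb = total_urls_stored * bytes_per_record / 1e9   # ≈ 500 GB

print(f"writes/sec ≈ {write_qps:.0f}, reads/sec ≈ {read_qps:.0f}, storage ≈ {storage_gb:.0f} GB")
```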






API design

POST /api/shorten

Body: { longUrl: "https://...", customAlias?: "myalias", expireAt?: "2025-12-31" }

Response: { shortUrl: "https://tinyurl.com/abc123" }


GET /:shortCode

Redirects to the original long URL


GET /api/analytics/:shortCode

Response: { clicks: 12345 }
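A minimal sketch of how a client might exercise this API with Python's `requests` library; the base URL is an assumption for illustration, and the field names follow the request/response shapes above rather than any published SDK.

```python
import requests

BASE = "https://tinyurl.example.com"   # assumed host, for illustration only

# Shorten a long URL; customAlias and expireAt are optional, per the body above
resp = requests.post(f"{BASE}/api/shorten", json={
    "longUrl": "https://example.com/some/very/long/path",
    "customAlias": "myalias",
    "expireAt": "2025-12-31",
})
short_url = resp.json()["shortUrl"]

# Following the short URL returns a redirect to the original target
redirect = requests.get(short_url, allow_redirects=False)
assert redirect.status_code == 302
print(redirect.headers["Location"])

# Fetch click analytics for the short code
stats = requests.get(f"{BASE}/api/analytics/myalias").json()
print(stats["clicks"])
```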





Database design

Schema (Relational / NoSQL hybrid example):

Field       | Type        | Notes
short_code  | String (PK) | Unique 6-8 char code
long_url    | String      | Original URL
created_at  | Timestamp   | Creation time
expire_at   | Timestamp   | Optional expiration
click_count | Integer     | For analytics


Indexes:

  • short_code → fast lookup
  • expire_at → periodic cleanup
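As one possible concrete form of this record, here is a minimal Python dataclass mirroring the schema above; the field names come from the table, while the Python types are an assumption about how they might map in application code.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class UrlMapping:
    short_code: str                       # primary key, unique 6-8 char Base62 code
    long_url: str                         # original URL
    created_at: datetime                  # creation time
    expire_at: Optional[datetime] = None  # optional expiration, indexed for cleanup
    click_count: int = 0                  # analytics counter
```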





High-level design

Components:

  • API Gateway / Load Balancer
  • URL Shortening Service → handles shorten requests
  • Redirect Service → handles GET /:shortCode
  • Database → persistent store (e.g., MySQL/Postgres, Redis for cache)
  • Cache Layer (Redis/Memcached) → speed up hot redirects
  • Analytics Service → tracks click counts asynchronously
  • Background Workers → cleanup expired URLs, process analytics






Request flows

Shorten Flow:

  1. Client → API Gateway → URL Shortening Service
  2. Validate long URL, check custom alias availability
  3. Generate short code (base62/random), store in DB
  4. Return short URL

Redirect Flow:

  1. Client → API Gateway → Redirect Service
  2. Check Redis cache → fall back to the DB on a cache miss
  3. Redirect (HTTP 302) to long URL
  4. Send click event to Analytics Service (async)
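A minimal sketch of this redirect path, assuming a Flask app and a `redis` client; the DB lookup and click-event hooks are hypothetical stubs standing in for the real services.

```python
from flask import Flask, abort, redirect
import redis

app = Flask(__name__)
cache = redis.Redis(host="localhost", port=6379, decode_responses=True)

def lookup_long_url(short_code: str):
    """Hypothetical DB lookup; a real service would query the mappings table."""
    ...

def publish_click_event(short_code: str) -> None:
    """Hypothetical async hook; a real service would push to Kafka/SQS."""
    ...

@app.route("/<short_code>")
def follow(short_code):
    long_url = cache.get(f"url:{short_code}")      # try the cache first
    if long_url is None:
        long_url = lookup_long_url(short_code)     # fall back to the DB on a miss
        if long_url is None:
            abort(404)
        cache.set(f"url:{short_code}", long_url, ex=3600)  # TTL aligned with expire_at
    publish_click_event(short_code)                # fire-and-forget click event
    return redirect(long_url, code=302)            # 302 so clicks keep flowing through us
```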

Analytics Flow:

  1. Analytics Service consumes events
  2. Batches updates to the DB or stores them in a fast counter system






Detailed component design

a) Short Code Generator

  • Use Base62 (a-zA-Z0-9) → 6-8 char codes
  • Option 1: Random generation + check uniqueness in DB
  • Option 2: Incremental counter + Base62 encode
  • Option 3: Hash long URL + collision resolution
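A minimal sketch of Options 1 and 2 above: random Base62 codes, and Base62-encoding an incrementing counter. The alphabet and code length match the bullet points; the uniqueness check against the DB for Option 1 is left to the caller.

```python
import secrets
import string

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 0-9a-zA-Z

def random_code(length: int = 7) -> str:
    """Option 1: random code; the caller must check the DB for collisions and retry."""
    return "".join(secrets.choice(BASE62) for _ in range(length))

def encode_base62(counter: int) -> str:
    """Option 2: encode a monotonically increasing ID, which guarantees uniqueness."""
    if counter == 0:
        return BASE62[0]
    digits = []
    while counter > 0:
        counter, rem = divmod(counter, 62)
        digits.append(BASE62[rem])
    return "".join(reversed(digits))

# Example: ID 125 -> "21" with this alphabet; 62**7 ≈ 3.5 trillion 7-char codes
print(random_code(), encode_base62(125))
```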

b) Cache Layer (Redis)

  • Store short_code → long_url mappings
  • Use TTL aligned with expire_at
  • Use LRU eviction

c) Analytics Service

  • Async event queue (Kafka/SQS)
  • Batch writes to DB or use Redis atomic counters
  • Daily materialization of click stats
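One way to realize the "Redis atomic counters + batch writes" idea is sketched below, assuming a `redis` client; the persistence helper is a hypothetical stand-in for the DB update.

```python
import redis

r = redis.Redis(decode_responses=True)

def persist_clicks(short_code: str, delta: int) -> None:
    """Hypothetical helper: UPDATE ... SET click_count = click_count + delta."""
    ...

def record_click(short_code: str) -> None:
    """Called per redirect; a single O(1) Redis INCR, no DB write on the hot path."""
    r.incr(f"clicks:{short_code}")

def flush_counters() -> None:
    """Background worker: periodically batch the accumulated deltas into the DB."""
    for key in r.scan_iter("clicks:*"):
        delta = int(r.getset(key, 0) or 0)   # read-and-reset (good enough for a sketch)
        if delta:
            persist_clicks(key.split(":", 1)[1], delta)
```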






Trade-offs/Tech choices

  • Relational DB (MySQL) → strong consistency
  • Redis cache → fast redirects, reduce DB load
  • Async analytics → avoids slowing user redirects
  • Base62 encoding → short, URL-safe codes
  • Custom aliases → requires uniqueness checks, extra DB load





Failure scenarios/bottlenecks

DB failure: Use primary-replica replication and regular backups

Cache failure: Fallback to DB, rebuild cache on access

Hot key (popular short code): Use cache, sharded counters

Code collision: Check DB, retry generation

Network partition: Favor availability on reads, quorum writes for consistency

API abuse (spam): Add rate limits, CAPTCHA, user auth
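For the rate-limiting point, a fixed-window limiter on top of Redis is a common first cut; the sketch below assumes a `redis` client, and the limit and window values are illustrative rather than recommendations.

```python
import redis

r = redis.Redis(decode_responses=True)

def allow_request(client_ip: str, limit: int = 10, window_seconds: int = 60) -> bool:
    """Fixed-window limit: at most `limit` shorten calls per IP per window."""
    key = f"ratelimit:{client_ip}:{window_seconds}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window_seconds)   # start the window on the first hit
    return count <= limit

# Example: respond with HTTP 429 when allow_request(client_ip) returns False
```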





Future improvements

Use CDN edge caching for global low-latency redirects

Add user accounts, link management dashboard

Support QR code generation

Add A/B testing for link targeting

Use a distributed ID generator (e.g., Twitter Snowflake)

Cross-region replication for disaster recovery


Areas of Improvement

  • Algorithm detail: Improve collision resistance (e.g., Bloom filters)
  • Granularity: Per-region traffic monitoring
  • Dynamic adjustments: Autoscaling backend services, dynamic cache TTLs
  • Redis complexity: Use pipelining, Lua scripts for atomic operations


Potential Gaps / Edge Cases

  • Edge Cases:
    • Empty or invalid URLs
    • Expired links
    • Custom alias conflicts
  • Network Partitions:
    • Redis failover
    • Database write quorum handling


Areas of Depth for Next Level

  • Algorithm implementation:
    • Distributed ID generation
    • Conflict-free replicated data types (CRDTs) for click counts (see the sketch after this list)
  • Scalability enhancements:
    • Partition DB by short code prefix
    • Use of CDN edge nodes
  • Cross-region consistency:
    • Global DB clusters (e.g., CockroachDB)
    • Async replication with conflict resolution
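To illustrate the CRDT idea for click counts, a grow-only counter (G-Counter) keeps one slot per region and merges by taking per-slot maxima, so replicas converge regardless of delivery order. This is a textbook sketch with made-up region names, not tied to any particular database.

```python
class GCounter:
    """Grow-only counter CRDT: each region increments only its own slot."""

    def __init__(self, region: str):
        self.region = region
        self.counts: dict[str, int] = {}

    def increment(self, n: int = 1) -> None:
        self.counts[self.region] = self.counts.get(self.region, 0) + n

    def merge(self, other: "GCounter") -> None:
        # Per-slot max is commutative and idempotent, so merges can happen in any order
        for region, count in other.counts.items():
            self.counts[region] = max(self.counts.get(region, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())

# Two regions count clicks independently, then converge after replication
us, eu = GCounter("us-east"), GCounter("eu-west")
us.increment(3)
eu.increment(5)
us.merge(eu)
eu.merge(us)
assert us.value() == eu.value() == 8
```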