My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach

by flare3451

Requirements


Functional Requirements:


  • Create a short URL for a given long URL.
  • Return the long URL associated with a given short URL.



Non-Functional Requirements:


  • Scalability: horizontal scaling
    • Load
      • 500 million short url per day = 6000 write rps
      • 1 billion long url per day = 12000 rps
    • Storage
      • Retention 1 week
      • 100 bytes per url
      • 350 GB
  • Availability - highly available - 99.99 uptime
  • Consistency - NA since we have immutable data.
  • Reliable - Durable data, cannot afford losing data
  • Latency - low latency for good user experience. < 100 ms


API Design


APIs

  • create short url
    • POST /shorten {longUrl: xyz.com} returns {shortUrl: xyz}
  • redirect
    • GET /{shortUrl} returns HTTP 1.1 302 Found Location: xyz.com


Database

  • MongoDb
    • shortUrl, longUrl, createdDate
  • SQL to keep 3.5 trillion unique short urls


High-Level Design

Shorten:

  1. shorten requests from the clients go through Edge Servers so that they cache the response locally
  2. the request goes through load balancer for load distribution and rate limiting and land on Shortening Service
  3. Shortening service gets unused short url form the sql, saves it to Database and responds to the client
  4. It asynchronously caches the short url.
  5. It asynchronously sends data to analytics.

Redirect:

  • redirect requests are handled by edge servers on a cache hit
  • else it forwards the request to a redirect service
  • redirect service fetches the response from a cache, if missed gets it from db.
  • asynchronously sends data to analytics


Short url generation:

  • with 7 words length we can have 3.5 Trillion unique short urls
  • we will store them in a SQL store with a boolean to indicate if the url is used or not
  • retrieve unused url and assign it to the long url



Detailed Component Design


  • Low latency: geographically distributed Edge Servers that also act as a read through cache
  • Rate Limiting: Use Token Bucket Algorithm with UserId as the key identifier. This handles burst traffic.
    • Example configuration - 5 tokens per user with Retry After one minute.
  • Load Distribution: Distribute based on least connections active.
  • Auto Scaling: Application servers configured to auto scale (scale up when >60% cpu, scale down when <30% cpu)
  • HotSpot mitigation:
    • data skew: mitigated by using high cardinality key like shortUrl
    • access skew: mitigated by using cache with LFU eviction
  • ID generation:
    • collision: avoided by using a SQL database which consists of all unique urls using 7 alphanumeric letters. All access to this data is transactional which avoids ID collisions. We can have a one master multiple replica setup where writes go through master. When master goes down one of the replicas will be elected as the master based on quorum mechanism to avoid any split brain scenarios.
    • predictable: shortUrls are not predictable since they are randomly assigned from a pool of unique urls.
  • Caching:
    • Invalidation: When a long url is updated or deleted, we refresh that data from the cache.
    • Eviction: LFU with TTL
    • Cache Miss scenario: Use a cache aside strategy to write to cache. On cache miss, application reads the data from DB and updates the cache.
    • Hot Keys: In order to handle popular short links, we can follow below strategies
      • Edge Servers: will be the first line of defence for handling hot keys.
      • Local Caching: The redirect service will have a scheduler to get most popular short links and hold it in their RAM.
      • Request Coalescing: Allow one request to rebuild cache, others wait