My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach with Score: 9/10

by pinnacle6561

System requirements


Functional Requirements:

  1. Shortening URLs: Allow users to input long URLs and receive a shortened URL.
  2. Redirecting: Redirect users from the shortened URL to the original URL.
  3. Custom Aliases: Allow users to create custom aliases for their shortened URLs.
  4. Analytics: Provide analytics for each shortened URL, such as click counts and geographical locations.
  5. API Access: Provide a public API for creating and retrieving shortened URLs.

Non-Functional Requirements:

  1. Scalability: The system should handle a high volume of URL shortening requests and redirections.
  2. Performance: Redirections should be fast with minimal latency.
  3. Availability: The service should be highly available with minimal downtime.
  4. Security: Ensure that the service is secure and protected against misuse (e.g., spamming, malicious URLs).
  5. Reliability: The service should reliably store and retrieve URL mappings.




Capacity estimation

Daily Requests: An estimated 10 million requests per day.

Storage: Assuming 1 KB per URL mapping, storing 100 million URLs requires around 100 GB.

Traffic: The system should handle a peak of 1,000 requests per second.
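
The figures above can be cross-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the inputs are the estimates stated above, 1 KB is taken as 1,000 bytes, and the variable names are illustrative.

```python
# Back-of-the-envelope capacity figures; all inputs are the estimates stated above.
SECONDS_PER_DAY = 24 * 60 * 60       # 86,400

daily_requests = 10_000_000          # requests per day
peak_rps = 1_000                     # stated peak requests per second
bytes_per_mapping = 1_000            # 1 KB per URL mapping (decimal KB assumed)
stored_urls = 100_000_000            # 100 million mappings

average_rps = daily_requests / SECONDS_PER_DAY
peak_to_average = peak_rps / average_rps
storage_gb = bytes_per_mapping * stored_urls / 1e9

print(f"average load: {average_rps:.0f} requests/s")     # average load: 116 requests/s
print(f"peak is {peak_to_average:.1f}x the average")     # peak is 8.6x the average
print(f"storage: {storage_gb:.0f} GB")                   # storage: 100 GB
```

The stated peak is therefore roughly nine times the average rate, which leaves headroom for bursts.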




API design

1. Create Short URL:

POST /api/shorten
Request: { "longUrl": "http://example.com/very/long/url", "customAlias": "optional-custom-alias" }
Response: { "shortUrl": "http://short.url/abcd1234" }

2. Retrieve Original URL:

GET /api/original/:shortUrl
Response: { "longUrl": "http://example.com/very/long/url" }

3. Get URL Analytics:





Database design

Entities:

  1. URL:
  • id (PK)
  • longUrl (String)
  • shortUrl (String, Unique)
  • customAlias (String, Unique, Optional)
  • createdAt (DateTime)
  • clicks (Integer)
  2. Click:
  • id (PK)
  • shortUrl (FK)
  • timestamp (DateTime)
  • geoLocation (String)
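
The two entities above could map to tables along the following lines. This is a minimal sketch using SQLite from the Python standard library purely for illustration; the column types, the index, and the `open_db` helper are assumptions, not part of the original design.

```python
import sqlite3

# Table and column names follow the entity list above; types are assumptions.
SCHEMA = """
CREATE TABLE url (
    id          INTEGER PRIMARY KEY,
    longUrl     TEXT NOT NULL,
    shortUrl    TEXT NOT NULL UNIQUE,
    customAlias TEXT UNIQUE,                         -- optional, so NULL is allowed
    createdAt   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    clicks      INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE click (
    id          INTEGER PRIMARY KEY,
    shortUrl    TEXT NOT NULL REFERENCES url(shortUrl),
    timestamp   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    geoLocation TEXT
);
CREATE INDEX idx_click_short_url ON click(shortUrl);
"""


def open_db(path=":memory:"):
    """Hypothetical helper: open a database and create both tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


conn = open_db()
conn.execute(
    "INSERT INTO url (longUrl, shortUrl) VALUES (?, ?)",
    ("http://example.com/very/long/url", "abcd1234"),
)
# Recording a click: one row in click, plus the denormalized counter on url.
conn.execute(
    "INSERT INTO click (shortUrl, geoLocation) VALUES (?, ?)", ("abcd1234", "DE")
)
conn.execute("UPDATE url SET clicks = clicks + 1 WHERE shortUrl = ?", ("abcd1234",))
```

The UNIQUE constraint on shortUrl lets the database itself reject duplicate short codes, which is useful for collision handling.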

ER Diagram:



High-level design

Components:

  1. API Gateway: Handles all incoming requests.
  2. URL Shortening Service: Processes requests to shorten URLs and generate custom aliases.
  3. Redirection Service: Redirects users from shortened URLs to the original URLs.
  4. Database: Stores URL mappings and click data.
  5. Analytics Service: Collects and processes click data for analytics.






Request flows

Shortening URL:

  • Client sends a POST request to the API Gateway with the long URL.
  • The API Gateway forwards the request to the URL Shortening Service.
  • The URL Shortening Service generates a short URL and stores it in the Database.
  • The short URL is returned to the client.

Redirection:

  • User clicks the short URL.
  • The Redirection Service queries the Database for the long URL.
  • The user is redirected to the long URL.
  • Click data is recorded and sent to the Analytics Service.
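
The redirection steps above can be sketched as a single handler. Everything here is illustrative: the `RedirectionService` class, the in-memory dictionary standing in for the Database, and the list standing in for the Analytics Service are assumptions. The sketch returns a 302 on the assumption that every click should reach the service; browsers may cache a 301 and skip the service on repeat visits, which would undercount clicks.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class RedirectionService:
    # Both collaborators are in-memory stand-ins (assumptions, not the real services).
    mappings: dict                                  # short code -> long URL ("Database")
    click_log: list = field(default_factory=list)   # receives events ("Analytics Service")

    def handle(self, short_code, geo_location="unknown"):
        """Return (status, headers) for a request to a short URL."""
        long_url = self.mappings.get(short_code)
        if long_url is None:
            return 404, {}
        # Record the click; a real service would likely do this asynchronously
        # so that logging never delays the redirect.
        self.click_log.append({
            "shortUrl": short_code,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "geoLocation": geo_location,
        })
        return 302, {"Location": long_url}


service = RedirectionService({"abcd1234": "http://example.com/very/long/url"})
print(service.handle("abcd1234"))   # (302, {'Location': 'http://example.com/very/long/url'})
print(service.handle("missing"))    # (404, {})
```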


sequenceDiagram
  Client->>API Gateway: POST /api/shorten
  API Gateway->>URL Shortening Service: Create short URL
  URL Shortening Service->>Database: Store URL mapping
  Database->>URL Shortening Service: URL stored
  URL Shortening Service->>API Gateway: Short URL
  API Gateway->>Client: Return short URL

  User->>Redirection Service: Access short URL
  Redirection Service->>Database: Get long URL
  Database->>Redirection Service: Return long URL
  Redirection Service->>User: Redirect to long URL
  Redirection Service->>Analytics Service: Log click data




Detailed component design

1. URL Shortening Service:

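  • Hashing Algorithm: Hashes the long URL and encodes part of the digest as a short alphanumeric code. A hash alone does not guarantee uniqueness, so each code is checked before it is stored.
  • Collision Handling: If a generated code already maps to a different long URL, regenerate it with a varied input (for example, a retry counter or salt), since rehashing the same input yields the same code.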
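
One possible shape for hashing with collision handling is sketched below. The choice of SHA-256, the 7-character base-62 code, the retry counter mixed into the hash input, and the dictionary standing in for the database are all assumptions made for illustration.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_letters   # 62 characters: 0-9, a-z, A-Z
CODE_LENGTH = 7                                   # assumed length of a short code


def base62(number, length):
    """Encode the low-order `length` base-62 digits of `number`."""
    chars = []
    for _ in range(length):
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def short_code(long_url, attempt=0):
    """Derive a code from the URL; `attempt` varies the input after a collision."""
    digest = hashlib.sha256(f"{long_url}#{attempt}".encode("utf-8")).digest()
    return base62(int.from_bytes(digest[:8], "big"), CODE_LENGTH)


def shorten(long_url, store, max_attempts=5):
    """Store and return a code for `long_url`; `store` is a dict stand-in for the database."""
    for attempt in range(max_attempts):
        code = short_code(long_url, attempt)
        existing = store.get(code)
        if existing is None:
            store[code] = long_url
            return code
        if existing == long_url:
            return code          # this URL was already shortened
        # Otherwise the code belongs to a different URL: retry with a new input.
    raise RuntimeError("could not find a free short code")


store = {}
print(shorten("http://example.com/very/long/url", store))
```

With a real database, the check-then-insert step would be racy under concurrent writers; relying on a unique constraint and retrying on a constraint violation is one way to close that gap.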

2. Redirection Service:

  • Caching: Implement caching to speed up redirection times for frequently accessed URLs.
  • Load Balancing: Distribute requests across multiple servers to handle high traffic.
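
The caching bullet above could follow a cache-aside pattern, sketched here with a small least-recently-used cache. The `LRUCache` class, the `resolve` helper, and the plain dictionary standing in for the database are hypothetical; a production setup would more likely use a shared cache in front of the database.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` entries, evicting the least recently used one."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # drop the least recently used entry


def resolve(short_code, cache, database):
    """Cache-aside lookup: try the cache first, fall back to the database."""
    long_url = cache.get(short_code)
    if long_url is None:
        long_url = database.get(short_code)
        if long_url is not None:
            cache.put(short_code, long_url)   # populate the cache for later requests
    return long_url


database = {"abcd1234": "http://example.com/very/long/url"}
cache = LRUCache(capacity=1000)
print(resolve("abcd1234", cache, database))   # first call reads the database
print(resolve("abcd1234", cache, database))   # second call is served from the cache
```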

3. Analytics Service:

  • Real-time Processing: Collect click data in real-time for up-to-date analytics.
  • Batch Processing: Process and aggregate click data periodically for detailed reports.
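
As a rough illustration of the batch step, the sketch below aggregates a list of click events into per-URL totals and per-location counts. The event fields mirror the Click entity from the database design; the `aggregate_clicks` function and the sample events are made up for the example.

```python
from collections import Counter, defaultdict


def aggregate_clicks(click_events):
    """Return (total clicks per short URL, clicks per location per short URL)."""
    totals = Counter()
    by_location = defaultdict(Counter)
    for event in click_events:
        totals[event["shortUrl"]] += 1
        by_location[event["shortUrl"]][event["geoLocation"]] += 1
    return totals, by_location


# Sample events (invented for illustration).
events = [
    {"shortUrl": "abcd1234", "timestamp": "2024-01-01T10:00:00Z", "geoLocation": "DE"},
    {"shortUrl": "abcd1234", "timestamp": "2024-01-01T10:05:00Z", "geoLocation": "US"},
    {"shortUrl": "abcd1234", "timestamp": "2024-01-01T11:00:00Z", "geoLocation": "DE"},
    {"shortUrl": "wxyz9876", "timestamp": "2024-01-01T11:30:00Z", "geoLocation": "US"},
]

totals, by_location = aggregate_clicks(events)
print(totals["abcd1234"])               # 3
print(by_location["abcd1234"]["DE"])    # 2
```

The same counters could be updated incrementally for the real-time path and recomputed in bulk for periodic reports.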





Trade offs/Tech choices

Hashing vs. Sequential IDs: Hashing spreads codes evenly across the key space but can produce collisions; sequential IDs are simple and collision-free but predictable, so short URLs can be enumerated.
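
To make the sequential-ID side of this trade-off concrete, here is a small base-62 encoder and decoder. It is only a sketch: the alphabet ordering and the function names are assumptions, and the ID is presumed to come from something like an auto-incrementing primary key.

```python
import string

ALPHABET = string.digits + string.ascii_letters   # 0-9, a-z, A-Z (62 symbols)


def encode_id(number):
    """Turn a non-negative integer ID into a base-62 short code."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_code(code):
    """Inverse of encode_id."""
    number = 0
    for ch in code:
        number = number * 62 + ALPHABET.index(ch)
    return number


print(encode_id(61))       # Z
print(encode_id(62))       # 10
print(encode_id(63))       # 11
print(decode_code("10"))   # 62
```

Consecutive IDs produce consecutive codes, which is exactly the predictability mentioned above. On the other hand, no collision check is needed, and 7 characters cover 62^7 (about 3.5 trillion) IDs.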

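SQL vs. NoSQL: SQL databases provide strong consistency and can enforce constraints such as unique short URLs; NoSQL stores typically scale horizontally more easily, often with weaker consistency guarantees.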



Failure scenarios/bottlenecks

Database Failures: Use replication and backups to prevent data loss.

Service Downtime: Run multiple instances of each service behind a load balancer, and use auto-scaling to absorb traffic spikes.

Data Corruption: Run regular data integrity checks and audits.




Future improvements

Custom Domains: Allow users to use their own domains for short URLs.

Enhanced Security: Implement features like rate limiting and URL validation.

Advanced Analytics: Provide more detailed analytics, such as user agent and referrer tracking.
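
For the Enhanced Security item above, URL validation and rate limiting might start out roughly as follows. This is a sketch under stated assumptions: the allowed schemes, the length cap, the sliding-window policy, and all names are illustrative, and checking URLs against lists of known malicious sites is out of scope here.

```python
import time
from collections import defaultdict, deque
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}   # assumed policy
MAX_URL_LENGTH = 2048                 # assumed cap


def is_valid_long_url(url):
    """Accept only http(s) URLs with a host, up to MAX_URL_LENGTH characters."""
    if len(url) > MAX_URL_LENGTH:
        return False
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:                # raised for some malformed URLs
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(host)


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock                    # injectable for testing
        self._hits = defaultdict(deque)       # client id -> recent request times

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()                    # forget requests outside the window
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60)
print(is_valid_long_url("http://example.com/very/long/url"))   # True
print(is_valid_long_url("javascript:alert(1)"))                # False
print([limiter.allow("client-1") for _ in range(3)])           # [True, True, False]
```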