System requirements


Functional:

  1. Shortening URLs: The system should allow users to input long URLs and generate a unique, shorter alias for them.
  2. Redirection: When a user accesses the shortened URL, the system should redirect the user to the original long URL.
  3. Custom Alias: Optionally, allow users to choose a custom alias for their shortened URL.
  4. Link Expiration: Option to set an expiration date for short links, after which they are no longer accessible.
  5. Link Analytics: Provide analytics for short links, such as click counts, location of clicks, referral information, etc.
  6. API: Offer an API for integrating the service into other applications.
  7. User Authentication: Allow users to create accounts and manage their shortened URLs.
  8. Security: Ensure that shortened URLs are not easily guessable and protect against malicious activities like spam and phishing.


Non-Functional:

  1. Performance: The system should be able to handle a high volume of concurrent requests efficiently and provide fast redirection for shortened URLs. It should have low latency.
  2. Scalability: The system should be able to scale horizontally to accommodate increasing numbers of users and shortened URLs.
  3. Reliability: The service should be highly available and reliable, with minimal downtime.
  4. Security: The system should implement security best practices to protect against attacks like injection, phishing, and unauthorized access.
  5. Monitoring: Implement logging and monitoring to track system performance, user activity, and errors.
  6. Compliance: Adhere to legal requirements like GDPR to ensure user data privacy and protection.
  7. Usability: The user interface should be intuitive and easy to use for creating and managing shortened URLs.
  8. Backup and Recovery: Implement regular backups of data to prevent data loss in case of failures.
  9. Cost: The system should be cost-effective to operate, considering factors like server resources, storage, and bandwidth usage.





Capacity estimation


  1. Daily active users (DAU): 1 million.
  2. Number of shortened URLs: on average, each user generates 1 shortened URL per day.
  3. Storage: the average stored record per shortened URL is 5 KB.


-> Daily storage requirement: 1 million * 5 KB ~= 5 GB

-> Write TPS: 1 million / (24 * 3600) ~= 12 TPS
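
The arithmetic behind these estimates can be reproduced with a few lines of Python. This is only a sketch: the constant names are made up for illustration, decimal units (1 KB = 1000 bytes) are assumed, and the yearly figure is a simple extrapolation that ignores expiration and deletions.

```python
# Inputs taken from the capacity assumptions in this section.
DAILY_ACTIVE_USERS = 1_000_000
URLS_PER_USER_PER_DAY = 1
BYTES_PER_RECORD = 5 * 1000      # 5 KB, decimal units (assumption)
SECONDS_PER_DAY = 24 * 3600

new_urls_per_day = DAILY_ACTIVE_USERS * URLS_PER_USER_PER_DAY
daily_storage_gb = new_urls_per_day * BYTES_PER_RECORD / 1e9
write_tps = new_urls_per_day / SECONDS_PER_DAY
# Extrapolation only: assumes nothing expires or is deleted during the year.
yearly_storage_tb = daily_storage_gb * 365 / 1000

print(f"daily storage:      {daily_storage_gb:.1f} GB")
print(f"average write rate: {write_tps:.1f} TPS")
print(f"yearly storage:     {yearly_storage_tb:.3f} TB")
```

Note that 12 TPS is an average write rate. Peak traffic and the read (redirect) rate are not estimated in this section and would need their own assumptions.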





API design

1. Create a short URL from a given long URL.

/shorten  POST  body: {"inputURL": "xxx"}  response: {"shortURL": "abc", "id": "123"}

2. Redirect by ID (optional)

/redirect/id

3. Delete short URL

/delete/id

4. Analytics

/analytics/id

Response fields: topReferrer: xxx, click: xxx


5. Security

Rate limit: cap the number of requests per day for each IP address or session ID to curb abuse such as denial-of-service attempts.
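
A per-day limit keyed on IP address or session ID could be sketched as a fixed-window counter. The class name `DailyRateLimiter` and its parameters are hypothetical, and the counters live in process memory here; a real deployment would more likely keep them in a shared store and expire old windows.

```python
import time
from collections import defaultdict


class DailyRateLimiter:
    """Fixed-window limiter: at most `max_per_day` requests per client per day."""

    def __init__(self, max_per_day, clock=time.time):
        self.max_per_day = max_per_day
        self.clock = clock               # injectable so the window can be tested
        self.counts = defaultdict(int)   # (client_key, day_index) -> request count

    def allow(self, client_key):
        # All requests within the same 86400-second window share one counter.
        day_index = int(self.clock() // 86400)
        bucket = (client_key, day_index)
        if self.counts[bucket] >= self.max_per_day:
            return False
        self.counts[bucket] += 1
        return True


limiter = DailyRateLimiter(max_per_day=1000)
print(limiter.allow("203.0.113.7"))  # client key could be an IP or a session ID
```

This sketch never evicts old windows, so memory grows over time. A fixed window also allows a burst at the window boundary, which a sliding window or token bucket would smooth out.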

Validation: check the user-supplied URL against a regex before processing it.
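
As a rough illustration of regex-based validation, the check below accepts only http and https URLs with a plain ASCII host. The pattern, the `MAX_URL_LENGTH` cap, and the function name are assumptions made for this sketch; the pattern is deliberately strict and rejects some legitimate URLs (for example internationalized host names or URLs with user info).

```python
import re

MAX_URL_LENGTH = 2048  # assumed upper bound, not specified in this design

# scheme, ASCII host, optional port, optional path/query/fragment without whitespace
URL_PATTERN = re.compile(r"https?://[A-Za-z0-9.-]+(:\d+)?([/?#]\S*)?")


def is_valid_url(url: str) -> bool:
    """Return True if the whole string looks like an http(s) URL."""
    return len(url) <= MAX_URL_LENGTH and URL_PATTERN.fullmatch(url) is not None


print(is_valid_url("https://example.com/a/b?x=1"))
print(is_valid_url("javascript:alert(1)"))
```

Restricting the scheme to http and https is what blocks inputs such as `javascript:` links. A regex alone does not tell you whether the target is malicious, so it complements the spam and phishing protections rather than replacing them.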

Use HTTPS: redirect plain HTTP requests to HTTPS so that traffic cannot be read or modified in transit.


6. User Registration

Allow users to sign up for an account by providing essential information such as email address and password. Optionally, you can offer signup via third-party services (e.g., Google, Facebook) for convenience.

  • Endpoint: /register
  • Method: POST
  • Body:
{ "email": "user@example.com", "password": "securePassword123" }

Related flows to support: email verification and password reset.
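
Since registration accepts a password, the server should store only a salted hash of it. The sketch below uses PBKDF2 from Python's standard library. The iteration count and function names are assumptions for illustration, and a production system might prefer a dedicated password-hashing library.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) to store instead of the plaintext password."""
    salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("securePassword123")
print(verify_password("securePassword123", salt, stored))
```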





Database design

User Table

1. user id (key)

2. email

3. password (stored as a salted hash, not plaintext)

4. email verified (boolean)


URLs table

1. short URL (key)

2. id

3. user id

4. original URL

5. createdAt

6. expiresAt

7. shortCode



URL Visits Table

1. visit id (key)

2. short URL id

3. ReferrerIdAndClicks, e.g. {"referrer1": 10, "referrer2": 2}

4. user agent

5. ip
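
One way to express these three tables is the SQLite schema below, created in memory for illustration. It is a sketch with some simplifying assumptions: column names are made up, the short code serves as the URL key instead of a separate id, and the visits table stores one row per visit with a single referrer rather than the aggregated referrer-to-clicks map listed above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id        INTEGER PRIMARY KEY,
    email          TEXT NOT NULL UNIQUE,
    password_hash  TEXT NOT NULL,
    email_verified INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE urls (
    short_code   TEXT PRIMARY KEY,
    user_id      INTEGER NOT NULL REFERENCES users(user_id),
    original_url TEXT NOT NULL,
    created_at   TEXT NOT NULL,
    expires_at   TEXT               -- NULL means the link never expires
);
CREATE TABLE url_visits (
    visit_id   INTEGER PRIMARY KEY,
    short_code TEXT NOT NULL REFERENCES urls(short_code),
    referrer   TEXT,
    user_agent TEXT,
    ip         TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)


def clicks_by_referrer(conn, short_code):
    """Aggregate per-visit rows into (referrer, clicks) pairs, most clicks first."""
    return conn.execute(
        "SELECT referrer, COUNT(*) FROM url_visits WHERE short_code = ? "
        "GROUP BY referrer ORDER BY COUNT(*) DESC, referrer",
        (short_code,),
    ).fetchall()
```

Storing raw visits keeps writes simple and lets the referrer counts be derived on demand. The aggregated map is cheaper to read but needs an update on every click.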








High-level design

  1. Web service: includes the front-end UI for sign-in/sign-up and for submitting requests.
  2. API gateway
  3. URL shortening service
  4. Database
  5. Redirection service
  6. Analytics service





Request flows

User interactions:

  1. Log in or sign up.
  2. Once logged in, the user can submit requests, such as URLs to shorten.

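URL shortening:

  1. The service checks whether the given URL already exists in the database. If it does not, the service creates a new short URL. If it does, the service checks whether the stored user ID matches the requesting user.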
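
The shortening flow can be sketched with in-memory dictionaries standing in for the database. The class and attribute names are hypothetical. The design does not say what happens when the URL exists but belongs to a different user, so this sketch assumes that case gets its own new short code.

```python
import secrets


class ShortenService:
    """In-memory stand-in for the URL shortening service (illustration only)."""

    def __init__(self):
        self.by_code = {}  # short_code -> (user_id, original_url)
        self.by_url = {}   # original_url -> {user_id: short_code}

    def shorten(self, user_id, original_url):
        owners = self.by_url.setdefault(original_url, {})
        if user_id in owners:
            # URL already exists and the user ID matches: reuse the short code.
            return owners[user_id]
        # URL is new, or owned by someone else (assumed behavior): create a code.
        code = secrets.token_urlsafe(5)
        while code in self.by_code:      # retry on the rare collision
            code = secrets.token_urlsafe(5)
        self.by_code[code] = (user_id, original_url)
        owners[user_id] = code
        return code


service = ShortenService()
first = service.shorten(1, "https://example.com/some/long/path")
print(first == service.shorten(1, "https://example.com/some/long/path"))
```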

URL redirection:

  1. Given a short URL, look it up in the database and respond with a redirect to the original URL.
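
A lookup-and-redirect step might look like the function below, which returns an HTTP status and a target location. The record layout, the function name, and the choice of status codes (302 for the redirect, 404 for an unknown code, 410 for an expired link) are assumptions for this sketch rather than decisions made in this design.

```python
from datetime import datetime, timezone


def resolve(store, short_code, now=None):
    """Return (status, location) for a short code looked up in `store`."""
    now = now or datetime.now(timezone.utc)
    record = store.get(short_code)
    if record is None:
        return 404, None                       # unknown short code
    expires_at = record.get("expires_at")
    if expires_at is not None and now >= expires_at:
        return 410, None                       # link existed but has expired
    return 302, record["original_url"]         # temporary redirect


store = {
    "abc": {"original_url": "https://example.com/long", "expires_at": None},
}
print(resolve(store, "abc"))
```

A temporary redirect (302) keeps every click flowing through the service, which matters for analytics. A permanent redirect (301) can be cached by browsers and would hide repeat visits.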

Analytics:

Each access to a shortened URL is logged by the Analytics Service, which updates visit counts, referrers, and other relevant data in the database for reporting to the user.
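
A minimal sketch of that bookkeeping, kept in memory for illustration, is shown below. The class name is hypothetical, visits without a referrer are counted under an assumed "direct" label, and the summary uses the `click` and `topReferrer` fields from the analytics endpoint.

```python
from collections import Counter, defaultdict


class AnalyticsService:
    """Counts clicks per referrer for each short code (in-memory illustration)."""

    def __init__(self):
        self.referrers = defaultdict(Counter)  # short_code -> referrer -> clicks

    def record_visit(self, short_code, referrer=None):
        # Visits with no referrer are grouped under "direct" (assumed label).
        self.referrers[short_code][referrer or "direct"] += 1

    def summary(self, short_code):
        counts = self.referrers.get(short_code, Counter())
        top = counts.most_common(1)
        return {
            "click": sum(counts.values()),
            "topReferrer": top[0][0] if top else None,
        }


analytics = AnalyticsService()
analytics.record_visit("abc", "news.example")
analytics.record_visit("abc", "news.example")
analytics.record_visit("abc")
print(analytics.summary("abc"))
```

In practice the redirect path would probably hand visits to this service asynchronously so that logging does not add latency to the redirect.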



Detailed component design

Let's delve into the process of generating unique shortened URLs and handling collisions in the system.


Generation strategy:

  1. random string
  2. hash encoding
  3. counter



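Pros and cons:

  1. Random string: hard to guess, but system capacity is limited by the string length (an alphabet of k characters and a length of n give k^n possible codes), and collisions become more likely as the namespace fills up.
  2. Hash encoding (e.g. SHA-256): deterministic for a given URL, but the digest must be truncated to get a short code, so collisions remain possible.
  3. Counter: unique by construction, but predictable, which is a security concern.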
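
The three strategies can be compared side by side in a short sketch. The base-62 alphabet, the default length of 7, and the function names are assumptions chosen for illustration. With these choices the namespace holds 62**7, roughly 3.5 trillion, codes.

```python
import hashlib
import secrets
import string

# Base-62 alphabet: digits, then lowercase, then uppercase (assumed ordering).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def to_base62(n: int) -> str:
    """Encode a non-negative integer using the base-62 alphabet."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def random_code(length: int = 7) -> str:
    """Strategy 1: random string. Hard to guess, but may collide."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def hash_code(url: str, length: int = 7) -> str:
    """Strategy 2: SHA-256 of the URL, base-62 encoded and truncated."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return to_base62(int.from_bytes(digest, "big"))[:length]


def counter_code(counter: int) -> str:
    """Strategy 3: encode a monotonically increasing counter. Predictable."""
    return to_base62(counter)


print(random_code(), hash_code("https://example.com"), counter_code(125))
```

The counter variant shows why predictability is a concern: consecutive counters produce consecutive codes, so anyone can enumerate existing links.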




Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...






Failure scenarios/bottlenecks

Database Overload

Scenario: High request rates could overload the database, slowing down read and write operations, leading to timeouts or errors.

Mitigation:

  • Caching: Implement caching layers to store frequently accessed data, reducing direct database hits.

Single Point of Failure (SPOF)

Scenario: If any component of the system (e.g., the API Gateway, database, or key generation service) is a single point of failure, its outage could make the entire service unavailable.

Mitigation:

  • Redundancy: Ensure all critical components have redundant instances in separate availability zones or regions.



Key Generation Collisions

Scenario: The method for generating unique identifiers for shortened URLs might lead to collisions, especially as the namespace gets crowded, impacting performance due to retries.

Mitigation:

  • Namespace Expansion: Consider using a longer key length as the service scales.
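
Collision handling with namespace expansion could be sketched as follows: retry a bounded number of times at the current length, then grow the code by one character. The function name, parameters, and defaults are hypothetical, and the `existing` set stands in for the database.

```python
import secrets
import string

DEFAULT_ALPHABET = string.ascii_letters + string.digits


def generate_unique_code(existing, length=7, max_attempts=5,
                         alphabet=DEFAULT_ALPHABET):
    """Return a code not present in `existing`, lengthening it if retries fail."""
    while True:
        for _ in range(max_attempts):
            code = "".join(secrets.choice(alphabet) for _ in range(length))
            if code not in existing:
                existing.add(code)   # reserve the code
                return code
        # Too many collisions at this length: expand the namespace.
        length += 1


taken = set()
print(generate_unique_code(taken))
```

In a real database the check and the insert would need to be atomic, for example by relying on a unique constraint on the short code and retrying when the insert fails.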





Future improvements

Caching: Maintain a cache of recently accessed or created short URLs to reduce database lookups for popular or newly generated URLs. A least-recently-used (LRU) eviction policy is a natural fit.
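
A small LRU cache for short-code lookups can be built on `collections.OrderedDict`. This is a single-process sketch with a hypothetical class name. A shared cache service would usually take its place once there are several redirect servers.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` entries, evicting the least recently used one."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self.entries:
            return None               # cache miss: caller falls back to the database
        self.entries.move_to_end(key) # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(capacity=2)
cache.put("abc", "https://example.com/a")
cache.put("def", "https://example.com/b")
cache.get("abc")                      # "abc" is now the most recently used
cache.put("ghi", "https://example.com/c")
print(cache.get("def"))               # evicted, so this is a miss
```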