My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach

by zephyr7619

Requirements


Functional Requirements:


  • Create a short URL for a given long URL.
  • Return the long URL associated with a given short URL.



Non-Functional Requirements:


  • Available
  • Scalable
  • Reliable


API Design

There are two main APIs: 1. Shorten the given URL. 2. Retrieve the original URL from the tiny URL.

POST: /v1/tinyurl

Takes the long URL as input and returns the tiny URL in the response.

GET: /v1/tinyurl

Takes the tiny URL as input and returns the original (long) URL.
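
To make the two calls concrete, here is a minimal sketch of their request and response shapes. Everything beyond the two endpoints is an assumption made for illustration: the handler names post_tinyurl and get_tinyurl, the JSON field names, the status codes for errors, and the in-memory dictionary standing in for the database. The key is a placeholder taken from a counter, not the real shortening algorithm.

```python
from itertools import count

# In-memory stand-in for the database (assumption for this sketch only).
_store = {}
_next_id = count(1)


def post_tinyurl(body):
    """Sketch of POST /v1/tinyurl. `body` is assumed to be {"long_url": ...}."""
    long_url = body.get("long_url", "")
    if not long_url.startswith(("http://", "https://")):
        return 400, {"error": "long_url must be an http(s) URL"}
    key = format(next(_next_id), "x")  # placeholder key: hex of a counter
    _store[key] = long_url
    return 201, {"tiny_url": key}


def get_tinyurl(key):
    """Sketch of GET /v1/tinyurl. Returns a redirect to the long URL."""
    long_url = _store.get(key)
    if long_url is None:
        return 404, {"error": "unknown tiny URL"}
    return 302, {"Location": long_url}
```

How the tiny URL is passed to the GET call (path segment or query parameter) is left open here, as it is in the API description above.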



High-Level Design

The main components required to build the tiny URL service are:

  1. CDN: This is the first-level cache, used to serve popular tiny URLs. Having a CDN avoids unnecessary round trips to the actual service.
  2. Load Balancer: This decides which instance of the service to call, based on availability and health checks.
  3. API Gateway: This acts as the front-end service. It terminates the encryption (TLS) so that the internal services do not have to, and forwards each request to the right service.
  4. Rate limiter: This will be used to limit the number of requests per user.
  5. Write Service: This performs the actual shortening of a long URL into a tiny URL. The service is stateless and is invoked by the POST call. It can use one of several algorithms to shorten the URL. To generate a unique tiny URL we can use a hash algorithm and take the resulting key as the tiny URL, or we can use a unique ID generator and encode the ID in base64. Once the tiny URL is created it is stored in the database.
  6. Read Service: This is invoked when a user requests the long URL for a tiny URL. It first checks the cache. If the entry exists it is returned; otherwise the service queries the database for the long URL. The long URL is then sent in the response with HTTP status code 301 or 302 so that the browser is redirected to the actual URL.
  7. Database: Here we use a relational database to store the mapping between long URL and tiny URL. There can be two main tables: one for URLs and one for user details. For relatively large data we can shard the database on the tiny URL, using either consistent hashing on the tiny URL or its starting characters. We can also add replicas of the database to provide fault tolerance. Whenever we create a new tiny URL we cache that record with a TTL of 1 day, because a new link is likely to be requested by many end users initially. If the record is not in the cache we look it up in the database, querying the table for the row whose tiny URL matches the one the user supplied.
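
The rate limiter in item 4 is only described by its purpose. One common way to limit requests per user is a token bucket, sketched below. The design does not name an algorithm, so the token bucket itself, the class name TokenBucket, and the capacity and refill parameters are all assumptions for illustration.

```python
import time


class TokenBucket:
    """Per-user token bucket: each request spends one token."""

    def __init__(self, capacity, refill_per_sec, now=time.monotonic):
        self.capacity = capacity              # maximum burst size
        self.refill_per_sec = refill_per_sec  # sustained requests per second
        self._now = now                       # injectable clock, eases testing
        self.tokens = float(capacity)
        self.last = now()

    def allow(self):
        """Return True if the request may proceed, False if it is throttled."""
        current = self._now()
        elapsed = current - self.last
        self.last = current
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per user; the limits below are placeholder values.
_buckets = {}


def allow_request(user_id, capacity=5, refill_per_sec=1.0):
    bucket = _buckets.setdefault(user_id, TokenBucket(capacity, refill_per_sec))
    return bucket.allow()
```

In a deployment with several gateway instances the buckets would have to live in a shared store rather than in process memory.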

URL: It will have the following attributes.

    1. Primary Key: It can be the tiny URL.
    2. Main/Normal URL
    3. Creation Date
    4. Expiry Date

User Details:

    1. Username
    2. Name
    3. email
    4. Location
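
The two tables could be declared roughly as follows. This is a sketch using SQLite from the Python standard library so that it runs anywhere; the table names, column names, column types, and the sample row are assumptions, and the design does not say how a URL row relates to a user, so no foreign key is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE url (
        tiny_url   TEXT PRIMARY KEY,  -- the short key doubles as primary key
        long_url   TEXT NOT NULL,     -- main/normal URL
        created_at TEXT NOT NULL,     -- creation date (ISO 8601 text)
        expires_at TEXT               -- expiry date, NULL if it never expires
    );
    CREATE TABLE user_details (
        username TEXT PRIMARY KEY,
        name     TEXT,
        email    TEXT,
        location TEXT
    );
    """
)

# Write path: store one mapping (sample values, purely illustrative).
conn.execute(
    "INSERT INTO url VALUES (?, ?, ?, ?)",
    ("abc123", "https://example.com/some/long/path",
     "2024-01-01T00:00:00Z", "2025-01-01T00:00:00Z"),
)

# Read path: look the long URL up by the tiny URL supplied by the user.
row = conn.execute(
    "SELECT long_url FROM url WHERE tiny_url = ?", ("abc123",)
).fetchone()
```

Because the tiny URL is the primary key, the read path is a single indexed lookup.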

To scale the database as the data grows we can shard the table on the tiny URL key. Since this system is read-heavy, we can also replicate the shards across regions around the globe.
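
Sharding on the tiny URL key with consistent hashing can be sketched as a hash ring. The class name HashRing, the shard names, the use of SHA-256 for ring positions, and the number of virtual nodes per shard are assumptions chosen for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring that maps a tiny URL key to a shard name."""

    def __init__(self, shards, vnodes=100):
        # Each shard is placed on the ring at `vnodes` pseudo-random points
        # so that keys spread evenly across shards.
        self._ring = sorted(
            (self._hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def shard_for(self, tiny_url):
        # Walk clockwise to the first ring point after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect(self._points, self._hash(tiny_url)) % len(self._points)
        return self._ring[idx][1]


ring = HashRing(["shard-0", "shard-1", "shard-2"])
shard = ring.shard_for("abc123")  # always the same shard for the same key
```

The benefit over a plain hash-modulo scheme is that adding a shard moves only the keys that land on the new shard, instead of reshuffling almost every key.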


Detailed Component Design


Write Service: This is the main service that generates the tiny URL for a given long URL. We can either use a hash algorithm to generate the key or use a unique ID generator whose output is encoded in base64. For the hash we can use algorithms such as CRC-32, MD5, or SHA-1. Collisions are possible here, particularly if the hash is truncated to keep the key short. On a collision we can append some junk characters to the input and hash again until we get a unique key; if the padded string is what gets stored, those junk characters must be removed again on retrieval. We can also use a Bloom filter to check quickly whether a candidate key is already in use.
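
The hash-and-retry idea can be sketched as below, using MD5 from the list above. The function name make_key, the 7-character key length, the retry limit, and the dictionary standing in for the database are assumptions. The sketch stores the original long URL rather than the padded input, which is one way to avoid stripping the junk characters on retrieval.

```python
import hashlib

KEY_LENGTH = 7  # assumed key length; the design does not fix one


def make_key(long_url, existing, max_attempts=10):
    """Return a short key for long_url, re-hashing with padding on collision.

    `existing` maps key -> long URL and stands in for the database.
    """
    candidate = long_url
    for attempt in range(max_attempts):
        digest = hashlib.md5(candidate.encode("utf-8")).hexdigest()
        key = digest[:KEY_LENGTH]  # truncated hash, so collisions can happen
        stored = existing.get(key)
        if stored is None:
            existing[key] = long_url  # store the original, not the padded input
            return key
        if stored == long_url:
            return key  # same URL submitted again: reuse its key
        # Collision with a different URL: pad the input and hash again.
        candidate = f"{long_url}#{attempt}"
    raise RuntimeError("could not find a free key")
```

A Bloom filter could sit in front of the `existing.get(key)` check to rule out unused keys without a database round trip.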


The other approach is to use a unique ID generator and encode the ID in base64, so that the key is made of characters rather than only digits.
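
Encoding a numeric ID into a short key can be sketched as follows. Two assumptions are made here: the ID is treated as a number written in base 64 (a positional encoding, not the byte-oriented Base64 used for binary data), and the alphabet is the URL-safe one (A-Z, a-z, 0-9, "-", "_"), since the "+" and "/" of the standard alphabet would need escaping inside a URL. The function names are hypothetical.

```python
import string

# URL-safe 64-character alphabet (assumption; see the note above).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"
BASE = len(ALPHABET)  # 64


def encode_id(n):
    """Encode a non-negative integer ID as a short key."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))  # most significant digit first


def decode_key(key):
    """Inverse of encode_id: recover the integer ID from a key."""
    n = 0
    for ch in key:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

With this scheme a 7-character key covers 64^7 (about 4.4 trillion) IDs, and no collision handling is needed as long as the ID generator never repeats a value.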


Read Service: To get the long URL from the tiny URL with low latency we can put a cache in front of the database. Also, since it is a read-heavy system, we can run more instances of the read service. We can also have more database replicas per region to improve performance.
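
The read path (cache first, then database, then redirect) can be sketched as below. The TTLCache class, the resolve function, and the dictionary standing in for the database are assumptions for illustration; the 1-day TTL is the value used in the high-level design, and 302 is picked arbitrarily from the two redirect codes mentioned there.

```python
import time

ONE_DAY = 24 * 60 * 60  # TTL from the high-level design, in seconds


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, now=time.monotonic):
        self._ttl = ttl_seconds
        self._now = now  # injectable clock, eases testing
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._now() >= expires_at:
            del self._data[key]  # drop the expired entry
            return None
        return value

    def put(self, key, value):
        self._data[key] = (value, self._now() + self._ttl)


def resolve(tiny_url, cache, db):
    """Return (status, long_url): 302 on a hit, 404 if the key is unknown."""
    long_url = cache.get(tiny_url)
    if long_url is None:
        long_url = db.get(tiny_url)   # cache miss: fall back to the database
        if long_url is None:
            return 404, None
        cache.put(tiny_url, long_url)  # populate the cache for later reads
    return 302, long_url
```

In production the cache would normally be a shared service so that all read-service instances benefit from the same entries.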