Requirements
Functional Requirements:
- Create a short URL for a given long URL.
- Return the long URL associated with a given short URL.
Non-Functional Requirements:
- Low latency (<200 ms).
- High availability.
- Scalability (to support a large number of users and redirects).
API Design
// Create a short URL
POST /urls -> 201 - short URL
body {
  url,
  expirationDate
}
// Redirect to the original URL
GET /urls/{short_url} -> 302 - original URL
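The two endpoints above could behave roughly as in the following sketch. It is only an illustration under stated assumptions: the `Response` type, the in-memory `_STORE`, the `shortUrl` field name, the 400 and 404 responses, and treating `expirationDate` as an optional ISO-8601 string with a UTC offset are all hypothetical and not part of the design.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Response:
    status: int
    headers: dict
    body: dict


# Hypothetical in-memory stand-in for the DB: short code -> (long URL, expiry or None).
_STORE: dict = {}


def create_url(body: dict, short_code: str) -> Response:
    """POST /urls: store the mapping and answer 201 with the short URL."""
    url = body.get("url")
    if not url:
        # Assumption: a missing url is rejected with 400 (the design does not say).
        return Response(400, {}, {"error": "url is required"})
    # Assumption: expirationDate is optional and carries a UTC offset,
    # e.g. "2030-01-01T00:00:00+00:00".
    raw_expiry = body.get("expirationDate")
    expiry = datetime.fromisoformat(raw_expiry) if raw_expiry else None
    _STORE[short_code] = (url, expiry)
    return Response(201, {}, {"shortUrl": f"/urls/{short_code}"})


def redirect(short_code: str, now: Optional[datetime] = None) -> Response:
    """GET /urls/{short_url}: answer 302 with the original URL in Location."""
    now = now or datetime.now(timezone.utc)
    entry = _STORE.get(short_code)
    if entry is None or (entry[1] is not None and entry[1] <= now):
        # Assumption: unknown or expired short URLs answer 404.
        return Response(404, {}, {"error": "not found"})
    return Response(302, {"Location": entry[0]}, {})
```

The short code is passed in as a parameter here because, in the design, it is produced by a separate service.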
High-Level Design
The client sends its request to an AWS API Gateway, which routes it to the right service through a network load balancer. For a creation request, the shorter-service receives the parameters needed to create a new short URL. An external service then generates a hash from the timestamp and a global counter, which ensures that each generated URL is unique. The shorter-service builds the short URL from this hash, persists it in the DB, and returns 201 as the response.
On the other hand, for a redirect request, the url-getter-service looks up the short URL in the cache. Once the entry is found, it returns 302 with the original URL as the response.
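The hash step of the creation path can be sketched as follows. The design does not say which hash function the external service uses, so this example swaps in a simpler technique: it packs the millisecond timestamp and a counter into one integer and base62-encodes it. The names `ShortCodeGenerator`, `next_code`, and `counter_bits` are hypothetical, and the in-process counter only stands in for the global counter, which in a real deployment would have to be shared across instances.

```python
import threading
import time
from typing import Optional

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62(n: int) -> str:
    """Encode a non-negative integer using the 62-character alphabet."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class ShortCodeGenerator:
    """Hypothetical stand-in for the external hash service."""

    def __init__(self, counter_bits: int = 20):
        self._counter_bits = counter_bits
        self._counter = 0
        self._lock = threading.Lock()

    def next_code(self, timestamp_ms: Optional[int] = None) -> str:
        if timestamp_ms is None:
            timestamp_ms = int(time.time() * 1000)
        # The lock keeps the counter increment atomic across threads.
        with self._lock:
            count = self._counter
            self._counter += 1
        # Timestamp in the high bits, counter in the low bits. Codes stay
        # unique unless the counter wraps (2**counter_bits values) within
        # a single millisecond or the clock moves backwards.
        low = count % (1 << self._counter_bits)
        return base62((timestamp_ms << self._counter_bits) | low)
```

With the default 20 counter bits and a present-day timestamp, the resulting codes are about 11 characters long; a shorter code would need a smaller input space or a different scheme.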
Detailed Component Design
To keep the system highly available and scalable, we use an AWS Network Load Balancer, so millions of requests can be spread across the instances of our ECS services. This helps absorb traffic spikes, mostly on redirect requests. When the API Gateway receives a request, it forwards it to the load balancer, which routes it to an available instance.
The load balancer also helps keep latency low, working together with a caching strategy. We put a Redis cache between the url-getter-service and the DB. The cache is a key-value store that maps each short URL to its original URL, so it answers lookups much faster than the DB. To keep the cache up to date, whenever the url-getter-service does not find an entry in the cache, it looks it up in the PostgreSQL DB and, once found, writes it to the cache as well. Cache entries have a TTL that matches the expiration date set when the URL was created.
Because the load balancer distributes traffic across the service instances, the system can handle many reads and writes at the same time.
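The read path described above is commonly called cache-aside. A minimal sketch of it follows, assuming an in-memory `TTLCache` class as a stand-in for Redis and a plain dict as a stand-in for the PostgreSQL table; both, along with the `resolve` function and the use of Unix timestamps for expiry, are hypothetical illustrations rather than the actual services.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for Redis: each entry expires at an absolute time."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._data: Dict[str, Tuple[str, float]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at <= self._clock():
            # Expired entries are evicted lazily on read.
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: str, expires_at: float) -> None:
        self._data[key] = (value, expires_at)


def resolve(
    short_code: str,
    cache: TTLCache,
    db: Dict[str, Tuple[str, float]],
    clock: Callable[[], float] = time.time,
) -> Optional[str]:
    """Cache-aside lookup: try the cache, fall back to the DB, fill the cache."""
    cached = cache.get(short_code)
    if cached is not None:
        return cached
    # db maps short code -> (long URL, expiry as Unix time); stand-in for PostgreSQL.
    row = db.get(short_code)
    if row is None:
        return None
    long_url, expires_at = row
    if expires_at <= clock():
        return None
    # The cache entry expires at the same moment as the URL itself.
    cache.set(short_code, long_url, expires_at)
    return long_url
```

Tying the cache expiry to the URL's expiration date means an expired link cannot keep being served from the cache after the DB considers it invalid.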