Return the long URL associated with a given short URL.
Non-Functional Requirements:
High Availability
Low-latency redirects
API Design
POST /shortenedURL, passing the long URL in the body of the request
GET /<short_hash> returns a redirect to the long URL
High-Level Design
The application is hosted in a cross-region autoscaling group behind a load balancer (LB).
An SSL/TLS certificate attached to the LB provides encryption in transit; SSL terminates at the LB.
The application takes a URL, appends a random salt to it, takes an MD5 hash of the concatenated string, and then truncates the hash.
Each newly generated hash is first checked against the key/value db for an existing entry. If the hash already exists and the stored value is identical to the submitted URL, it's a duplicate, so just return the already generated hash. If the stored value is different, it's a collision: append a new random salt to the original URL and repeat the process.
The shortened hash is stored as the key, and the long URL as the value.
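The salt, hash, truncate, and check steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: a plain dict stands in for the key/value db, and the names short_hash, shorten, HASH_LENGTH, and max_attempts, as well as the 7-character length, are assumptions that do not come from these notes.

```python
import hashlib
import secrets

HASH_LENGTH = 7  # assumed truncation length; the notes do not specify one


def short_hash(long_url: str, salt: str, length: int = HASH_LENGTH) -> str:
    """MD5 the URL concatenated with the salt, then truncate the hex digest."""
    digest = hashlib.md5((long_url + salt).encode("utf-8")).hexdigest()
    return digest[:length]


def shorten(long_url, store, new_salt=lambda: secrets.token_hex(4), max_attempts=10):
    """Return a short hash for long_url, re-salting whenever a collision is found.

    `store` is any dict-like mapping of short hash -> long URL (a stand-in
    for the key/value db). `new_salt` is injectable so the sketch is testable.
    """
    for _ in range(max_attempts):
        key = short_hash(long_url, new_salt())
        existing = store.get(key)
        if existing is None:
            # Unused hash: store the mapping and return it.
            store[key] = long_url
            return key
        if existing == long_url:
            # Same hash, same URL: a duplicate, so reuse the existing hash.
            return key
        # Same hash, different URL: a collision, so loop with a new salt.
    raise RuntimeError("no free short hash found after several attempts")
```

Because the salt is random, two submissions of the same long URL will usually produce different hashes, so the duplicate branch only fires when the salted hashes happen to match.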
When a user does a GET with the short hash, the application looks up the hash from the URL in the key/value store and returns a 3xx redirect with the 'Location' header set to the stored value. If there is no value in the key/value store, it returns a 404 Not Found.
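The lookup-and-redirect logic could look roughly like this, independent of any web framework. The function name resolve and the dict stand-in for the key/value store are hypothetical, and 302 is only one possible choice of 3xx status; the notes do not say which one to use.

```python
from http import HTTPStatus
from typing import Dict, Tuple


def resolve(short_hash: str, store: Dict[str, str]) -> Tuple[int, Dict[str, str]]:
    """Return (status code, response headers) for GET /<short_hash>."""
    long_url = store.get(short_hash)
    if long_url is None:
        # Unknown hash: nothing to redirect to.
        return HTTPStatus.NOT_FOUND.value, {}
    # Known hash: redirect via the Location header (302 is an assumption).
    return HTTPStatus.FOUND.value, {"Location": long_url}
```

A 301 would let browsers cache the redirect and reduce load on the service, while a 302 keeps every hit going through the application; which one fits depends on whether per-click visibility matters.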
The key/value store is responsible for storing the shortened hashes as keys and the long URLs as values. This is essentially a hash table, where a lookup by key takes constant time on average, so latency would be low.
We could add read replicas of the key/value store to cope with scaling
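As a toy illustration of the read-replica idea, the sketch below sends writes to a primary and spreads reads across replicas in round-robin order. Everything here is an assumption made for illustration: the class name ReplicatedStore, the use of dicts as nodes, and the synchronous copy to replicas (real replication is typically asynchronous and can lag).

```python
import itertools


class ReplicatedStore:
    """Writes go to the primary; reads are spread across replicas."""

    def __init__(self, replica_count: int = 2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key: str, value: str) -> None:
        self.primary[key] = value
        # Copied synchronously here for simplicity only.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key: str):
        # Round-robin over replicas so no single node serves every read.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)
```

Since the workload is read-heavy and a stored mapping is not modified after creation, replica lag would mainly affect links that were created moments ago.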
HA at the db level would require a more complex distributed data layer spread over multiple regions, with a master writer elected from the group of writers.
Detailed Component Design
The client would be a lightweight web frontend with a text box you can paste a long URL into; it then returns the newly created short URL, e.g. https://<domain>/<short_hash>
The server-side application would have two endpoints:
POST /shortenedURL -> returns a string https://<domain>/<short_hash>
GET /<short_hash> -> returns a 3xx redirect response with the long URL in the 'Location' header, which makes the user's browser redirect to that long URL.
Cache layer -> the key/value store would be a Redis cache
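A thin wrapper around the Redis-backed store might look like the sketch below. It assumes a client object with get(key) and set(key, value, nx=True) methods, which is the shape the redis-py client exposes; ShortUrlStore and InMemoryClient are hypothetical names, and the in-memory client exists only so the sketch runs without a Redis server.

```python
class InMemoryClient:
    """Stand-in for a Redis client, mimicking get() and set(..., nx=True)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, nx=False):
        if nx and key in self._data:
            return None  # like SET ... NX when the key already exists
        self._data[key] = value
        return True


class ShortUrlStore:
    """Stores short hash -> long URL through a Redis-like client."""

    def __init__(self, client):
        self.client = client

    def lookup(self, short_hash):
        """Return the long URL, or None if the hash is unknown."""
        return self.client.get(short_hash)

    def put_if_absent(self, short_hash, long_url) -> bool:
        """Write the mapping only if the key is free; True when written."""
        # NX makes check-and-write a single atomic step on the server,
        # so two concurrent requests cannot both claim the same hash.
        return bool(self.client.set(short_hash, long_url, nx=True))
```

With redis-py installed, the same wrapper could be given a real client, for example one created with decode_responses=True so that lookups return strings rather than bytes.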