Define the APIs expected from the system. This is your chance to analyze and define the read and write paths so that you can come up with the high-level design...
for creating a short url:
POST /urls?alias=<custom alias, optional>
{
long_url:
}
response: {
status: 200 OK
short_url:
}
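for redirecting with a short url:
GET /urls/{short_url}
(the short url goes in the path, since a GET request should not carry a body)
response: {
status: 302 Found (temporary redirect)
Location: <original long url>
message: redirected to original url
}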
here status: 302 (temporary) is used instead of 301 (permanent). browsers do not cache a 302 by default, so every redirect still reaches our service; redirect speed comes from our own cache, since the db need not be queried every time.
cache entries are temporary, so the db is queried again after an entry is evicted, and the logs produced during redirects can be used for monitoring and analytics.
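a minimal sketch of the redirect path, assuming the short code arrives as a path segment and that plain dicts stand in for the cache and the db. handle_redirect and its parameters are hypothetical names used only for illustration:

```python
import logging
from typing import Dict, Optional, Tuple

logger = logging.getLogger("redirect")


def handle_redirect(
    short_code: str,
    cache: Dict[str, str],
    db: Dict[str, str],
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a redirect request.

    `cache` and `db` are in-memory stand-ins (assumptions) for the
    real cache and database.
    """
    long_url: Optional[str] = cache.get(short_code)
    if long_url is None:
        # Cache miss: fall back to the db.
        long_url = db.get(short_code)
        if long_url is None:
            return 404, {}
        # Populate the cache so the next lookup skips the db.
        cache[short_code] = long_url
    # Every redirect passes through here, so it can be logged for analytics.
    logger.info("redirect %s -> %s", short_code, long_url)
    # 302 is a temporary redirect; the target goes in the Location header.
    return 302, {"Location": long_url}
```

because the response is a 302, repeat clicks keep arriving at this handler, which is what makes the per-redirect log line usable for analytics.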
Describe the overall system architecture. Identify the main components needed to solve the problem end-to-end. Use the diagramming tool to create a block diagram.
components:
creator
consumer
url shortener service
redirect service
api-gateway for authentication, authorization, rate limiting and request routing
https for secure connections
load balancer (algorithm: least connections)
CDN for lower latency.
to ensure unique identifiers i'll store a counter in a store (redis) that is centrally accessible to all servers of the url-shortening service, and hash it for security.
the counter is incremented on every new url request (in case of failure, we'll have a redis cluster with replicas for failover).
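the counter idea above can be sketched as follows. this is only an illustration: InMemoryCounter is a hypothetical stand-in for the shared redis counter (redis's INCR command is atomic, which is what makes a shared counter safe across servers), base62 encoding is my assumption for turning the number into a short string, and the hashing step is left out:

```python
import itertools
import string

# 0-9, a-z, A-Z: 62 URL-safe characters (the alphabet choice is an assumption).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class InMemoryCounter:
    """Hypothetical stand-in for the shared counter kept in redis."""

    def __init__(self, start: int = 0) -> None:
        self._it = itertools.count(start + 1)

    def incr(self) -> int:
        # In the real design this would be one atomic increment on redis.
        return next(self._it)


def new_short_code(counter: InMemoryCounter) -> str:
    """Take the next counter value and turn it into a short code."""
    return base62_encode(counter.incr())
```

since each counter value is handed out once, two requests can never get the same short code, and 7 base62 characters already cover 62^7 (about 3.5 trillion) urls.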
for transient failures we'll retry with exponential backoff, with jitter added so that clients do not all retry at the same moment.
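a minimal sketch of retries with exponential backoff and jitter. retry_with_backoff and its parameters and defaults are hypothetical, and the "full jitter" variant (sleep a random time between 0 and the current backoff cap) is one common choice, not something the design fixes:

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    op: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `op` until it succeeds or `max_attempts` is reached."""
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The cap doubles on each attempt: base, 2*base, 4*base, ...
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: pick a random delay in [0, cap].
            sleep(random.uniform(0, cap))
    raise RuntimeError("max_attempts must be at least 1")
```

the `sleep` parameter is injectable only so the function can be exercised without real waiting.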
if one of the services keeps failing, we'll use a circuit breaker so callers fail fast instead of piling up requests, which avoids cascading failures.
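a small sketch of the circuit breaker idea, under my own assumptions: the class name, the threshold of consecutive failures, and the reset timeout are all hypothetical, and a production breaker would also need thread safety:

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is not attempted."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, op: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast without touching the downstream service.
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = op()
        except Exception:
            self._failures += 1
            half_open = self._opened_at is not None
            if half_open or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # (re)open the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

a caller would wrap each downstream call, e.g. `breaker.call(lambda: lookup(short_url))`, where `lookup` is whatever function talks to the failing service.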
separate resource pools per service, i.e. the bulkhead pattern, to avoid one service impacting another.
to avoid race conditions i'll use locking on the db
cache
db for long-url to short-url mapping and other details
for the redirect service, we'll first check if the long url for that short url is already present in the CDN; if yes, we return the data from there and redirect to the right place
otherwise, the CDN pulls the data from the origin server, updates itself and then returns the requested url for the redirect
if a redirect fails we analyze the logs using the analytics service and use the findings to improve the services
for caching we will use a redis cache and cache details like the short url to long url mapping, and the alias if one is specified
in db we store details like user info, short url, long url, alias if provided.
Deep dive into 2-3 key components. Explain how they work, how they scale, discuss tradeoffs, capacity, and any relevant algorithms or data structures.
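to allow horizontal scaling under high read (redirect) traffic, we put the servers behind a load balancer
to reduce latency we will use read-through caching for reads
for writes, we will use write-through caching, so the cache is updated together with the db. note that write-through keeps cache and db consistent but does not make writes faster; if faster writes with eventual consistency are the goal, write-back (write-behind) caching is the strategy that matches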
we will use LRU cache eviction strategy
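for monitoring and analytics, we will return response 302 for redirects instead of 301, since a 301 can be cached by the browser and later clicks would then bypass our service and not be logged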
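the read-through caching with LRU eviction mentioned above can be sketched like this. ReadThroughLRUCache and the `loader` callback are hypothetical names, and the in-process OrderedDict is only a stand-in for the real cache:

```python
from collections import OrderedDict
from typing import Callable, Generic, Hashable, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class ReadThroughLRUCache(Generic[K, V]):
    """On a miss the cache itself loads the value from the backing store."""

    def __init__(self, capacity: int, loader: Callable[[K], V]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._loader = loader  # e.g. a db lookup by short url
        self._data: "OrderedDict[K, V]" = OrderedDict()

    def get(self, key: K) -> V:
        if key in self._data:
            # Hit: mark the entry as most recently used.
            self._data.move_to_end(key)
            return self._data[key]
        # Miss: read through to the backing store, then cache the result.
        value = self._loader(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            # Evict the least recently used entry (front of the dict).
            self._data.popitem(last=False)
        return value
```

LRU fits redirects because a small set of popular short urls tends to receive most of the traffic, so recently used entries are the ones most likely to be requested again.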