System requirements
Functional:
- user submits long url, gets short url back
- user navigates to long url, gets redirected to short url
- track analytics for clicked links
- custom short url value
out of scope:
- editing previous redirects
- authentication
- display analytics
Non-Functional:
List non-functional requirements for the system...
- high availability >> consistency
- fast creation and resolution of redirects, < 100ms
- scalable to handle surges in traffic and growth
Capacity estimation
dau
- 500M internet users
- 1% user url shorteners
- 20% market share
- 1 url shortened per day
- 1M DAU for writes
- 10 tps
- 10:1 read/write ratio
- 100 qps
storage
- 1M redirects saved per day
- 1K for redirect & short url etc
- 1GB data per day
- *2 for redundancy
- *2 for backups
- 400 day/year
- 1.6 TB per year
API design
create redirect:
POST /api/v1/redirect/create
body: {
long_url: string,
short_path?: string,
}
response: {
short_path: string,
}
error: 409, short_path already exists
GET /:short_path
=> 302 temporary redirect to long url
error => 404 not found
note: 302 instead of 301 for better analytics. trade-off higher load
Database design
short_url
- short_path: string (primary key)
- long_url: string
- click_count: int
High-level design
You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design...
Request flows
create redirect:
- POST /api/v1/redirect/create
- API Gateway -> LB -> Shorten Service
- generate short_path if not provided
- write to DB if not exist
- success?
- reply to user
- fail?
- error message to user if custom short_path
- retry N times if generated short_path, else error message to user
consume redirect
- GET /:short_path
- AG -> LB -> Redirect Service
- aggregate click count
Detailed component design
unique short_url generation:
- use lower case + digits for easy typing: 26+10 = base 36
- 8 chars at base 36 will give about 3T possibilities
- 3M days at 1M per day
- => low chance of conflict and retries
- can ensure uniqueness across nodes by generating from:
- 8 bits for server id
- 40 bits for timestamp
- 16 bits for increment reset every second
- => convert to base 36
scale write speeds:
- for better write speeds can partition or shard DB on :short_path
- when sharding, we can partition by hash ranges to avoid short_path conflicts across shards
scale read speeds:
- read replicas
- cache for redirect service
click aggregator:
- send click info to durable stream like kafka
- worker aggregates clicks for short urls and batch updates click_count in db
Trade offs/Tech choices
DB
- postgres for DB, optimized for read vs writes.
- key-value store like DynamoDB could've also easily worked
- SQL allows transactions or more complex queries for future updates
- data is highly structured and app is read-heavy, also suitable for b-tree DB
hash
- only lower case. upper case would give base 62 and allow for shorter urls
- but only 1 character shorter and much worse UX for user, especially mobile bc random uppercase chars can be tedious to type
Availability >> Consistency for reads
- eventual consistency is acceptable for handling redirects
- unavailable service would have much worse UX than temporarily stale data
Consistency >> Availability for writes
- not acceptable to ignore :short_path conflict
- users should have confidence that their :short_path wasn't reused by other users
- reads are eventually consistent, which is a good trade-off since it allows higher availability.
Failure scenarios/bottlenecks
- write speeds can be a bottle neck
- prioritizing availability over consistency, users can get temporarily stale data
Future improvements
- can improve write speeds with partitioning
- allow editing urls
- more advanced analytics