System requirements
Functional:
- Create short URL
- Resolve a short URL and redirect to the full URL
- Collect metrics for each short URL created
Non-Functional:
- Unique short codes generated with base62 encoding
- Low-latency in-memory cache for short URLs
- Short URLs are persisted asynchronously, so the code is returned to the client as soon as possible
- Use a load balancer to spread traffic across multiple instances, adding instances for high availability once the existing ones reach maximum load
- Use an API gateway to rate-limit clients before requests hit the backend
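The base62 requirement can be sketched as below. This is a minimal illustration, assuming the code is derived from a non-negative integer counter; the alphabet ordering and the names `base62_encode` and `base62_decode` are assumptions, not part of the design.

```python
import string

# 62 symbols: digits, then lowercase, then uppercase (the ordering is an assumption).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def base62_encode(number: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def base62_decode(code: str) -> int:
    """Decode a base62 string back to the integer it was built from."""
    number = 0
    for char in code:
        # str.index raises ValueError for characters outside the alphabet.
        number = number * BASE + ALPHABET.index(char)
    return number
```

Each integer maps to exactly one string, so codes are unique as long as the integers are. Seven characters cover 62^7 (about 3.5 trillion) distinct codes.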
API design
- POST /short-url -> creates the short URL; synchronous request
- GET /:code -> looks up a previously created short URL and redirects to the full URL
- GET /statistics/:code -> provides a statistics report for a short-URL code (number of accesses, ...)
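The three endpoints can be pictured with a small in-memory dispatcher. This is a sketch under assumptions: the `handle` function, the `url` request field, the `read_count` response field, the 302 redirect status and the 404 responses are all hypothetical choices, and the placeholder code generation is not the real generator.

```python
import json
from typing import Dict, Tuple

# Hypothetical in-memory stores standing in for the cache and the metrics backend.
URLS: Dict[str, str] = {}
HITS: Dict[str, int] = {}

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)
JSON_HEADERS = {"Content-Type": "application/json"}


def handle(method: str, path: str, body: str = "") -> Response:
    """Dispatch a request to one of the three endpoints."""
    if method == "POST" and path == "/short-url":
        full_url = json.loads(body)["url"]  # field name is an assumption
        code = f"c{len(URLS) + 1}"          # placeholder, not the real generator
        URLS[code] = full_url
        HITS[code] = 0
        return 202, JSON_HEADERS, json.dumps({"code": code})

    if method == "GET" and path.startswith("/statistics/"):
        code = path[len("/statistics/"):]
        if code not in URLS:
            return 404, {}, ""
        return 200, JSON_HEADERS, json.dumps({"code": code, "read_count": HITS[code]})

    if method == "GET" and len(path) > 1 and "/" not in path[1:]:
        code = path[1:]
        if code not in URLS:
            return 404, {}, ""
        HITS[code] += 1  # count the access before redirecting
        return 302, {"Location": URLS[code]}, ""

    return 404, {}, ""
```

Whether the redirect should be 301 or 302 is left open by the design; 302 is used here only because it keeps browsers from caching the redirect, so every access can be counted.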
High-level design
- The create endpoint generates a unique short code, saves it in Redis (code as key, full URL as value), and returns it as soon as possible. A Redis INCR counter is used as the seed, so codes are globally unique.
- Persistence runs in the background: creation events go through a queue and are written to the database in batches.
- The latest created short URLs are saved in the cache, the key is based
- The read endpoint checks the Redis cache first, using the idempotent key. On a cache miss, it falls back to the database.
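The create and read flows above can be sketched as follows. `FakeRedis` is a hypothetical in-memory stand-in that only mirrors the `incr`, `set` and `get` method names of a Redis client; the key prefixes, the `deque` used as the queue and the `dict` used as the database are likewise assumptions for illustration.

```python
import string
from collections import deque
from typing import Deque, Dict, Optional, Tuple

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62(number: int) -> str:
    """Encode a non-negative integer in base62."""
    out = ""
    while True:
        number, remainder = divmod(number, 62)
        out = ALPHABET[remainder] + out
        if number == 0:
            return out


class FakeRedis:
    """In-memory stand-in for a Redis client (assumption, not a real client)."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def incr(self, key: str) -> int:
        value = int(self.data.get(key, "0")) + 1
        self.data[key] = str(value)
        return value

    def set(self, key: str, value: str) -> None:
        self.data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)


def create_short_url(cache: FakeRedis, queue: Deque[Tuple[str, str]], full_url: str) -> str:
    """Generate a code from the counter, cache it, and enqueue it for persistence."""
    code = base62(cache.incr("counter:short_url"))
    cache.set(f"url:{code}", full_url)
    queue.append((code, full_url))  # written to the database later, in batches
    return code


def resolve(cache: FakeRedis, database: Dict[str, str], code: str) -> Optional[str]:
    """Read from the cache first; on a miss, read the database and refill the cache."""
    key = f"url:{code}"
    full_url = cache.get(key)
    if full_url is None:
        full_url = database.get(code)
        if full_url is not None:
            cache.set(key, full_url)
    return full_url
```

The counter key and the URL keys use different prefixes so that a generated code can never collide with the counter itself.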
Detailed component design
- code-generator-service: used in the create-URL flow. Generates a unique code, saves it in Redis, and returns a 202 status code with the short URL.
- url-writer-service: listens to a pub/sub topic, pulls messages in batches, and writes them to Postgres in batches.
- url-query-service: reads from Redis by code; on a cache miss, fetches the data from the Postgres replica holding the rows written by url-writer-service.
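The batching behaviour of url-writer-service could look roughly like this. It is a sketch only: a standard-library `queue.Queue` stands in for the pub/sub subscription, and `write_batch` is a hypothetical callable that would perform one multi-row insert into Postgres.

```python
import queue
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (code, full_url); the message shape is an assumption


def pull_batch(source: "queue.Queue[Message]", max_size: int) -> List[Message]:
    """Pull up to max_size messages without blocking."""
    batch: List[Message] = []
    while len(batch) < max_size:
        try:
            batch.append(source.get_nowait())
        except queue.Empty:
            break
    return batch


def drain(
    source: "queue.Queue[Message]",
    write_batch: Callable[[List[Message]], None],
    max_size: int = 100,
) -> int:
    """Write everything currently queued, one batch at a time; return the row count."""
    written = 0
    while True:
        batch = pull_batch(source, max_size)
        if not batch:
            return written
        write_batch(batch)  # one database round trip per batch
        written += len(batch)
```

A real service would also need acknowledgement and retry handling for failed batches, which this sketch leaves out.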
Database design
Postgres database to provide long-term history of short URLs:
- Table short_urls
  - id int sequential (PK)
  - full_url text not null
  - code varchar(20) not null -> btree index
  - reference_id varchar(50) -> unique idempotency key
  - expires_at timestamp nullable -> expired rows are ignored on read
  - created_at
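How `expires_at` filters reads can be sketched with SQLite as a stand-in for Postgres, so the example runs without a server. The column types are therefore approximations, timestamps are stored as UTC ISO-8601 text, and the helper names (`open_db`, `insert`, `lookup`) are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional


def open_db() -> sqlite3.Connection:
    """Create an in-memory short_urls table (SQLite stand-in, not Postgres)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE short_urls (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            full_url TEXT NOT NULL,
            code VARCHAR(20) NOT NULL,
            reference_id VARCHAR(50) UNIQUE,
            expires_at TEXT,
            created_at TEXT NOT NULL
        )
        """
    )
    conn.execute("CREATE INDEX idx_short_urls_code ON short_urls (code)")
    return conn


def ts(moment: datetime) -> str:
    """Format a datetime as UTC text that sorts chronologically."""
    return moment.astimezone(timezone.utc).isoformat(timespec="seconds")


def insert(conn: sqlite3.Connection, code: str, full_url: str,
           now: datetime, expires_at: Optional[datetime] = None) -> None:
    conn.execute(
        "INSERT INTO short_urls (full_url, code, expires_at, created_at) "
        "VALUES (?, ?, ?, ?)",
        (full_url, code, ts(expires_at) if expires_at else None, ts(now)),
    )


def lookup(conn: sqlite3.Connection, code: str, now: datetime) -> Optional[str]:
    """Return the full URL unless the row is missing or already expired."""
    row = conn.execute(
        "SELECT full_url FROM short_urls "
        "WHERE code = ? AND (expires_at IS NULL OR expires_at > ?)",
        (code, ts(now)),
    ).fetchone()
    return row[0] if row else None
```

Rows with a null `expires_at` never expire; in Postgres the same predicate would compare a timestamp column against the current time.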
Postgres database to provide complete report history:
- Table urls
  - id int sequential (PK)
  - code varchar(20) not null -> btree index
  - read_count bigint default 0
  - created_at
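Updating `read_count` can be done with a single in-database increment, which avoids a read-modify-write race between instances. The sketch below again uses SQLite as a stand-in for Postgres; the helper names (`open_db`, `add_url`, `record_read`, `get_read_count`) are hypothetical.

```python
import sqlite3
from typing import Optional


def open_db() -> sqlite3.Connection:
    """Create an in-memory urls table (SQLite stand-in, not Postgres)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE urls (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            code VARCHAR(20) NOT NULL,
            read_count BIGINT DEFAULT 0,
            created_at TEXT
        )
        """
    )
    conn.execute("CREATE INDEX idx_urls_code ON urls (code)")
    return conn


def add_url(conn: sqlite3.Connection, code: str) -> None:
    """Register a code; read_count starts at its default of 0."""
    conn.execute("INSERT INTO urls (code) VALUES (?)", (code,))


def record_read(conn: sqlite3.Connection, code: str, count: int = 1) -> bool:
    """Add count to read_count in one statement; return False if the code is unknown."""
    cursor = conn.execute(
        "UPDATE urls SET read_count = read_count + ? WHERE code = ?",
        (count, code),
    )
    return cursor.rowcount == 1


def get_read_count(conn: sqlite3.Connection, code: str) -> Optional[int]:
    row = conn.execute(
        "SELECT read_count FROM urls WHERE code = ?", (code,)
    ).fetchone()
    return row[0] if row else None
```

The `count` parameter allows accesses to be accumulated elsewhere and flushed in one update, in the same spirit as the batched writes used for creation.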