TinyURL and Race Conditions
Introduction
A URL shortener looks simple until multiple requests arrive at the same time. The core problem is concurrency: two workers may both think the same short code is free, or both try to create a mapping for the same long URL, and the loser of that race can corrupt data unless the storage layer enforces uniqueness.
Where the Race Condition Appears
A typical shortener does three things:
- decide whether a long URL already has a short code
- generate or reserve a new code if needed
- save the mapping
The race happens when two requests interleave those steps. If both requests check first and insert second, they can both conclude that the code or URL is unused.
This pseudo-flow is unsafe:
- request A checks whether code abc123 exists
- request B checks whether code abc123 exists
- both see nothing
- both insert abc123
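The interleaving above can be reproduced deterministically in a few lines. This is only a sketch: the `links` table, its columns, and the helper names are assumptions made for illustration, and the table deliberately has no unique constraint so the damage is visible.

```python
import sqlite3

# Hypothetical schema with NO unique constraint on code, to expose the race.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (code TEXT, long_url TEXT)")


def code_exists(code):
    """Return True if any row already uses this short code."""
    row = conn.execute("SELECT 1 FROM links WHERE code = ?", (code,)).fetchone()
    return row is not None


def insert(code, long_url):
    conn.execute("INSERT INTO links (code, long_url) VALUES (?, ?)", (code, long_url))


# Interleave the two requests by hand: both check before either inserts.
a_sees_free = not code_exists("abc123")  # request A checks
b_sees_free = not code_exists("abc123")  # request B checks

if a_sees_free:
    insert("abc123", "https://example.com/a")  # request A inserts
if b_sees_free:
    insert("abc123", "https://example.com/b")  # request B inserts

rows = conn.execute("SELECT long_url FROM links WHERE code = 'abc123'").fetchall()
print(len(rows))  # 2: one short code now points at two different URLs
```

In production the interleaving comes from real concurrency rather than hand-ordered statements, but the outcome is the same.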
Even if your code generator is good, the check-then-insert sequence is still vulnerable under concurrency.
The Correct Fix: Let the Database Enforce Uniqueness
The database should own uniqueness, not application timing. Create unique constraints on the short code and, if you want one canonical short link per long URL, on the long URL as well.
The runnable Python example below uses SQLite to show the pattern. It relies on a unique index and retries when the chosen code collides.
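A minimal sketch of that pattern follows. The table name `links`, the index name, and the helpers `open_db`, `random_code`, and `shorten` are assumptions chosen for illustration. The `make_code` parameter exists only so the demo at the bottom can force a collision.

```python
import secrets
import sqlite3
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links ("
        " code TEXT NOT NULL,"
        " long_url TEXT NOT NULL)"
    )
    # The unique index is what actually prevents duplicate codes.
    conn.execute("CREATE UNIQUE INDEX IF NOT EXISTS idx_links_code ON links(code)")
    return conn


def random_code(length=7):
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def shorten(conn, long_url, make_code=random_code, max_attempts=5):
    """Insert first; treat a uniqueness violation as a signal to retry."""
    for _ in range(max_attempts):
        code = make_code()
        try:
            with conn:  # commits on success, rolls back on exception
                conn.execute(
                    "INSERT INTO links (code, long_url) VALUES (?, ?)",
                    (code, long_url),
                )
            return code
        except sqlite3.IntegrityError:
            continue  # this code is already taken; pick a new one
    raise RuntimeError("could not allocate a unique code")


# Demo: force the second request to collide once, then succeed.
conn = open_db()
codes = iter(["abc123", "abc123", "xyz789"])
first = shorten(conn, "https://example.com/a", make_code=lambda: next(codes))
second = shorten(conn, "https://example.com/b", make_code=lambda: next(codes))
print(first, second)  # abc123 xyz789
```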
The important point is not SQLite. The important point is that the insert is authoritative. If another request wins the race, the database rejects the conflicting insert and your code recovers cleanly.
Choosing an ID Strategy
There are several safe approaches:
- random codes with a unique constraint and retry on collision
- numeric IDs from a sequence, then encode them in base62
- deterministic mapping from long URL to a stable identifier
Sequence-based IDs are easy to reason about because the database guarantees uniqueness. Random codes distribute well and hide volume better, but they still need collision handling. Deterministic hashing can work, but collisions and canonicalization rules become your responsibility.
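For the sequence-based option, the encoding step might look like the sketch below. The digit order of `ALPHABET` is an assumption, since any fixed 62-character alphabet works as long as the encoder and decoder agree. The function names are illustrative.

```python
import string

# Assumed digit order: 0-9, then a-z, then A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(n):
    """Turn a non-negative integer id into a base62 string."""
    if n < 0:
        raise ValueError("id must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first


def base62_decode(code):
    """Inverse of base62_encode."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


print(base62_encode(125))  # 21
```

Because the integer comes from a database sequence, two requests never receive the same id, so the encoded codes cannot collide.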
The Real Design Question
The hard question is often not “how do I generate a code” but “do I want the same long URL to always return the same short URL.” If yes, make long_url unique and return the existing row when it is already present. If no, allow duplicate long URLs and only enforce uniqueness on the short code.
That business rule changes the schema and the race-condition handling.
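If the answer is yes, the insert can fail for two different reasons, and the recovery differs for each. The sketch below assumes a hypothetical `links` table with both columns unique and a helper named `get_or_create`. It is one way to handle the race, not the only one.

```python
import secrets
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links ("
        " code TEXT NOT NULL UNIQUE,"
        " long_url TEXT NOT NULL UNIQUE)"
    )
    return conn


def get_or_create(conn, long_url, max_attempts=5):
    """Return the one canonical code for long_url, creating it if needed."""
    for _ in range(max_attempts):
        code = secrets.token_urlsafe(5)  # short random URL-safe string
        try:
            with conn:  # commits on success, rolls back on exception
                conn.execute(
                    "INSERT INTO links (code, long_url) VALUES (?, ?)",
                    (code, long_url),
                )
            return code
        except sqlite3.IntegrityError:
            # Either the URL already has a code, or the random code collided.
            row = conn.execute(
                "SELECT code FROM links WHERE long_url = ?", (long_url,)
            ).fetchone()
            if row is not None:
                return row[0]  # another request won the race; reuse its code
            # Otherwise the code collided; loop and try a new one.
    raise RuntimeError("could not allocate a unique code")


conn = open_db()
a = get_or_create(conn, "https://example.com/page")
b = get_or_create(conn, "https://example.com/page")
print(a == b)  # True
```

If duplicates are allowed instead, drop the `UNIQUE` on `long_url` and the lookup branch, leaving only the retry on code collision.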
Common Pitfalls
The classic mistake is solving concurrency with an in-memory lock in one application process. That works only until you run multiple instances, because each process holds its own lock and none of them sees the others. The database constraint is the shared authority that every worker already goes through.
Another mistake is checking first and assuming the later insert is safe. That turns the race into a timing problem instead of a data-integrity rule.
Teams also forget URL normalization. https://example.com/docs and https://example.com/docs/ may or may not be the same URL for your product. Decide before enforcing uniqueness on the original string.
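One way to pin that decision down is a small normalization function applied before the uniqueness check. The rules below are assumptions for illustration, not a recommendation: lowercase the scheme and host, treat an empty path as `/`, and drop the fragment. Trailing slashes on non-empty paths are left alone, because that is exactly the kind of product decision that has to be made explicitly.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Apply a small, explicit set of normalization rules to a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Assumes URLs carry no userinfo (user:password@), which is case-sensitive.
    netloc = parts.netloc.lower()
    path = parts.path or "/"  # an empty path and "/" name the same resource
    # Policy choice: the fragment is dropped because it is never sent to the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


print(normalize("HTTPS://Example.com"))             # https://example.com/
print(normalize("https://example.com/docs/"))       # https://example.com/docs/
print(normalize("https://example.com/a?x=1#frag"))  # https://example.com/a?x=1
```

Storing the normalized string in the unique column makes the constraint enforce the product rule rather than byte-for-byte equality.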
Summary
- URL shorteners are concurrency problems as much as they are encoding problems.
- The unsafe pattern is check first, then insert without a uniqueness constraint.
- Put unique constraints in the database and treat insert failures as expected outcomes.
- Choose whether one long URL maps to one short URL or many before designing the schema.
- Application-level locks are not enough once the system runs on multiple processes or servers.

