Algorithm for finding the busiest period?
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
Finding the busiest period means identifying the time range with the highest concurrent activity, such as people in a store, active sessions, or requests in flight. This is a classic event-sweep problem that can be solved efficiently with sorted entry and exit events. A correct implementation must also define tie handling rules at identical timestamps.
Problem Model
Assume each record contains:
- timestamp
- count change
- direction or event type
Equivalent interval form can also be used, where each interval contributes one enter event and one exit event.
Goal:
- compute maximum active count
- report one busiest time point or busiest interval
Sweep-Line Algorithm
Core idea:
- convert raw data into signed events
- sort events by time
- traverse events and maintain running active count
- track maximum seen count and corresponding time
This avoids expensive nested comparisons and scales well.
Python Implementation for Event Logs
The function below handles logs where each row has timestamp, count, and is_entry flag.
The output gives a timestamp and peak active count.
Handling Full Busiest Interval, Not Just Point
Sometimes you need time range where occupancy remains maximal, not only first timestamp reaching maximum.
Approach:
- while scanning events, capture current count between current timestamp and next timestamp
- if current count equals max, merge that segment into busiest range set
This is useful for dashboards that display busiest window rather than single time point.
Tie-Breaking Rules Matter
When multiple events share same timestamp, ordering decisions affect counts.
Common conventions:
- exits before entries for half-open interval semantics
- entries before exits for closed-interval semantics
Choose one and keep it consistent across analytics and reporting code. Inconsistent tie rules create off-by-one occupancy discrepancies between services.
Complexity
With n events:
- sorting cost is O of n log n
- scan cost is O of n
- memory is O of n for event list
For very large streams, you can process in batches or use external sorting if events do not fit memory.
Streaming and Incremental Variants
If events arrive continuously, maintain ordered event structure or process in timestamp windows.
Practical streaming pattern:
- buffer events by watermark
- sort within completed window
- compute local peaks
- merge to global peak tracker
This balances latency and correctness in near real-time systems.
Data Quality Checks Before Processing
Validate input before algorithm:
- nonnegative counts
- known direction values
- monotonic timestamp assumptions if expected
- no impossible negative occupancy after applying events
Rejecting bad records early avoids misleading busiest-period results.
Alternative Interval Input Form
If input is interval list with start and end times, convert each interval to plus-one and minus-one events.
This is mathematically equivalent to occupancy event logs.
Common Pitfalls
A common pitfall is ignoring same-timestamp tie ordering and getting inconsistent peak counts. Another is treating all deltas as positive and forgetting to subtract exits. Teams also often skip input validation and allow negative occupancy states that should be impossible. Finally, reporting only timestamp of first peak can hide long sustained busiest windows needed by business users.
Summary
- Busiest period detection is a sweep-line event accumulation problem.
- Convert events to signed deltas, sort by time, and track running maximum.
- Define and document tie handling for identical timestamps.
- Extend algorithm to return peak intervals when required.
- Validate data quality so occupancy metrics remain trustworthy.

