algorithm
data analysis
busiest period
computing
time series

Algorithm for finding the busiest period?

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Finding the busiest period means identifying the time range with the highest concurrent activity, such as people in a store, active sessions, or requests in flight. This is a classic event-sweep problem that can be solved efficiently with sorted entry and exit events. A correct implementation must also define tie handling rules at identical timestamps.

Problem Model

Assume each record contains:

  • timestamp
  • count change
  • direction or event type

Equivalent interval form can also be used, where each interval contributes one enter event and one exit event.

Goal:

  • compute maximum active count
  • report one busiest time point or busiest interval

Sweep-Line Algorithm

Core idea:

  1. convert raw data into signed events
  2. sort events by time
  3. traverse events and maintain running active count
  4. track maximum seen count and corresponding time

This avoids expensive nested comparisons and scales well.

Python Implementation for Event Logs

The function below handles logs where each row has timestamp, count, and is_entry flag.

python
1from typing import List, Tuple
2
3Record = Tuple[int, int, int]  # (timestamp, count, is_entry)
4
5
6def busiest_period(records: List[Record]) -> Tuple[int, int]:
7    events = []
8
9    for ts, count, is_entry in records:
10        delta = count if is_entry == 1 else -count
11        events.append((ts, delta))
12
13    # For equal timestamps, process exits before entries for half-open semantics.
14    events.sort(key=lambda x: (x[0], x[1]))
15
16    active = 0
17    best = float("-inf")
18    best_time = -1
19
20    for ts, delta in events:
21        active += delta
22        if active > best:
23            best = active
24            best_time = ts
25
26    return best_time, best
27
28
29sample = [
30    (1487799425, 14, 1),
31    (1487799425, 4, 0),
32    (1487799425, 2, 0),
33    (1487800378, 10, 1),
34    (1487801478, 18, 0),
35    (1487801478, 18, 1),
36    (1487901013, 1, 0),
37    (1487901211, 7, 1),
38    (1487901211, 7, 0),
39]
40
41print(busiest_period(sample))

The output gives a timestamp and peak active count.

Handling Full Busiest Interval, Not Just Point

Sometimes you need time range where occupancy remains maximal, not only first timestamp reaching maximum.

Approach:

  • while scanning events, capture current count between current timestamp and next timestamp
  • if current count equals max, merge that segment into busiest range set

This is useful for dashboards that display busiest window rather than single time point.

Tie-Breaking Rules Matter

When multiple events share same timestamp, ordering decisions affect counts.

Common conventions:

  • exits before entries for half-open interval semantics
  • entries before exits for closed-interval semantics

Choose one and keep it consistent across analytics and reporting code. Inconsistent tie rules create off-by-one occupancy discrepancies between services.

Complexity

With n events:

  • sorting cost is O of n log n
  • scan cost is O of n
  • memory is O of n for event list

For very large streams, you can process in batches or use external sorting if events do not fit memory.

Streaming and Incremental Variants

If events arrive continuously, maintain ordered event structure or process in timestamp windows.

Practical streaming pattern:

  • buffer events by watermark
  • sort within completed window
  • compute local peaks
  • merge to global peak tracker

This balances latency and correctness in near real-time systems.

Data Quality Checks Before Processing

Validate input before algorithm:

  • nonnegative counts
  • known direction values
  • monotonic timestamp assumptions if expected
  • no impossible negative occupancy after applying events

Rejecting bad records early avoids misleading busiest-period results.

Alternative Interval Input Form

If input is interval list with start and end times, convert each interval to plus-one and minus-one events.

python
1def busiest_from_intervals(intervals):
2    events = []
3    for start, end in intervals:
4        events.append((start, 1))
5        events.append((end, -1))
6    events.sort(key=lambda x: (x[0], x[1]))
7
8    active = 0
9    best = 0
10    best_time = None
11    for t, d in events:
12        active += d
13        if active > best:
14            best = active
15            best_time = t
16    return best_time, best

This is mathematically equivalent to occupancy event logs.

Common Pitfalls

A common pitfall is ignoring same-timestamp tie ordering and getting inconsistent peak counts. Another is treating all deltas as positive and forgetting to subtract exits. Teams also often skip input validation and allow negative occupancy states that should be impossible. Finally, reporting only timestamp of first peak can hide long sustained busiest windows needed by business users.

Summary

  • Busiest period detection is a sweep-line event accumulation problem.
  • Convert events to signed deltas, sort by time, and track running maximum.
  • Define and document tie handling for identical timestamps.
  • Extend algorithm to return peak intervals when required.
  • Validate data quality so occupancy metrics remain trustworthy.

Course illustration
Course illustration

All Rights Reserved.