Requirements


  • Functional Requirements
    • Apply a rate limiting policy per chosen entity (e.g. API key or IP)
    • Allow the policy to be changed through configuration
    • Enforce the configured limits on incoming requests
  • Non-Functional Requirements
    • Scalability
    • Availability
    • Low Latency


API Design

  • /admin/configure with payload {policy}
    • set or update the rate limiting policy
  • fn handle_request(req)
    • handle an incoming request and return an allow/deny decision




High-Level Design

1. Client

  • Sends API requests with an API key
  • No logic here, just the source of traffic

2. API Rate Limiter Layer (distributed)

  • Multiple instances (horizontally scalable)
  • Acts like a gateway / middleware
  • Responsibilities:
    • Extract API key
    • Determine responsible shard (via hashing)
    • Check rate limit policy
    • Either:
      • forward request to backend
      • or reject with HTTP 429
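
The responsibilities above can be sketched as a single request handler. This is a minimal illustration, not the actual gateway: the header name X-API-Key, the 401 response for a missing key, and the pick_shard / check_limit / forward callables are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass
class Response:
    status: int
    body: str


def extract_api_key(headers: Mapping[str, str]) -> Optional[str]:
    """Return the API key from the headers, or None if it is absent."""
    # The header name "X-API-Key" is an assumption, not part of the design.
    for name, value in headers.items():
        if name.lower() == "x-api-key" and value:
            return value
    return None


def handle_request(
    headers: Mapping[str, str],
    pick_shard: Callable[[str], str],         # hypothetical: API key -> shard name
    check_limit: Callable[[str, str], bool],  # hypothetical: (shard, key) -> allowed?
    forward: Callable[[], Response],          # hypothetical: call the backend
) -> Response:
    api_key = extract_api_key(headers)
    if api_key is None:
        # Rejecting keyless requests with 401 is an assumption.
        return Response(401, "missing API key")
    shard = pick_shard(api_key)
    if not check_limit(shard, api_key):
        return Response(429, "rate limit exceeded")
    return forward()
```

Passing the shard lookup and limit check in as callables keeps the handler independent of the hashing scheme and of the state store.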

3. Sharding / Routing Logic

  • Uses consistent hashing on API key
  • Maps each API key → specific limiter/Redis shard

This ensures:

  • the same key always hits the same shard (while shard membership is unchanged)
  • no global coordination is needed
  • shards can be added to scale horizontally
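
A minimal sketch of the key-to-shard mapping, assuming a hash ring with virtual nodes. The class name HashRing, the shard names, and the choice of SHA-256 with 100 virtual nodes per shard are illustrative assumptions, not decisions from the design.

```python
import bisect
import hashlib
from typing import List, Tuple


class HashRing:
    """Consistent hash ring mapping API keys to shard names."""

    def __init__(self, shards: List[str], vnodes: int = 100) -> None:
        if not shards:
            raise ValueError("at least one shard is required")
        # Each shard is placed on the ring at `vnodes` pseudo-random points.
        ring: List[Tuple[int, str]] = []
        for shard in shards:
            for i in range(vnodes):
                ring.append((self._hash(f"{shard}#{i}"), shard))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    @staticmethod
    def _hash(value: str) -> int:
        # First 8 bytes of SHA-256 as an integer position on the ring.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def shard_for(self, api_key: str) -> str:
        # Walk clockwise to the first point after the key's position,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._points, self._hash(api_key))
        return self._ring[idx % len(self._ring)][1]
```

With this layout, adding a shard only moves the keys that now fall on the new shard's points; every other key keeps its previous shard.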

4. Redis (Rate Limit State Store)

  • Stores per-key counters / tokens
  • Acts as the real-time decision store

Typical data:

  • request counters
  • timestamps / tokens (depending on algorithm)

Characteristics:

  • in-memory → low latency
  • shared across limiter instances (per shard)
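
To make the per-key counter concrete, here is a sketch of a fixed-window check against a small in-memory stand-in for the store. In a real deployment the incr_with_ttl step would roughly correspond to Redis INCR followed by EXPIRE; the class, the function names, and the rl:{key}:{window} key layout are assumptions for illustration.

```python
from typing import Dict, Tuple


class InMemoryCounterStore:
    """Stand-in for the shared store: counters that reset after a TTL."""

    def __init__(self) -> None:
        # key -> (count, expires_at); expired entries are not evicted here.
        self._data: Dict[str, Tuple[int, float]] = {}

    def incr_with_ttl(self, key: str, ttl_seconds: float, now: float) -> int:
        count, expires_at = self._data.get(key, (0, 0.0))
        if now >= expires_at:
            # First hit, or the previous counter expired: start a new one.
            count, expires_at = 0, now + ttl_seconds
        count += 1
        self._data[key] = (count, expires_at)
        return count


def allow_fixed_window(
    store: InMemoryCounterStore,
    api_key: str,
    limit: int,
    window_seconds: int,
    now: float,
) -> bool:
    """Allow the request if the key is within its limit for the current window."""
    window_id = int(now // window_seconds)
    counter_key = f"rl:{api_key}:{window_id}"  # assumed key layout
    return store.incr_with_ttl(counter_key, window_seconds, now) <= limit
```

The current time is passed in explicitly so the behaviour is easy to reason about and test; a production limiter would read it from a clock.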

5. Persistence Layer (Database)

  • Stores durable state / backups of counters or configs
  • Used for:
    • recovery after Redis failure
    • long-term analytics (optional)

6. Policy Configuration API

  • Admin-facing endpoint (e.g. /admin/configure)
  • Defines:
    • rate limits (e.g. N requests/min)
    • burst capacity
    • algorithm type (fixed window, token bucket, etc.)
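
One possible shape for the policy payload, shown as a sketch. The JSON field names (algorithm, requests_per_minute, burst_capacity) and the set of accepted algorithm names are assumptions; the design only states which kinds of values a policy defines.

```python
import json
from dataclasses import dataclass

# Assumed identifiers for the algorithms mentioned in the design.
SUPPORTED_ALGORITHMS = {"fixed_window", "token_bucket"}
REQUIRED_FIELDS = ("algorithm", "requests_per_minute", "burst_capacity")


@dataclass(frozen=True)
class RateLimitPolicy:
    algorithm: str
    requests_per_minute: int
    burst_capacity: int


def parse_policy(payload: str) -> RateLimitPolicy:
    """Parse and validate a JSON policy document sent to the admin endpoint."""
    data = json.loads(payload)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    policy = RateLimitPolicy(
        algorithm=str(data["algorithm"]),
        requests_per_minute=int(data["requests_per_minute"]),
        burst_capacity=int(data["burst_capacity"]),
    )
    if policy.algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {policy.algorithm}")
    if policy.requests_per_minute <= 0 or policy.burst_capacity <= 0:
        raise ValueError("limits must be positive")
    return policy
```

Validating at the admin endpoint means the limiter instances only ever see well-formed policies.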

Key Design Decisions You Made

  • Distributed limiter (not single node) → scalability
  • Redis for fast state → low latency
  • Consistent hashing → avoids global coordination
  • Optional persistence → fault tolerance
  • Pluggable policy engine → flexibility




Detailed Component Design

1. API Rate Limiter

  • Core responsibility: apply rate limiting policy
  • Designed to support multiple policies (pluggable engine):
    • not fixed to one approach
    • can switch between strategies (e.g. fixed window, token bucket, etc.)
  • Policy is:
    • configurable
    • not hardcoded
    • applied based on configuration (from admin API)
  • You explicitly chose:
    • not to go deep into specific algorithms right now
    • keep it flexible and abstract
  • Key idea:
    • limiter acts as an execution engine
    • takes policy + request → produces allow/deny decision
  • IP-based enforcement for abusive requests
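
The "policy + request → allow/deny" idea can be sketched as an engine that delegates to whichever strategy is currently configured. The Policy protocol, the class names, and keeping state in process memory are assumptions made to keep the example self-contained; in the design the state would live in the shared store. The key can be an API key or an IP address.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol, Tuple


class Policy(Protocol):
    def allow(self, key: str, now: float) -> bool: ...


@dataclass
class FixedWindow:
    limit: int
    window_seconds: float
    # (key, window index) -> request count; old windows are never evicted here.
    _counts: Dict[Tuple[str, int], int] = field(default_factory=dict)

    def allow(self, key: str, now: float) -> bool:
        slot = (key, int(now // self.window_seconds))
        self._counts[slot] = self._counts.get(slot, 0) + 1
        return self._counts[slot] <= self.limit


@dataclass
class TokenBucket:
    capacity: float
    refill_per_second: float
    # key -> (tokens remaining, time of last update)
    _state: Dict[str, Tuple[float, float]] = field(default_factory=dict)

    def allow(self, key: str, now: float) -> bool:
        tokens, last = self._state.get(key, (self.capacity, now))
        # Refill for the elapsed time, capped at the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_second)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._state[key] = (tokens, now)
        return allowed


class RateLimiter:
    """Execution engine: applies the current policy to each request key."""

    def __init__(self, policy: Policy) -> None:
        self._policy = policy

    def set_policy(self, policy: Policy) -> None:
        # Swapping the policy discards the old policy's state in this sketch.
        self._policy = policy

    def decide(self, key: str, now: float) -> bool:
        return self._policy.allow(key, now)
```

Because the engine only depends on the allow method, a new algorithm can be added without touching the request path.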

2. Database / Storage

  • Purpose:
    • store rate limiting state (e.g. counters)
  • You identified it as:
    • key-value based
    • simple structure (no complex schema)
  • Workload characteristics:
    • write-heavy
    • frequent updates per request
  • Requirement:
    • should have good write performance
  • Your conclusion:
    • a simple, high-performance key-value store is sufficient