Architect a Low-Latency Load Balancing Engine

Last updated: December 7, 2025

Quick Overview

Design a low-latency load balancing system that handles millions of requests. Discuss trade-offs in consistency, availability, and performance.

Expedia

System Design

Software Engineer

System Design Round

Easy

Expedia asks this during the System Design Round to assess your understanding of the full ML lifecycle. They want to see how you translate a business problem into an ML objective, design the feature pipeline, and plan for model monitoring and retraining.

What the Interviewer Expects

Map the business problem to a concrete ML objective
Propose reasonable features and a baseline model
Discuss basic model evaluation metrics
Outline a simple serving architecture

Key Topics to Cover

A/B testing and experimentation

Training pipeline and infrastructure

Monitoring and model degradation detection

Online vs offline evaluation

How to Approach This

Start by clarifying functional and non-functional requirements with the interviewer.
Estimate the scale: QPS, storage, bandwidth. This drives your design decisions.
Draw a high-level architecture first, then deep dive into 1-2 critical components.
Discuss trade-offs explicitly (e.g., consistency vs availability, SQL vs NoSQL).
Address failure scenarios, monitoring, and how the system handles 10x traffic spikes.

Possible Follow-up Questions

What is your model retraining strategy?
How would you handle the cold start problem?
How would you handle a 10x increase in prediction requests?

Sample Answer

Requirements

Functional Requirements

Request Routing: The load balancer must route incoming traffic to the most appropriate backend server based on predefined rules.
Health Checks: The system mus...
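
The routing and health-check requirements above can be illustrated with a minimal sketch. The source only says traffic is routed by "predefined rules", so the least-connections policy used here is an assumption chosen for illustration, and the names `Backend`, `LeastConnectionsBalancer`, `mark_health`, `acquire`, and `release` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """A hypothetical backend server tracked by the balancer."""
    name: str
    healthy: bool = True
    active_connections: int = 0


class LeastConnectionsBalancer:
    """Sends each request to the healthy backend with the fewest open connections.

    Least-connections is an assumed policy; round robin or weighted
    routing would slot into acquire() in the same way.
    """

    def __init__(self, backends):
        self.backends = {b.name: b for b in backends}

    def mark_health(self, name, healthy):
        # A periodic health checker (not shown) would call this with its result.
        self.backends[name].healthy = healthy

    def acquire(self):
        # Only healthy backends are eligible to receive traffic.
        candidates = [b for b in self.backends.values() if b.healthy]
        if not candidates:
            raise RuntimeError("no healthy backends available")
        chosen = min(candidates, key=lambda b: b.active_connections)
        chosen.active_connections += 1
        return chosen.name

    def release(self, name):
        # Called when a request finishes so the connection count stays accurate.
        backend = self.backends[name]
        backend.active_connections = max(0, backend.active_connections - 1)
```

A caller would pair every `acquire()` with a `release()` once the request completes. Keeping the health state and connection counts in memory on each balancer node keeps the routing decision fast, at the cost of each node having only a local view of backend load.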

Capacity Estimation

To estimate capacity, consider the following:

Expected Peak Requests: Assume 10 million requests per second.
Average Request Size: Assume an average request size of 1 KB.
**Total Data Thr...
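
As a rough worked version of the estimate above, the sketch below multiplies the two stated assumptions (10 million requests per second, 1 KB per request) to get inbound throughput. It treats 1 KB as 1,000 bytes and counts request bytes only, ignoring responses and protocol overhead. The per-node capacity and target utilization are hypothetical values that do not come from the source, included only to show how a node count might be derived.

```python
import math

PEAK_RPS = 10_000_000        # stated assumption: 10 million requests per second
AVG_REQUEST_BYTES = 1_000    # stated assumption: 1 KB, taken here as 1,000 bytes

# Hypothetical values, NOT from the source; replace with measured numbers.
NODE_CAPACITY_RPS = 500_000  # requests per second one balancer node can handle
TARGET_UTILIZATION = 0.5     # run nodes at 50% to leave headroom for spikes


def inbound_bytes_per_sec(rps, request_bytes):
    """Inbound request throughput in bytes per second (responses excluded)."""
    return rps * request_bytes


def nodes_needed(rps, per_node_rps, target_utilization):
    """Number of balancer nodes required at the given target utilization."""
    return math.ceil(rps / (per_node_rps * target_utilization))


throughput = inbound_bytes_per_sec(PEAK_RPS, AVG_REQUEST_BYTES)
throughput_gigabytes = throughput / 1e9     # 10.0 GB/s
throughput_gigabits = throughput * 8 / 1e9  # 80.0 Gbit/s
node_count = nodes_needed(PEAK_RPS, NODE_CAPACITY_RPS, TARGET_UTILIZATION)  # 40
```

Under these assumptions the inbound request traffic alone is about 10 GB/s (80 Gbit/s), which is more than a single commodity network interface typically carries and is one reason the design has to spread load across many balancer nodes.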
