Codemia | Master System Design Interviews Through Active Practice

My Solution for Design a Video Conferencing system

by nectar4678

System requirements

Functional:

User Authentication and Authorization:

Users should be able to register, login, and manage their accounts.
Implement role-based access control for different user types (e.g., admin, host, participant).

Meeting Scheduling and Management:

Users should be able to schedule, edit, and cancel meetings.
Integration with calendar applications (e.g., Google Calendar, Outlook).

Joining Calls:

Users should be able to join meetings via a link or meeting ID.
Support for joining from different devices (desktop, mobile, web).

Video and Audio Communication:

High-quality video and audio transmission.
Support for muting/unmuting audio and turning on/off video.

Content Sharing:

Users should be able to share their screens or specific application windows.
Support for sharing files and media during the meeting.

Chat Functionality:

Real-time text chat within the meeting.
Private and group chat options.

Recording and Playback:

Ability to record meetings and save them for future playback.
Secure access to recorded meetings.

Participant Management:

Host controls for managing participants (e.g., muting, removing).
Participant list display with roles and statuses.

Security Features:

End-to-end encryption for all communications.
Secure meeting links and passwords for meeting access.

Notifications:

Email and push notifications for meeting reminders and updates.

Non-Functional:

Scalability:

The system should be able to handle thousands of concurrent users.
Support for horizontal scaling to manage load.

Reliability:

High availability with minimal downtime.
Failover mechanisms in case of server failures.

Performance:

Low latency for real-time communication.
Efficient resource utilization to maintain quality.

Usability:

Intuitive and user-friendly interfaces.
Consistent user experience across different devices.

Compatibility:

Cross-platform support (Windows, macOS, Linux, iOS, Android).
Browser compatibility (Chrome, Firefox, Safari, Edge).

Security:

Compliance with data protection regulations (e.g., GDPR).
Regular security audits and updates.

Maintainability:

Modular architecture for easy updates and maintenance.
Comprehensive logging and monitoring.

Extensibility:

Ability to add new features and integrate with other services.
API availability for third-party integrations.

Capacity estimation

User Traffic

Concurrent Users:
Small to medium scale: 1,000 - 5,000 concurrent users.
Large scale: 10,000 - 50,000 concurrent users.
Peak Usage:
Assume peak usage occurs during working hours, with a significant increase in the number of users during meetings.

Bandwidth Requirements

Video Streaming Bandwidth:
A standard video call requires approximately 1.5 Mbps per participant.
For 10,000 concurrent video streams:
$$10{,}000 \text{ users} \times 1.5 \text{ Mbps} = 15{,}000 \text{ Mbps} = 15 \text{ Gbps}$$
Audio-Only Calls:
Audio calls require about 0.1 Mbps per participant.
For 10,000 concurrent audio streams:
$$10{,}000 \text{ users} \times 0.1 \text{ Mbps} = 1{,}000 \text{ Mbps} = 1 \text{ Gbps}$$
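The two estimates above share the same arithmetic, sketched here as a small helper. The function name is hypothetical, and the sketch assumes one stream per participant; with an SFU, server egress also grows with room size, which this simple estimate does not model.

```python
def aggregate_bandwidth_gbps(concurrent_users: int, mbps_per_user: float) -> float:
    """Total bandwidth in Gbps, assuming one stream per user (1 Gbps = 1,000 Mbps)."""
    return concurrent_users * mbps_per_user / 1000


# Per-participant rates are the figures assumed in the estimate above.
video_gbps = aggregate_bandwidth_gbps(10_000, 1.5)
audio_gbps = aggregate_bandwidth_gbps(10_000, 0.1)
```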

Meeting Handling Capabilities

Meeting Rooms:
Assume each meeting room can handle up to 100 participants.
For 10,000 concurrent users, we need:
$$\frac{10{,}000 \text{ users}}{100 \text{ users/room}} = 100 \text{ rooms}$$
To handle spikes, provision for 150 rooms.
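Going from 100 to 150 rooms implies a 50% headroom factor. A minimal sketch of that calculation follows; the function name and the `headroom` parameter are illustrative, and the base count is a lower bound because it assumes every room is full.

```python
import math


def rooms_needed(concurrent_users: int, room_capacity: int, headroom: float = 0.5) -> tuple[int, int]:
    """Return (base rooms, provisioned rooms) for a given headroom fraction."""
    base = math.ceil(concurrent_users / room_capacity)
    provisioned = math.ceil(base * (1 + headroom))
    return base, provisioned


base_rooms, provisioned_rooms = rooms_needed(10_000, 100)
```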

Server Capacity

CPU and Memory:
Estimate based on the number of concurrent users and the complexity of tasks.
Each video stream may need around 0.5 CPU cores and 512 MB of RAM.
For 10,000 users:
$$10{,}000 \text{ users} \times 0.5 \text{ CPU cores} = 5{,}000 \text{ CPU cores}$$
$$10{,}000 \text{ users} \times 512 \text{ MB RAM} = 5{,}120 \text{ GB RAM}$$
Server Instances:
Use cloud infrastructure (e.g., AWS, Azure) for scalability.
For example, using AWS c5.large instances (2 vCPUs, 4 GB RAM):
$$\frac{5{,}000 \text{ CPU cores}}{2 \text{ cores/instance}} = 2{,}500 \text{ instances}$$
$$\frac{5{,}120 \text{ GB RAM}}{4 \text{ GB/instance}} = 1{,}280 \text{ instances}$$
CPU is the binding constraint (2,500 instances versus 1,280 for RAM), so provision around 2,500 instances.
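The sizing rule here is to compute the instance count for each resource and take the larger one. A sketch under the same per-stream assumptions (the function name and dictionary keys are hypothetical, and 1 GB is treated as 1,000 MB to match the figures above):

```python
import math


def instances_needed(users: int, cores_per_user: float, mb_ram_per_user: float,
                     vcpus_per_instance: int, gb_ram_per_instance: float) -> dict:
    """Size a fleet by whichever resource (CPU or RAM) demands more instances."""
    total_cores = users * cores_per_user
    total_ram_gb = users * mb_ram_per_user / 1000  # decimal GB, as in the estimate
    by_cpu = math.ceil(total_cores / vcpus_per_instance)
    by_ram = math.ceil(total_ram_gb / gb_ram_per_instance)
    return {"by_cpu": by_cpu, "by_ram": by_ram, "provision": max(by_cpu, by_ram)}


# 0.5 cores and 512 MB per stream on instances with 2 vCPUs and 4 GB RAM.
plan = instances_needed(10_000, 0.5, 512, 2, 4)
```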

Storage Requirements

Recording and Data Storage:
Assume each recorded meeting hour is approximately 1 GB.
For 10,000 hours of recordings per day:
$$10{,}000 \text{ hours} \times 1 \text{ GB/hour} = 10{,}000 \text{ GB/day} = 10 \text{ TB/day}$$
Provision storage for at least 30 days:
$$10 \text{ TB/day} \times 30 \text{ days} = 300 \text{ TB}$$
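The retention arithmetic can be parameterized so that other recording volumes or retention windows are easy to try. This is only a sketch; the function name is hypothetical, and it ignores replication and compression.

```python
def retained_storage_tb(hours_per_day: float, gb_per_hour: float, retention_days: int) -> tuple[float, float]:
    """Return (TB written per day, TB held at steady state), with 1 TB = 1,000 GB."""
    daily_tb = hours_per_day * gb_per_hour / 1000
    return daily_tb, daily_tb * retention_days


daily_tb, retained_tb = retained_storage_tb(10_000, 1, 30)
```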

API design

User Management APIs

Register User

Endpoint: POST /api/v1/users/register
Request:
{
    "username": "john_doe",
    "email": "[email protected]",
    "password": "securepassword123"
}
Response:
{
    "user_id": "12345",
    "username": "john_doe",
    "email": "[email protected]",
    "created_at": "2024-07-30T12:34:56Z"
}
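The register response does not echo the password, and the design does not say how it is stored. One common approach, assumed here rather than taken from the design, is to keep only a salted PBKDF2 hash. The function names, record layout, and iteration count below are illustrative.

```python
import hashlib
import hmac
import os
from typing import Optional


def hash_password(password: str, salt: Optional[bytes] = None, iterations: int = 600_000) -> dict:
    """Derive a salted PBKDF2-HMAC-SHA256 hash; only this record is stored, never the password."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return {"salt": salt.hex(), "iterations": iterations, "hash": digest.hex()}


def verify_password(password: str, record: dict) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(record["salt"]), record["iterations"]
    )
    return hmac.compare_digest(candidate, bytes.fromhex(record["hash"]))
```

The login endpoint would then call `verify_password` against the stored record before issuing a token.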

Login User

Endpoint: POST /api/v1/users/login
Request:
{
    "email": "[email protected]",
    "password": "securepassword123"
}
Response:
{
    "token": "jwt_token_here",
    "user_id": "12345",
    "expires_in": 3600
}
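To make the `token` and `expires_in` fields concrete, here is a minimal sketch of issuing and checking an HS256-signed JWT using only the standard library. The claim names (`sub`, `exp`), the function names, and the shared-secret setup are assumptions; a production service would more likely rely on a vetted JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(signing_input: str, secret: bytes) -> str:
    return _b64url(hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest())


def issue_token(user_id: str, secret: bytes, expires_in: int = 3600, now: Optional[int] = None) -> str:
    """Build header.payload.signature with an expiry `expires_in` seconds from now."""
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode())
    payload = _b64url(json.dumps({"sub": user_id, "exp": now + expires_in}, separators=(",", ":")).encode())
    return f"{header}.{payload}.{_sign(f'{header}.{payload}', secret)}"


def verify_token(token: str, secret: bytes, now: Optional[int] = None) -> Optional[str]:
    """Return the user id if the signature is valid and the token has not expired, else None."""
    now = int(time.time()) if now is None else now
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return None  # not three dot-separated parts
    expected = _sign(f"{header}.{payload}", secret)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    return claims["sub"] if claims["exp"] > now else None
```

Endpoints that take an `Authorization: Bearer` header would run the equivalent of `verify_token` before handling the request.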

Get User Profile

Endpoint: GET /api/v1/users/{user_id}
Request Header:
Authorization: Bearer jwt_token_here
Response:
{
    "user_id": "12345",
    "username": "john_doe",
    "email": "[email protected]",
    "created_at": "2024-07-30T12:34:56Z"
}

Meeting Management APIs

Schedule Meeting

Endpoint: POST /api/v1/meetings
Request:
{
    "title": "Team Sync",
    "start_time": "2024-07-31T10:00:00Z",
    "end_time": "2024-07-31T11:00:00Z",
    "participants": ["user_id_1", "user_id_2"],
    "host_id": "12345"
}
Response:
{
    "meeting_id": "67890",
    "title": "Team Sync",
    "start_time": "2024-07-31T10:00:00Z",
    "end_time": "2024-07-31T11:00:00Z",
    "host_id": "12345",
    "participants": ["user_id_1", "user_id_2"],
    "join_url": "https://video.example.com/meetings/67890"
}

Get Meeting Details

Endpoint: GET /api/v1/meetings/{meeting_id}
Request Header:
Authorization: Bearer jwt_token_here
Response:
{
    "meeting_id": "67890",
    "title": "Team Sync",
    "start_time": "2024-07-31T10:00:00Z",
    "end_time": "2024-07-31T11:00:00Z",
    "host_id": "12345",
    "participants": ["user_id_1", "user_id_2"],
    "join_url": "https://video.example.com/meetings/67890"
}

Cancel Meeting

Endpoint: DELETE /api/v1/meetings/{meeting_id}
Request Header:
Authorization: Bearer jwt_token_here
Response:
{
    "message": "Meeting cancelled successfully"
}

Video Call APIs

Join Meeting

Endpoint: POST /api/v1/meetings/{meeting_id}/join
Request:
{
    "user_id": "12345"
}
Response:
{
    "meeting_id": "67890",
    "user_id": "12345",
    "join_url": "https://video.example.com/meetings/67890/join"
}

Start Recording

Endpoint: POST /api/v1/meetings/{meeting_id}/recording/start
Request:
{
    "user_id": "12345"
}
Response:
{
    "message": "Recording started"
}

Stop Recording

Endpoint: POST /api/v1/meetings/{meeting_id}/recording/stop
Request:
{
    "user_id": "12345"
}
Response:
{
    "message": "Recording stopped"
}

Send Chat Message

Endpoint: POST /api/v1/meetings/{meeting_id}/chat
Request:
{
    "user_id": "12345",
    "message": "Hello, everyone!"
}
Response:
{
    "message_id": "98765",
    "user_id": "12345",
    "message": "Hello, everyone!",
    "timestamp": "2024-07-30T12:45:00Z"
}

Database design

High-level design

Key Components

Client Applications

Web Client
Mobile Client (iOS and Android)

API Gateway

Routes requests to the appropriate microservices.

Authentication Service

Handles user authentication and authorization.

User Service

Manages user profiles and accounts.

Meeting Service

Manages meeting scheduling, details, and participant information.

Video Streaming Service

Handles real-time video and audio communication.

Chat Service

Manages real-time chat within meetings.

Recording Service

Manages recording of meetings and storing playback data.

Notification Service

Sends email and push notifications for meeting reminders and updates.

Database

Stores all persistent data (users, meetings, chat messages, recordings).

Storage Service

Manages storage for recorded meetings and other large files.

Monitoring and Logging

Monitors system performance and logs activities for auditing and debugging.

Component Interactions

Client Applications: Users interact with the system via web and mobile clients to schedule meetings, join calls, chat, and more.
API Gateway: Central entry point for all client requests, routing them to appropriate microservices.
Authentication Service: Authenticates users and issues tokens for secure access.
User Service: Manages user data and profiles.
Meeting Service: Handles scheduling, managing, and retrieving meeting information.
Video Streaming Service: Ensures real-time video and audio transmission using protocols like WebRTC.
Chat Service: Manages real-time messaging during meetings.
Recording Service: Handles the start/stop of meeting recordings and stores the recordings.
Notification Service: Sends out notifications for meeting-related events.
Database: Central repository for all application data.
Storage Service: Stores large files such as recorded meetings.
Monitoring and Logging: Monitors and logs the system’s activities for performance tracking and debugging.

Request flows

User Registration Flow

Scheduling a Meeting Flow

Joining a Meeting Flow

Sending a Chat Message Flow

Detailed component design

Video Streaming Service

Functionality:

Real-time video and audio communication.
Ensures low-latency, high-quality streams using WebRTC.

Architecture:

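Media Servers: Handle video and audio streams. WebRTC media travels over SRTP (RTP secured with DTLS); protocols like RTMP/RTSP apply only where streams are also ingested or rebroadcast outside WebRTC.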
Signaling Server: Facilitates the setup of WebRTC connections between clients.
TURN/STUN Servers: Assist in NAT traversal for establishing peer-to-peer connections.

Scalability:

Horizontal Scaling: Media servers can be scaled horizontally to handle more concurrent streams.
Load Balancing: Distribute the load among multiple media servers to ensure even resource utilization.

Key Algorithms/Data Structures:

WebRTC: For peer-to-peer video and audio communication.
SFU (Selective Forwarding Unit): For routing media streams to multiple participants efficiently.
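The difference between a peer-to-peer mesh and an SFU shows up in how many uplinks each client needs. The toy model below is a sketch only (class and function names are hypothetical, and "packets" are plain bytes). It shows the core SFU behavior: each sender uploads once, and the server forwards the packet unchanged to every other participant.

```python
from dataclasses import dataclass, field


@dataclass
class SfuRoom:
    """Toy Selective Forwarding Unit: forwards packets as-is, never decodes or mixes them."""
    inboxes: dict[str, list[bytes]] = field(default_factory=dict)  # participant id -> received packets

    def join(self, participant_id: str) -> None:
        self.inboxes.setdefault(participant_id, [])

    def leave(self, participant_id: str) -> None:
        self.inboxes.pop(participant_id, None)

    def forward(self, sender_id: str, packet: bytes) -> int:
        """Deliver one packet to everyone except the sender; return the fan-out count."""
        if sender_id not in self.inboxes:
            raise KeyError(f"{sender_id} has not joined")
        receivers = [pid for pid in self.inboxes if pid != sender_id]
        for pid in receivers:
            self.inboxes[pid].append(packet)
        return len(receivers)


def uplinks_per_client(participants: int, topology: str) -> int:
    """Streams each client must upload: N-1 in a full mesh, 1 with an SFU."""
    if topology == "mesh":
        return max(participants - 1, 0)
    if topology == "sfu":
        return 1 if participants > 0 else 0
    raise ValueError(f"unknown topology: {topology}")
```

For a 100-participant room, a mesh would need 99 uplinks per client, while the SFU needs one.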

Meeting Service

Functionality:

Manages meeting scheduling, participant information, and meeting details.
Provides APIs for creating, updating, and deleting meetings.

Architecture:

Meeting Controller: Handles API requests for meeting operations.
Meeting Manager: Manages meeting state and interactions with the database.
Notification Manager: Sends notifications related to meetings.

Scalability:

Database Sharding: Partition the database to handle large volumes of meeting data.
Caching: Use caching mechanisms (e.g., Redis) to speed up retrieval of frequently accessed data.
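Both ideas can be sketched in a few lines. The shard router below uses a stable hash of the meeting ID, because Python's built-in `hash()` is randomized per process for strings. The cache-aside read uses a plain dict standing in for Redis. All names are hypothetical. Modulo sharding also reshuffles keys when the shard count changes, so consistent hashing is a common alternative.

```python
import hashlib
from typing import Callable


def shard_for(meeting_id: str, shard_count: int) -> int:
    """Map a meeting ID to a shard index using a hash that is stable across processes."""
    digest = hashlib.sha256(meeting_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def get_meeting(meeting_id: str, cache: dict, load_from_db: Callable[[str], dict]) -> dict:
    """Cache-aside read: serve from the cache, else load from the database and remember it."""
    if meeting_id in cache:
        return cache[meeting_id]
    meeting = load_from_db(meeting_id)
    cache[meeting_id] = meeting
    return meeting
```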

Key Algorithms/Data Structures:

Event Scheduling Algorithm: Efficiently schedule meetings, avoiding conflicts.
UUIDs: For unique identification of meetings and participants.
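The design does not specify the scheduling algorithm. A minimal sketch of the conflict check is a linear scan for overlapping intervals, shown below with the timestamp format from the API examples. The function names are hypothetical, and a larger deployment might index meetings by time or use an interval tree instead.

```python
import uuid
from datetime import datetime


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z', as used in the API examples."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def overlaps(start_a: datetime, end_a: datetime, start_b: datetime, end_b: datetime) -> bool:
    """Half-open intervals [start, end) overlap when each starts before the other ends."""
    return start_a < end_b and start_b < end_a


def find_conflicts(existing: list[dict], start_time: str, end_time: str) -> list[str]:
    """Return IDs of existing meetings that overlap the proposed slot."""
    start, end = parse_utc(start_time), parse_utc(end_time)
    return [
        m["meeting_id"]
        for m in existing
        if overlaps(start, end, parse_utc(m["start_time"]), parse_utc(m["end_time"]))
    ]


def new_meeting_id() -> str:
    """Random (version 4) UUID as a collision-resistant meeting identifier."""
    return str(uuid.uuid4())
```

Because the intervals are half-open, a meeting that starts exactly when another ends is not reported as a conflict.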

Chat Service

Functionality:

Manages real-time text chat within meetings.
Supports private and group chats.

Architecture:

Chat Controller: Handles API requests for sending and receiving messages.
Message Processor: Processes and stores chat messages.
Real-time Messaging: Uses WebSockets for real-time message delivery.

Scalability:

Message Queues: Use message queues (e.g., RabbitMQ) to handle high-throughput message processing.
Horizontal Scaling: Scale chat servers horizontally to manage increasing load.

Key Algorithms/Data Structures:

WebSocket Protocol: For real-time communication.
Pub/Sub Model: For broadcasting messages to multiple recipients.
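The pub/sub model can be illustrated with an in-memory broker keyed by meeting. This is a sketch only: the class and method names are hypothetical, and each `deliver` callback stands in for a WebSocket send. Group messages go to every subscriber in the meeting, while private messages go to one recipient.

```python
from collections import defaultdict
from typing import Callable, Optional


class ChatBroker:
    """In-memory pub/sub: one topic per meeting, one delivery callback per connected user."""

    def __init__(self) -> None:
        self._subscribers: dict[str, dict[str, Callable[[dict], None]]] = defaultdict(dict)

    def subscribe(self, meeting_id: str, user_id: str, deliver: Callable[[dict], None]) -> None:
        self._subscribers[meeting_id][user_id] = deliver

    def unsubscribe(self, meeting_id: str, user_id: str) -> None:
        self._subscribers.get(meeting_id, {}).pop(user_id, None)

    def publish(self, meeting_id: str, message: dict, to_user_id: Optional[str] = None) -> int:
        """Deliver to all subscribers (group chat) or to one (private chat); return deliveries made."""
        subscribers = self._subscribers.get(meeting_id, {})
        if to_user_id is None:
            targets = list(subscribers)
        else:
            targets = [to_user_id] if to_user_id in subscribers else []
        for user_id in targets:
            subscribers[user_id](message)
        return len(targets)
```

With several chat servers, the same interface would sit in front of a shared broker so that subscribers connected to different servers still receive each message.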

Trade offs/Tech choices

Real-time Communication vs. Resource Utilization

Trade-off: Real-time video and audio communication require significant bandwidth and processing power.
Choice: Use WebRTC, with direct peer-to-peer connections where feasible to reduce server load and latency and the SFU for larger rooms; provision TURN servers for scenarios where direct peer-to-peer communication is not possible.

Complexity vs. Maintainability

Trade-off: A more complex system can offer more features and finer control but may be harder to maintain and debug.
Choice: We chose a microservices architecture, which introduces complexity but significantly improves maintainability and scalability. Each service can be developed, deployed, and scaled independently.

Performance vs. Scalability

Trade-off: High performance often requires more specialized, high-performance hardware, while scalability favors distributed systems and cloud-based solutions that can grow horizontally.
Choice: The system is designed for horizontal scaling using cloud infrastructure (e.g., AWS, Azure). This allows us to handle a large number of concurrent users by adding more instances rather than relying on high-performance hardware.

Failure scenarios/bottlenecks

Security Breaches

Scenario: Unauthorized access or data breaches can compromise user data and system integrity.
Mitigation:
Implement end-to-end encryption for all communications.
Use secure authentication mechanisms (e.g., JWT, OAuth).
Conduct regular security audits and penetration testing.

Message Queue Overload

Scenario: High volume of messages can overwhelm the message queue, leading to delays or lost messages.
Mitigation:
Use robust message queue systems (e.g., RabbitMQ, Kafka) that support high throughput.
Implement back-pressure mechanisms to handle overload gracefully.
Scale message queues horizontally to manage increased load.
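Back-pressure here means refusing or delaying new work once a bounded buffer is full, rather than letting the backlog grow without limit. RabbitMQ and Kafka have their own mechanisms for this. The sketch below only illustrates the idea with the standard-library `queue` module, and the class name is hypothetical.

```python
import queue
from typing import Any, Optional


class BoundedInbox:
    """Fixed-capacity buffer that rejects new messages when full instead of growing."""

    def __init__(self, max_pending: int) -> None:
        self._queue: queue.Queue = queue.Queue(maxsize=max_pending)

    def offer(self, message: Any) -> bool:
        """Try to enqueue without blocking; False tells the producer to slow down or retry."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            return False

    def poll(self) -> Optional[Any]:
        """Take the oldest message, or None if the buffer is empty."""
        try:
            return self._queue.get_nowait()
        except queue.Empty:
            return None
```

A producer that receives `False` can retry with a delay, drop low-priority messages, or return an error to the client.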

Network Bandwidth Constraints

Scenario: Limited network bandwidth can affect video and audio quality, causing interruptions.
Mitigation:
Ensure sufficient bandwidth provisioning based on user capacity estimates.
Use CDN (Content Delivery Network) for distributing static content and offloading traffic.
Implement bandwidth throttling and QoS (Quality of Service) policies to prioritize video and audio traffic.

High Latency in Video/Audio Streams

Scenario: Increased latency can disrupt real-time communication, causing poor user experience.
Mitigation:
Use geographically distributed media servers to reduce latency.
Implement adaptive bitrate streaming to adjust video quality based on network conditions.
Optimize WebRTC configurations for low-latency performance.
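The selection step of adaptive bitrate streaming can be sketched as picking the highest rendition that fits within a fraction of the estimated bandwidth. The 1,500 kbps and 100 kbps entries reuse the per-participant figures from the capacity estimate. The intermediate rungs, the labels, and the safety factor are assumptions, and real WebRTC stacks adapt through their own congestion control.

```python
# (required kbps, label), ordered from highest to lowest quality.
BITRATE_LADDER_KBPS = [
    (1500, "720p"),
    (800, "480p"),        # assumed intermediate rung
    (400, "360p"),        # assumed intermediate rung
    (100, "audio-only"),
]


def select_rendition(estimated_kbps: float, safety_factor: float = 0.8) -> str:
    """Pick the best rendition whose bitrate fits in a safety-scaled bandwidth budget."""
    budget = estimated_kbps * safety_factor
    for required_kbps, label in BITRATE_LADDER_KBPS:
        if required_kbps <= budget:
            return label
    return BITRATE_LADDER_KBPS[-1][1]  # nothing fits: fall back to the lowest rung
```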

Future improvements

Personal Meeting Room

Feature: Allow users to have a permanent, personalized meeting room URL.

Implementation:

User Service: Extend user profiles to include a personal meeting room URL.
Meeting Service: Modify the meeting scheduling logic to recognize and handle personal meeting rooms.
Database Changes: Add a column to the Users table for storing personal meeting room URLs.

Benefit: Provides users with a consistent and easy-to-remember meeting space for recurring meetings.

Instant Meeting

Feature: Allow users to start a meeting immediately without scheduling.

Implementation:

Meeting Service: Add an endpoint for creating instant meetings.
Notification Service: Optionally notify participants immediately via email or push notifications.

Benefit: Enables users to quickly initiate ad-hoc meetings, improving flexibility and responsiveness.

Live Audio Transcription

Feature: Provide real-time transcription of audio during meetings.

Implementation:

Audio Processing Service: Integrate a third-party transcription service (e.g., Google Cloud Speech-to-Text).
Web Client: Display live transcriptions to users.

Benefit: Enhances accessibility and allows participants to follow along with spoken content in real time.

AI Call Summary

Feature: Automatically generate a summary of the meeting after it concludes using AI.

Implementation:

Recording Service: Record the meeting audio.
AI Analysis Service: Analyze the recording post-meeting to generate a summary.
Notification Service: Send the summary to participants.

Benefit: Saves time for participants by providing concise meeting summaries, highlighting key points and actions.

Huge Meetings with Thousands of Participants

Feature: Support for large-scale meetings with thousands of participants.

Implementation:

Scalable Media Servers: Deploy additional media servers to handle large numbers of video streams.
CDN Integration: Use a Content Delivery Network to distribute the load and ensure low-latency streaming.
Optimized Broadcasting: Implement a broadcasting mechanism where a single video stream from the presenter is sent to all participants, reducing the load on media servers.

Benefit: Enables the platform to host webinars, town halls, and other large events effectively.