My Solution for Design YouTube or Netflix with Score: 8/10
by iridescent_luminous693
System requirements
Functional:
User Management:
- Users should be able to create accounts, log in, and manage profiles.
- Support different user roles: viewers, content creators, and administrators.
- Allow users to follow other users for updates on new content.
Video Upload:
- Users can upload videos with metadata such as title, description, tags, and category.
- Support multiple video formats and convert them into a standard format for playback.
- Provide a progress bar during uploads and notify users upon successful upload.
Video Playback:
- Enable seamless streaming of videos across different devices and internet speeds.
- Support video resolutions like 480p, 720p, 1080p, and 4K based on user preferences and bandwidth.
- Allow pausing, rewinding, forwarding, and autoplay of videos.
Search and Discovery:
- Implement a search feature to find videos by title, description, tags, or category.
- Provide personalized recommendations based on user history and preferences.
- Support trending and category-based browsing.
Engagement Features:
- Users can like, dislike, and comment on videos.
- Enable sharing of videos via links or social media platforms.
- Allow users to create and manage playlists.
Subscription and Notifications:
- Users can subscribe to channels to get notifications about new uploads.
- Notify users about trending videos, subscriptions, and updates.
Content Moderation:
- Administrators should have tools to review, approve, or remove inappropriate content.
- Allow users to report videos or comments for review.
Analytics:
- Provide content creators with analytics on video views, likes, comments, and audience demographics.
- Show trends and growth metrics for channels.
Monetization:
- Allow creators to monetize videos through ads or subscriptions.
- Manage ad placement and revenue distribution.
Video Storage and Management:
- Organize videos into categories and playlists.
- Enable creators to edit video details or delete uploads.
Non-Functional:
Scalability:
- The system should handle millions of users and videos, with high concurrency for uploads, searches, and streaming.
Performance:
- Ensure low latency for video playback with adaptive bitrate streaming.
- Search results should be delivered in under 1 second.
Availability:
- Maintain a highly available service with minimal downtime (e.g., 99.99% uptime SLA).
Reliability:
- Ensure data integrity for uploaded videos, user profiles, and engagement data.
- Use backups and redundant systems to prevent data loss.
Security:
- Protect user data and videos with encryption (e.g., HTTPS, secure storage).
- Implement secure authentication mechanisms, including 2FA.
Maintainability:
- Use a modular architecture to enable easy updates and feature enhancements.
- Ensure clear logging and monitoring for troubleshooting.
Accessibility:
- Support video playback and features across various devices (desktop, mobile, tablet).
- Provide options for subtitles, captions, and assistive technologies.
Compliance:
- Ensure adherence to copyright laws and regional regulations.
- Implement mechanisms to detect and handle copyright infringements.
Cost Efficiency:
- Optimize storage and streaming costs by using cloud solutions and content delivery networks (CDNs).
Localization:
- Support multiple languages for UI, subtitles, and metadata.
- Tailor recommendations based on regional preferences.
Capacity estimation
Assumptions:
- Total registered users: 100 million
- Daily active users: 10% of registered users (10 million users/day)
- Peak concurrency: 5% of daily active users (500,000 users at a time)
- Daily uploads: 500,000 videos/day
- Average video size: 500 MB
- Video duration: 10 minutes average
- Growth rate: 10% per year
- Retention period: 10 years
- Videos are stored in 3 resolutions (2.5x size overhead)
Storage Requirements:
- Daily video uploads: 500,000 videos/day × 500 MB = 250 TB/day
- Annual storage: 250 TB/day × 365 days = 91.25 PB/year
- Total storage for 10 years (with resolution overhead): 91.25 PB × 10 years × 2.5 = 2,281.25 PB (~2.28 Exabytes)
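The storage arithmetic above can be reproduced with a short script. This is a minimal sketch: the constant names are illustrative, decimal units are assumed (1 PB = 1,000 TB), and the second total assumes the 10% growth rate applies to yearly upload volume, which the assumptions list does not spell out.

```python
# Back-of-the-envelope storage estimate from the listed assumptions.
# Decimal units: 1 TB = 1,000,000 MB and 1 PB = 1,000 TB.

UPLOADS_PER_DAY = 500_000
AVG_VIDEO_MB = 500
RESOLUTION_OVERHEAD = 2.5   # three stored resolutions
RETENTION_YEARS = 10
GROWTH_RATE = 0.10          # assumed to apply to yearly upload volume

daily_tb = UPLOADS_PER_DAY * AVG_VIDEO_MB / 1_000_000
annual_pb = daily_tb * 365 / 1_000

# Flat estimate, matching the bullet list (no growth applied).
flat_total_pb = annual_pb * RETENTION_YEARS * RESOLUTION_OVERHEAD

# Variant that compounds the yearly growth in upload volume.
growth_total_pb = RESOLUTION_OVERHEAD * sum(
    annual_pb * (1 + GROWTH_RATE) ** year for year in range(RETENTION_YEARS)
)

print(f"daily uploads: {daily_tb} TB, annual: {annual_pb} PB")
print(f"10-year total (flat): {flat_total_pb} PB")
print(f"10-year total (with growth): {growth_total_pb:.0f} PB")
```

With compounding growth the 10-year figure comes out at roughly 3.6 EB rather than 2.28 EB, so the flat estimate is best read as a lower bound.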
Bandwidth Requirements:
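- Peak streaming bandwidth: 100 million views in the peak hour × 50 MB streamed per view = 5 PB/hour (~11 Tbps)
- Daily bandwidth for views: 1 billion views/day × 50 MB streamed per view = 50 PB/day (the view volume is a separate assumption and is not derived from the 500,000 peak-concurrency figure)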
- Bandwidth for uploads: 500,000 uploads/day × 500 MB = 250 TB/day
Request Throughput:
- Video upload requests: 500,000 uploads/day ÷ 24 hours ÷ 3600 seconds = ~6 uploads/second
- Video view requests: 1 billion views/day ÷ 24 hours ÷ 3600 seconds = ~11,574 views/second
- Search requests: Assume 10% of registered users (10 million) search once per day, giving 10 million searches/day ÷ 24 hours ÷ 3600 seconds = ~116 searches/second
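The per-second rates above are daily averages. A small sketch of the same division follows, with a peak multiplier added for sizing; `PEAK_FACTOR` is a hypothetical value chosen for illustration, since the estimate itself does not state a peak-to-average ratio.

```python
SECONDS_PER_DAY = 24 * 3600

def per_second(events_per_day: int) -> float:
    """Average rate, assuming events are spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY

uploads_rps = per_second(500_000)
views_rps = per_second(1_000_000_000)
searches_rps = per_second(10_000_000)

# Hypothetical peak-to-average ratio; not given in the estimate above.
PEAK_FACTOR = 3

for name, rate in [("uploads", uploads_rps),
                   ("views", views_rps),
                   ("searches", searches_rps)]:
    print(f"{name}: avg {rate:,.1f}/s, sized for {rate * PEAK_FACTOR:,.0f}/s")
```

Real traffic is rarely flat over 24 hours, so capacity is usually planned against the peak figure rather than the average.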
Metadata Storage:
- Metadata per video: 1 KB average
- Daily metadata storage: 500,000 uploads/day × 1 KB = 500 MB/day
- Annual metadata storage: 500 MB/day × 365 days = 182.5 GB/year
- Metadata storage for 10 years: 182.5 GB/year × 10 = ~1.825 TB
API design
User Management APIs
- POST /api/users/register: Register a new user account.
- POST /api/users/login: Authenticate a user and generate an access token.
- PUT /api/users/{userId}: Update user details like username or profile picture.
Video Upload and Management APIs
- POST /api/videos/upload: Upload a new video with metadata (title, description, tags, etc.).
- PUT /api/videos/{videoId}: Update metadata of an existing video.
- DELETE /api/videos/{videoId}: Delete a video uploaded by the user.
Video Playback and Discovery APIs
- GET /api/videos/{videoId}: Fetch metadata and playback details for a video.
- GET /api/videos/search: Search for videos by title, tags, or category.
- GET /api/videos/{videoId}/stream: Stream video content for playback.
Engagement APIs
- POST /api/videos/{videoId}/like: Like a video.
- POST /api/videos/{videoId}/dislike: Dislike a video.
- POST /api/videos/{videoId}/comments: Add a comment to a video.
Analytics and Notifications APIs
- GET /api/analytics/videos/{videoId}: Fetch analytics data for a video (views, likes, comments, etc.).
- POST /api/channels/{channelId}/subscribe: Subscribe the user to a channel.
- GET /api/users/{userId}/notifications: Retrieve notifications for the user (e.g., new video uploads, comments).
Database design
1. Relational Database
Used for structured data like users, videos, and relationships (e.g., likes, subscriptions).
Key Tables
- Users Table
user_id: Primary Key
username
email
password_hash
profile_picture
created_at
- Videos Table
video_id: Primary Key
title
description
uploader_id: Foreign Key (users.user_id)
category
tags
uploaded_at
- Likes Table
like_id: Primary Key
user_id: Foreign Key (users.user_id)
video_id: Foreign Key (videos.video_id)
is_like: Boolean
- Comments Table
comment_id: Primary Key
video_id: Foreign Key (videos.video_id)
user_id: Foreign Key (users.user_id)
comment_text
created_at
- Subscriptions Table
subscription_id: Primary Key
subscriber_id: Foreign Key (users.user_id)
channel_id: Foreign Key (users.user_id)
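As a concrete illustration, the tables above could be expressed as the following schema. This is a sketch using SQLite through Python's standard library rather than MySQL/PostgreSQL; the column types, the NOT NULL and UNIQUE constraints, and the `set_reaction` helper are assumptions added for the example and are not part of the original design. The upsert syntax needs SQLite 3.24 or newer.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    email TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL,
    profile_picture TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE videos (
    video_id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    description TEXT,
    uploader_id INTEGER NOT NULL REFERENCES users(user_id),
    category TEXT,
    tags TEXT,
    uploaded_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE likes (
    like_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    video_id INTEGER NOT NULL REFERENCES videos(video_id),
    is_like INTEGER NOT NULL CHECK (is_like IN (0, 1)),
    UNIQUE (user_id, video_id)  -- assumption: one reaction per user per video
);
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY,
    video_id INTEGER NOT NULL REFERENCES videos(video_id),
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    comment_text TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE subscriptions (
    subscription_id INTEGER PRIMARY KEY,
    subscriber_id INTEGER NOT NULL REFERENCES users(user_id),
    channel_id INTEGER NOT NULL REFERENCES users(user_id),
    UNIQUE (subscriber_id, channel_id)  -- assumption: no duplicate subscriptions
);
"""

def set_reaction(conn: sqlite3.Connection, user_id: int, video_id: int,
                 is_like: bool) -> None:
    """Insert a like/dislike, or flip the existing one (hypothetical helper)."""
    conn.execute(
        "INSERT INTO likes (user_id, video_id, is_like) VALUES (?, ?, ?) "
        "ON CONFLICT (user_id, video_id) DO UPDATE SET is_like = excluded.is_like",
        (user_id, video_id, int(is_like)),
    )

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

conn.execute("INSERT INTO users (username, email, password_hash) "
             "VALUES ('alice', 'alice@example.com', 'not-a-real-hash')")
conn.execute("INSERT INTO videos (title, uploader_id) VALUES ('Intro', 1)")
set_reaction(conn, 1, 1, True)    # like
set_reaction(conn, 1, 1, False)   # same user switches to dislike
rows = conn.execute("SELECT user_id, video_id, is_like FROM likes").fetchall()
print(rows)
```

The uniqueness constraint on (user_id, video_id) keeps a repeated like request from creating duplicate rows, which makes the like endpoint safe to retry.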
2. NoSQL Database
Used for unstructured data like metadata, analytics, and search indices.
Key Collections
- Video Metadata
video_id
views
likes_count
dislikes_count
comments_count
average_watch_time
- Search Index
video_id
title
tags
category
3. Blob Storage
Used for storing large binary objects such as video files and thumbnails.
Key Buckets
- Videos
- Directory structure: /videos/{video_id}/{resolution}.mp4
- Files for each resolution: 480p, 720p, 1080p, etc.
- Thumbnails
- Directory structure: /thumbnails/{video_id}.jpg
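The two path conventions above are simple enough to capture in helper functions. The sketch below is illustrative only: the function names and the `ALLOWED_RESOLUTIONS` tuple are hypothetical, and the tuple lists only the three resolutions named explicitly in the layout.

```python
# Resolutions named in the bucket layout; extend as more renditions are added.
ALLOWED_RESOLUTIONS = ("480p", "720p", "1080p")

def video_key(video_id: str, resolution: str) -> str:
    """Build the object path for one rendition of a video."""
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    return f"/videos/{video_id}/{resolution}.mp4"

def thumbnail_key(video_id: str) -> str:
    """Build the object path for a video's thumbnail."""
    return f"/thumbnails/{video_id}.jpg"

print(video_key("abc123", "720p"))
print(thumbnail_key("abc123"))
```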
4. Cache
Used for frequently accessed data to reduce database load and improve response times.
Examples
- Cached Video Metadata
video_id
views
likes
uploader_username
- Cached Search Results
- Query results for popular searches.
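One common way to use such a cache is the cache-aside pattern: read from the cache first and fall back to the database on a miss. The sketch below uses an in-process dictionary with a time-to-live as a stand-in for Redis/Memcached; the `TTLCache` class, the `load_video_metadata` loader, and the 60-second TTL are all assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple

class TTLCache:
    """Tiny cache-aside helper with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # cache hit, still fresh
        value = loader(key)                      # miss or expired: hit the DB
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry, e.g. after the underlying row is updated."""
        self._entries.pop(key, None)

db_reads = []

def load_video_metadata(video_id: str) -> dict:
    """Hypothetical database read; records each call so hits are visible."""
    db_reads.append(video_id)
    return {"video_id": video_id, "views": 0, "likes": 0,
            "uploader_username": "example"}

cache = TTLCache(ttl_seconds=60)
cache.get_or_load("v1", load_video_metadata)
cache.get_or_load("v1", load_video_metadata)   # served from cache
print(f"database reads: {len(db_reads)}")
```

A short TTL bounds how stale a view or like count can get, while `invalidate` covers writes that must be visible immediately.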
High-level design
User and Client Application
- User:
- Represents the end-user who interacts with the system.
- Users perform actions like uploading videos, watching content, searching for videos, and engaging (liking, commenting, etc.).
- Client Application:
- The interface through which users interact with the system (e.g., web application, mobile app, or desktop application).
- Sends user requests to the backend for processing.
Load Balancer
- Load Balancer:
- Distributes incoming requests evenly across the Backend Gateway instances to ensure high availability and fault tolerance.
- Absorbs traffic spikes by spreading load across horizontally scaled backend instances.
Backend Gateway
- Backend Gateway:
- Serves as the entry point for all user requests.
- Routes requests to the appropriate backend services based on their type (e.g., user login, video upload, video streaming).
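The routing step can be pictured as a table that maps a method and path to a backend service. The sketch below uses the endpoints from the API design section; the service labels and the `route` function are hypothetical names, and a real gateway would normally be configured rather than hand-written like this.

```python
import re
from typing import Optional

# (HTTP method, path pattern, target service). Order matters: the literal
# /api/videos/search route must come before the /api/videos/{videoId} route.
ROUTES = [
    ("POST", r"^/api/users/(register|login)$", "user-service"),
    ("PUT", r"^/api/users/[^/]+$", "user-service"),
    ("POST", r"^/api/videos/upload$", "video-upload-service"),
    ("PUT", r"^/api/videos/[^/]+$", "video-upload-service"),
    ("DELETE", r"^/api/videos/[^/]+$", "video-upload-service"),
    ("GET", r"^/api/videos/search$", "search-service"),
    ("GET", r"^/api/videos/[^/]+/stream$", "video-playback-service"),
    ("GET", r"^/api/videos/[^/]+$", "video-playback-service"),
    ("POST", r"^/api/videos/[^/]+/(like|dislike|comments)$", "engagement-service"),
    ("POST", r"^/api/channels/[^/]+/subscribe$", "engagement-service"),
    ("GET", r"^/api/analytics/videos/[^/]+$", "analytics-service"),
]

def route(method: str, path: str) -> Optional[str]:
    """Return the first service whose method and pattern match, else None."""
    for route_method, pattern, service in ROUTES:
        if route_method == method and re.match(pattern, path):
            return service
    return None

print(route("GET", "/api/videos/search"))
print(route("POST", "/api/videos/42/like"))
```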
Backend Services
- User Service:
- Manages user-related operations such as registration, authentication, and profile updates.
- Interacts with the Relational Database containing the Users Table.
- Video Upload Service:
- Handles the uploading of videos and metadata (e.g., title, description, tags, category).
- Stores video files in Blob Storage and metadata in the Video Metadata collection.
- Video Playback Service:
- Retrieves video files from Blob Storage and streams them to users.
- Accesses thumbnails for video previews.
- Search Service:
- Processes user search queries to find videos by title, tags, or category.
- Retrieves results from the NoSQL Database, which contains the Search Index Collection.
- Engagement Service:
- Handles user interactions such as liking/disliking videos, adding comments, and managing subscriptions.
- Updates engagement-related tables like the Likes Table, Comments Table, and Subscriptions Table in the Relational Database.
- Analytics Service:
- Collects and processes video analytics data such as views, average watch time, and engagement metrics.
- Stores processed data in the Video Metadata Collection and Analytics Data Collection in the NoSQL Database.
Databases and Storage
- Relational Database:
- Supports structured data for services like user management and engagement.
- Tables:
Users Table: Stores user profile data.
Likes Table: Tracks likes and dislikes for videos.
Comments Table: Stores user comments on videos.
Subscriptions Table: Tracks which users have subscribed to which channels.
- NoSQL Database:
- Stores unstructured data, optimized for high read and write throughput.
- Collections:
Search Index Collection: Indexes video data for fast search operations.
Video Metadata Collection: Tracks video performance metrics.
Analytics Data Collection: Stores aggregated engagement and usage data.
- Blob Storage:
- Handles large binary files such as video content and thumbnails.
- Directories:
Video Files: Stores videos in multiple resolutions (e.g., 480p, 720p, 1080p).
Thumbnails: Stores preview images for videos.
Data Flow
- User Request:
- User sends a request (e.g., upload a video, watch a video, perform a search) through the client application.
- Load Balancer:
- Routes the request to an available instance of the Backend Gateway.
- Backend Gateway:
- Forwards the request to the appropriate backend service:
- User requests (e.g., login) go to the User Service.
- Video uploads are sent to the Video Upload Service.
- Playback requests are handled by the Video Playback Service.
- Search queries are processed by the Search Service.
- Engagement actions (e.g., likes, comments) are managed by the Engagement Service.
- Backend Services:
- Process the request and interact with the necessary storage components (Relational Database, NoSQL Database, or Blob Storage).
- Storage Access:
- The services read/write data to their respective tables or collections to fulfill the request.
- Response:
- The processed result (e.g., search results, video stream) is sent back to the user via the client application.
Request flows
1. User Registration:
- User initiates the register process via the Client Application.
- The Client Application sends the registration details (e.g., username, password, email) to the Backend Gateway.
- The Backend Gateway forwards the request to the User Service.
- The User Service validates the data and stores the new user in the Relational Database (e.g., Users Table).
- After the user data is successfully stored, the User Service responds to the Backend Gateway.
- The Backend Gateway sends a success response back to the Client Application, which shows a confirmation message to the User.
2. Video Upload:
- User uploads a video through the Client Application.
- The Client Application sends the video file and metadata (title, description, tags) to the Backend Gateway.
- The Backend Gateway routes the request to the Video Upload Service.
- The Video Upload Service saves the video file in Blob Storage and the metadata in the NoSQL Database.
- Once both the video file and metadata are successfully stored, the Video Upload Service informs the Backend Gateway of the successful upload.
- The Backend Gateway sends the confirmation of the upload to the Client Application, which displays a success message to the User.
3. Video Playback:
- User requests to play a video via the Client Application.
- The Client Application sends the video playback request to the Backend Gateway.
- The Backend Gateway forwards the request to the Video Playback Service.
- The Video Playback Service retrieves the video file from Blob Storage.
- The video file is streamed back to the Client Application in chunks, providing continuous playback to the User.
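Chunked delivery is often built on HTTP byte-range requests, where the client asks for a slice of the file and the server returns it piece by piece. The flow above does not name a protocol, so the sketch below is an assumption: `parse_range` and `iter_chunks` are hypothetical helpers, and an in-memory byte string stands in for an object read from Blob Storage.

```python
from typing import Iterator, Tuple

def parse_range(header: str, size: int) -> Tuple[int, int]:
    """Parse a single 'bytes=start-end' range into inclusive offsets."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is supported")
    start_text, _, end_text = spec.strip().partition("-")
    if start_text == "":
        # Suffix form 'bytes=-N': the last N bytes of the file.
        start, end = max(size - int(end_text), 0), size - 1
    else:
        start = int(start_text)
        end = int(end_text) if end_text else size - 1
    end = min(end, size - 1)
    if start < 0 or start > end:
        raise ValueError("unsatisfiable range")
    return start, end

def iter_chunks(data: bytes, start: int, end: int,
                chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the inclusive byte range [start, end] in fixed-size chunks."""
    position = start
    while position <= end:
        stop = min(position + chunk_size, end + 1)
        yield data[position:stop]
        position = stop

video = bytes(range(10))                       # stand-in for a video file
first, last = parse_range("bytes=2-7", len(video))
chunks = list(iter_chunks(video, first, last, chunk_size=4))
print([len(chunk) for chunk in chunks])
```

Serving ranges lets the player seek without downloading the whole file and lets a CDN cache individual segments.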
4. Video Search:
- User searches for videos via the Client Application.
- The Client Application sends the search query to the Backend Gateway.
- The Backend Gateway forwards the query to the Search Service.
- The Search Service queries the NoSQL Database's search index to find relevant videos based on the query (title, tags, etc.).
- The Search Service retrieves the search results from the NoSQL Database and sends them back to the Backend Gateway.
- The Backend Gateway returns the results to the Client Application, which displays the list of videos to the User.
5. Liking a Video:
- User likes a video via the Client Application.
- The Client Application sends the like request to the Backend Gateway.
- The Backend Gateway forwards the like request to the Engagement Service.
- The Engagement Service updates the Likes Table in the Relational Database to track the like.
- Once the like is recorded, the Engagement Service informs the Backend Gateway, which confirms the like action to the Client Application.
- The Client Application shows the like success message to the User.
6. Fetching Analytics:
- User requests to view analytics for a video via the Client Application.
- The Client Application sends the request for video analytics to the Backend Gateway.
- The Backend Gateway forwards the request to the Analytics Service.
- The Analytics Service retrieves video-related data (e.g., views, engagement) from the NoSQL Database.
- The Analytics Service processes the data and returns the analytics to the Backend Gateway.
- The Backend Gateway sends the analytics data to the Client Application, which then displays it to the User.
Detailed component design
1. User Service
Role:
Handles all user-related operations, including authentication, registration, and profile management.
Responsibilities:
- User Registration:
- Validates input data (username, email, password) during registration.
- Stores user details in the Relational Database (e.g., Users Table).
- User Authentication:
- Verifies user credentials during login.
- Generates and validates JWT tokens for secure communication.
- Profile Management:
- Allows users to update profile details, such as username, profile picture, or email.
Interactions:
- Relational Database:
- Stores user data in the Users Table.
- Backend Gateway:
- Communicates with the gateway to process user requests.
- Client Application:
- Sends responses to the client for user-facing operations.
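To make the token step concrete, here is a minimal sketch of issuing and verifying an HS256-signed JWT using only the standard library. The function names, the claim set (`sub`, `iat`, `exp`), and the one-hour lifetime are assumptions for illustration; a production service would more likely rely on a maintained JWT library and managed signing keys.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()

def issue_token(user_id: str, secret: bytes, ttl_seconds: int = 3600,
                now: Optional[int] = None) -> str:
    """Create a signed token carrying the user id and an expiry time."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "iat": now, "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"

def verify_token(token: str, secret: bytes, now: Optional[int] = None) -> dict:
    """Return the payload if the signature and expiry check out, else raise."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    now = int(time.time()) if now is None else now
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload

token = issue_token("user-42", secret=b"demo-secret")
print(verify_token(token, secret=b"demo-secret")["sub"])
```

Because the token is self-contained, the gateway can verify it without a database lookup, at the cost of not being able to revoke it before it expires.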
2. Video Upload Service
Role:
Handles the uploading of video files and their associated metadata.
Responsibilities:
- Video File Handling:
- Processes video uploads and stores them in Blob Storage.
- Ensures videos are converted to multiple resolutions (e.g., 480p, 720p, 1080p).
- Metadata Management:
- Saves video metadata (title, description, tags, category) in the NoSQL Database.
Interactions:
- Blob Storage:
- Stores the uploaded video files and thumbnails.
- NoSQL Database:
- Stores metadata such as video title, description, and tags.
- Backend Gateway:
- Receives upload requests and sends responses to the client.
3. Video Playback Service
Role:
Streams video content to users and manages playback functionality.
Responsibilities:
- Video Streaming:
- Retrieves video files from Blob Storage.
- Streams video content in chunks to ensure smooth playback.
- Resolution Handling:
- Dynamically adjusts the resolution based on user preferences and network conditions.
Interactions:
- Blob Storage:
- Fetches video files for playback.
- Backend Gateway:
- Receives playback requests and streams data to the client application.
- Client Application:
- Displays video playback to the user.
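The resolution decision can be sketched as picking the highest rendition whose bitrate fits within the measured bandwidth. The bitrates, the 0.8 safety margin, and the `pick_rendition` function below are hypothetical values chosen for illustration; in HLS or DASH deployments this logic typically runs in the player rather than on the server.

```python
from typing import Optional

# Illustrative target bitrates in kbit/s; real encoding ladders vary.
RENDITIONS_KBPS = {"480p": 1_500, "720p": 3_000, "1080p": 6_000, "4K": 16_000}

def pick_rendition(measured_kbps: float, preferred_max: Optional[str] = None,
                   safety: float = 0.8) -> str:
    """Pick the best rendition that fits the bandwidth budget.

    `safety` leaves headroom so small throughput dips do not stall playback.
    `preferred_max` caps the choice at the user's preferred resolution.
    """
    budget = measured_kbps * safety
    ladder = sorted(RENDITIONS_KBPS.items(), key=lambda item: item[1])
    if preferred_max is not None:
        cap = RENDITIONS_KBPS[preferred_max]
        ladder = [item for item in ladder if item[1] <= cap]
    choice = ladder[0][0]            # always fall back to the lowest rung
    for name, kbps in ladder:
        if kbps <= budget:
            choice = name
    return choice

print(pick_rendition(10_000))
print(pick_rendition(50_000, preferred_max="720p"))
```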
4. Search Service
Role:
Processes search queries and retrieves relevant video results.
Responsibilities:
- Search Query Processing:
- Accepts user search terms (e.g., keywords, tags, categories).
- Matches queries against indexed video metadata.
- Search Index Management:
- Maintains an up-to-date index of videos for quick lookup.
Interactions:
- NoSQL Database:
- Queries the Search Index Collection for video search results.
- Backend Gateway:
- Receives search requests and returns results to the client application.
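Matching queries against indexed metadata usually comes down to an inverted index that maps each term to the videos containing it. The following is a toy in-memory sketch, not how Elasticsearch or another engine is implemented; the `SearchIndex` class and its tokenization rule are assumptions for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Set

def tokenize(text: str) -> List[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())

class SearchIndex:
    """Toy inverted index over title, tags, and category."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)
        self._terms_by_video: Dict[str, Set[str]] = {}

    def add(self, video_id: str, title: str, tags: Iterable[str],
            category: str) -> None:
        self.remove(video_id)        # re-indexing replaces the old entry
        terms = set(tokenize(title)) | set(tokenize(category))
        for tag in tags:
            terms.update(tokenize(tag))
        self._terms_by_video[video_id] = terms
        for term in terms:
            self._postings[term].add(video_id)

    def remove(self, video_id: str) -> None:
        for term in self._terms_by_video.pop(video_id, set()):
            self._postings[term].discard(video_id)

    def search(self, query: str) -> List[str]:
        """Return ids of videos containing every query term, sorted by id."""
        terms = tokenize(query)
        if not terms:
            return []
        matches = set.intersection(
            *(self._postings.get(term, set()) for term in terms)
        )
        return sorted(matches)

index = SearchIndex()
index.add("v1", "Learn Python Fast", ["tutorial"], "Education")
index.add("v2", "Python Cooking", ["food"], "Lifestyle")
print(index.search("python"))
print(index.search("python tutorial"))
```

Calling `add` again for an existing video replaces its old terms, which is the in-memory analogue of keeping the index up to date after a metadata edit.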
5. Engagement Service
Role:
Manages user interactions with videos, such as likes, dislikes, comments, and subscriptions.
Responsibilities:
- Likes and Dislikes:
- Tracks user likes/dislikes for videos in the Relational Database (Likes Table).
- Comments:
- Allows users to add, update, or delete comments on videos, stored in the Comments Table.
- Subscriptions:
- Tracks user subscriptions to channels in the Subscriptions Table.
Interactions:
- Relational Database:
- Updates and retrieves data from the Likes Table, Comments Table, and Subscriptions Table.
- Backend Gateway:
- Receives interaction requests and responds to the client application.
6. Analytics Service
Role:
Tracks and analyzes video performance metrics, such as views, engagement, and average watch time.
Responsibilities:
- View Count Management:
- Tracks video views and stores aggregated data in the NoSQL Database (Video Metadata Collection).
- Engagement Analytics:
- Aggregates data such as likes, comments, and watch duration for content creators.
- Trend Analysis:
- Provides insights into trending videos based on engagement and view counts.
Interactions:
- NoSQL Database:
- Stores analytics data in the Video Metadata Collection and Analytics Data Collection.
- Backend Gateway:
- Receives analytics requests and sends results to the client application.
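The aggregation and trend steps can be illustrated with a small batch job that counts events per video and ranks videos by a weighted score. The event format, the `WEIGHTS` values, and the function names are hypothetical; the design above does not specify how trending is scored.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical weights: a comment counts for more than a like or a view.
WEIGHTS = {"view": 1, "like": 5, "comment": 10}

def aggregate(events: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count events per video; each event is (video_id, event_type)."""
    totals: Dict[str, Counter] = {}
    for video_id, event_type in events:
        totals.setdefault(video_id, Counter())[event_type] += 1
    return totals

def trending(totals: Dict[str, Counter], top_n: int = 10) -> List[Tuple[str, int]]:
    """Rank videos by weighted engagement, highest first, ties by id."""
    scores = {
        video_id: sum(WEIGHTS.get(kind, 0) * count for kind, count in counts.items())
        for video_id, counts in totals.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_n]

events = ([("a", "view")] * 10) + ([("b", "view")] * 3) + ([("b", "like")] * 2)
totals = aggregate(events)
ranking = trending(totals)
print(ranking)
```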
Data Storage Components
- Relational Database:
- Stores structured data for services like user management and engagement.
- Tables:
Users Table
Likes Table
Comments Table
Subscriptions Table
- NoSQL Database:
- Optimized for high-performance operations like search and analytics.
- Collections:
Search Index Collection
Video Metadata Collection
Analytics Data Collection
- Blob Storage:
- Stores large video files and thumbnails, ensuring scalability and cost-efficiency.
Trade offs/Tech choices
1. Relational Database for User & Engagement Data
- Choice: MySQL/PostgreSQL for structured data like users, likes, comments.
- Why: ACID compliance ensures consistency for critical interactions.
- Trade-off: Requires sharding/replication for scalability, adding complexity.
2. NoSQL for Metadata, Search & Analytics
- Choice: MongoDB/Elasticsearch for high throughput and flexible schema.
- Why: Scalable for large unstructured data like video metadata and search.
- Trade-off: Eventual consistency may delay updated results.
3. Blob Storage for Videos
- Choice: AWS S3 for storing video files and thumbnails.
- Why: Scalable, cost-efficient, and reliable for large static files.
- Trade-off: Higher retrieval latency than local disk, which is why a CDN is needed in front of it for playback.
4. Microservices Architecture
- Choice: Independent services for modularity and scalability.
- Why: Easier to scale and maintain individual services.
- Trade-off: Increased complexity in debugging and service communication.
5. Load Balancer and Gateway
- Choice: Load Balancer distributes traffic; Gateway routes requests.
- Why: Ensures fault tolerance, centralizes authentication.
- Trade-off: Gateway can be a single point of failure if not replicated.
6. Event-Driven for Asynchronous Tasks
- Choice: Kafka/SQS for video transcoding and analytics.
- Why: Decouples heavy tasks for better responsiveness.
- Trade-off: Slight delays in completing tasks like analytics updates.
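The decoupling idea can be shown with an in-process queue standing in for Kafka/SQS: the upload path only enqueues jobs, and workers transcode in the background. Everything here is a sketch; the `transcode` function is a placeholder that returns an output path instead of doing any real encoding, and the worker count is arbitrary.

```python
import queue
import threading
from typing import List

def transcode(video_id: str, resolution: str) -> str:
    """Placeholder for real transcoding; returns the output object path."""
    return f"/videos/{video_id}/{resolution}.mp4"

def worker(jobs: queue.Queue, results: List[str]) -> None:
    while True:
        job = jobs.get()
        try:
            if job is None:          # sentinel: shut this worker down
                return
            video_id, resolution = job
            results.append(transcode(video_id, resolution))
        finally:
            jobs.task_done()

jobs: queue.Queue = queue.Queue()
results: List[str] = []
threads = [threading.Thread(target=worker, args=(jobs, results)) for _ in range(2)]
for thread in threads:
    thread.start()

# The upload handler would enqueue and return to the client immediately.
for resolution in ("480p", "720p", "1080p"):
    jobs.put(("video-123", resolution))

jobs.join()                          # wait until every job is processed
for _ in threads:
    jobs.put(None)
for thread in threads:
    thread.join()
print(sorted(results))
```

With a durable broker in place of the in-memory queue, jobs also survive a worker crash and can be retried, which is the main reason to prefer Kafka/SQS over threads inside the upload service.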
7. Adaptive Bitrate Streaming
- Choice: Serve videos in multiple resolutions based on bandwidth.
- Why: Ensures smooth playback across varying network conditions.
- Trade-off: Increased storage for multiple video resolutions.
8. Caching for Popular Data
- Choice: Redis/Memcached for trending videos, search results.
- Why: Reduces load on databases, speeds up responses.
- Trade-off: Cache invalidation complexity to ensure freshness.
Failure scenarios/bottlenecks
User Service Failures:
- Registration/Login Fails: Database downtime or connection issues.
- Authentication Issues: JWT errors (expired, malformed, or wrongly signed tokens) lock out legitimate users, and lax validation can allow unauthorized access.
- Mitigation: Use replicas, cache sessions, fallback authentication.
Video Upload Bottlenecks:
- High Upload Traffic: Causes slow uploads or timeouts.
- Transcoding Overload: Delays video availability.
- Mitigation: Scale storage, use message queues, rate-limit uploads.
Playback Issues:
- High Concurrent Viewers: Viral videos increase latency.
- Storage Latency: Slow access to video files affects streaming.
- Mitigation: Use CDNs, preload popular videos, adaptive streaming.
Search Failures:
- Slow Queries: High traffic or large datasets.
- Indexing Delays: New videos missing in search.
- Mitigation: Cache results, scale NoSQL, prioritize indexing.
Engagement Service Bottlenecks:
- Write Heavy Load: Viral videos cause contention.
- Data Integrity Issues: Concurrent updates cause inconsistencies.
- Mitigation: Partition data, use locks, cache metrics.
Analytics Delays:
- Aggregation Slowness: Delayed trending metrics.
- Data Loss in Queues: Overflow or unprocessed events.
- Mitigation: Scale pipelines, distributed processing, dead-letter queues.
Load Balancer Issues:
- Traffic Overload: Causes high latency or dropped requests.
- Single Point of Failure: Misconfigured or down balancer.
- Mitigation: Use multi-region load balancers, auto-scaling.
General Failures:
- Service Outages: One service affects dependent services.
- Mitigation: Use circuit breakers, retries, graceful degradation.
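A circuit breaker stops calling a failing dependency for a while so that one outage does not tie up every caller. The sketch below is a minimal single-threaded version; the class name, the threshold of 3 failures, and the 30-second reset timeout are assumptions for illustration, and resilience libraries offer more complete implementations.

```python
import time
from typing import Any, Callable, Optional

class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""

class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open on reaching the threshold, or re-open after a failed trial.
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0           # success closes the circuit
        self._opened_at = None
        return result

breaker = CircuitBreaker()
print(breaker.call(lambda: "analytics ok"))
```

Callers can catch `CircuitOpenError` and degrade gracefully, for example by showing a video page without its analytics panel.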
Future improvements
Scalability:
- Auto-scale services and storage (e.g., Kubernetes, cloud scaling).
- Shard relational databases by video/user ID for reduced contention.
- Scale NoSQL clusters to handle spikes.
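Sharding by ID needs a stable mapping from a key to a shard. A minimal sketch using hash-modulo placement follows; the `shard_for` function and the shard count of 8 are hypothetical, and this is only one of several possible schemes.

```python
import hashlib
from collections import Counter

def shard_for(key: str, num_shards: int) -> int:
    """Map a user or video id to a shard index in [0, num_shards)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # A cryptographic hash gives a stable, evenly spread value across runs,
    # unlike Python's built-in hash(), which is salted per process for str.
    return int.from_bytes(digest[:8], "big") % num_shards

NUM_SHARDS = 8
distribution = Counter(shard_for(f"video-{i}", NUM_SHARDS) for i in range(10_000))
print(sorted(distribution.items()))
```

Plain modulo placement remaps most keys whenever the shard count changes, so a design that expects frequent resharding may prefer consistent hashing or a lookup table of key ranges.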
Video Delivery:
- Use CDNs to cache popular videos close to viewers and reduce load on origin storage.
- Prewarm cache for trending videos in high-demand regions.
Search Optimization:
- Incremental indexing for new uploads.
- Cache frequent search queries for fast responses.
Resilience:
- Add circuit breakers and rate limiting for service failures.
- Deploy services across multiple regions with active-active setups.
Analytics:
- Use real-time frameworks like Apache Flink for instant insights.
- Pre-aggregate metrics (e.g., views, likes) to speed up queries.
Engagement Service:
- Run separate database clusters (or partitions) for likes, comments, and subscriptions so each can scale independently.
- Allow eventual consistency for metrics like like counts.
Service Availability:
- Add multi-region load balancers for failover.
- Use service discovery tools for automatic rerouting.
Data Storage:
- Move old videos to cold storage (e.g., AWS Glacier).
- Use delta updates for metadata changes.
Advanced Features:
- Add AI-based recommendations for personalized content.
- Preload videos based on viewing patterns.