Codemia | Master System Design Interviews Through Active Practice

My Solution for Design a Task Management Application

by nectar4678

System requirements

Functional:

Functional requirements define what the system should do. For a task management application, common functional requirements include:

Task Creation: Ability to create tasks with attributes such as title, description, due date, priority, and category.
Task Assignment: Functionality to assign tasks to individuals or teams.
Progress Monitoring: Features to update and track the status of tasks (e.g., not started, in progress, completed).
Notifications: Automatic reminders for upcoming deadlines and changes in task status.
Collaboration Tools: Capabilities like commenting on tasks, sharing files, and tagging team members.
Task Categorization: Organize tasks into categories or projects to enhance findability.
Search and Filter: Options to search for tasks and filter them by different criteria like due date, priority, or person responsible.

Non-Functional:

Non-functional requirements specify how the system should perform and operate. These often include:

Usability: The application should be intuitive and easy to use, with a clean, responsive user interface that adapts to different device types (mobile, desktop).
Scalability: Capable of scaling to handle a large number of users and tasks without degradation in performance.
Security: Robust authentication and authorization to ensure data protection and privacy, especially in team environments.
Reliability: High availability and minimal downtime, with data backup mechanisms in place.
Performance: Fast response times for user interactions, even under load.
Interoperability: Ability to integrate easily with other tools and platforms, such as calendars or email services.

Capacity estimation

Assumptions for Capacity Estimation

Here are typical assumptions we might consider for this application:

User Base: Assume an initial user base of 100,000 users.
Daily Active Users (DAUs): Assume 20% of the total user base, i.e., 20,000 DAUs.
Task Operations: Each active user performs about 10 task-related operations per day (creating, updating, or querying tasks).
Data Storage per Task: Assume each task, on average, requires about 1 KB of storage, including metadata for assignment, status, comments, and history.
Concurrent Users: At peak times, assume up to 5% of DAUs are concurrently active, which is 1,000 users.

Capacity Calculations

Based on these assumptions, we can estimate the following:

Transactions Per Day:

20,000 DAUs × 10 operations/user/day = 200,000 operations/day

Storage Requirements:

If each user creates an average of 5 tasks per day (part of their 10 operations):

20,000 DAUs × 5 tasks/user/day × 1 KB/task = 100,000 KB/day ≈ 100 MB/day

For one year:

100 MB/day × 365 days/year = 36,500 MB/year ≈ 36.5 GB/year

Bandwidth Estimation:

If each operation results in about 2 KB of data transferred (considering request and response together):

Daily data transfer:

200,000 operations/day × 2 KB/operation = 400,000 KB/day ≈ 400 MB/day

Concurrency and Load Handling:

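Peak concurrency estimate: 1,000 concurrent users. If each of them issues 10 operations within the same few minutes, the system sees a burst of about 10,000 operations, well above the daily average of roughly 2.3 operations per second (200,000 operations / 86,400 seconds).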

With these estimates, the system design needs to ensure that the database and application servers can handle the storage and transaction load, and that the network infrastructure can support the estimated bandwidth.
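
The arithmetic above can be reproduced with a short script. This is only a sketch of the estimate: the constant and function names are illustrative, and it uses decimal units (1 MB = 1,000 KB) as the figures above do.

```python
# Back-of-the-envelope capacity estimate from the stated assumptions.
# All names here are illustrative; units are decimal (1 MB = 1,000 KB).
TOTAL_USERS = 100_000
DAU_PERCENT = 20                 # 20% of users are active daily
OPS_PER_USER_PER_DAY = 10
TASKS_PER_USER_PER_DAY = 5       # part of the 10 operations
TASK_SIZE_KB = 1
TRANSFER_PER_OP_KB = 2           # request and response together
PEAK_CONCURRENCY_PERCENT = 5     # share of DAUs active at the same time
SECONDS_PER_DAY = 24 * 60 * 60


def estimate() -> dict:
    dau = TOTAL_USERS * DAU_PERCENT // 100
    ops_per_day = dau * OPS_PER_USER_PER_DAY
    storage_mb_per_day = dau * TASKS_PER_USER_PER_DAY * TASK_SIZE_KB / 1000
    return {
        "dau": dau,
        "ops_per_day": ops_per_day,
        "avg_ops_per_second": ops_per_day / SECONDS_PER_DAY,
        "storage_mb_per_day": storage_mb_per_day,
        "storage_gb_per_year": storage_mb_per_day * 365 / 1000,
        "bandwidth_mb_per_day": ops_per_day * TRANSFER_PER_OP_KB / 1000,
        "peak_concurrent_users": dau * PEAK_CONCURRENCY_PERCENT // 100,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value}")
```

Changing any single assumption (for example the DAU percentage) and rerunning shows how sensitive the storage and bandwidth figures are to it.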

API design

Create Task

Endpoint: POST /tasks
Description: Creates a new task.
Request Body:
{
  "title": "Design Database Schema",
  "description": "Create an initial database schema for the task management app",
  "due_date": "2024-06-01T12:00:00Z",
  "priority": "High",
  "category": "Development",
  "assignee_id": 12345
}
Response Body:
{
  "status": "success",
  "task_id": 1001,
  "message": "Task created successfully."
}

Update Task

Endpoint: PUT /tasks/{task_id}
Description: Updates an existing task.
Request Body:
{
  "title": "Update Database Schema",
  "description": "Refine the initial database schema based on feedback",
  "due_date": "2024-06-02T17:00:00Z",
  "priority": "Medium",
  "category": "Development",
  "assignee_id": 12346
}
Response Body:
{
  "status": "success",
  "task_id": 1001,
  "message": "Task updated successfully."
}

Get Task

Endpoint: GET /tasks/{task_id}
Description: Retrieves details about a specific task.
Response:
{
  "task_id": 1001,
  "title": "Update Database Schema",
  "description": "Refine the initial database schema based on feedback",
  "due_date": "2024-06-02T17:00:00Z",
  "priority": "Medium",
  "category": "Development",
  "assignee_id": 12346,
  "status": "In Progress"
}

Delete Task

Endpoint: DELETE /tasks/{task_id}
Description: Deletes a specific task.
Response:
{
  "status": "success",
  "task_id": 1001,
  "message": "Task deleted successfully."
}

List Tasks

Endpoint: GET /tasks
Description: Lists all tasks with filtering options.
Query Parameters: category, priority, assignee_id, status
Response:
{
  "tasks": [
    {
      "task_id": 1001,
      "title": "Update Database Schema",
      "description": "Refine the initial database schema based on feedback",
      "due_date": "2024-06-02T17:00:00Z",
      "priority": "Medium",
      "category": "Development",
      "assignee_id": 12346,
      "status": "In Progress"
    },
    {
      "task_id": 1002,
      "title": "Initial Draft for UI/UX",
      "description": "Sketch the initial UI/UX frames for the app",
      "due_date": "2024-05-25T15:00:00Z",
      "priority": "High",
      "category": "Design",
      "assignee_id": 12347,
      "status": "Not Started"
    }
  ]
}
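
As a rough illustration of how the GET /tasks filters could behave, the sketch below applies the four query parameters to an in-memory list. The Task dataclass, list_tasks function, and the rejection of unknown filters are assumptions made for this example, not part of the API definition above; a real service would push the filtering down into the database query.

```python
from dataclasses import dataclass, asdict


@dataclass
class Task:
    task_id: int
    title: str
    priority: str
    category: str
    assignee_id: int
    status: str


# The query parameters listed for GET /tasks.
FILTERABLE_FIELDS = ("category", "priority", "assignee_id", "status")

SAMPLE_TASKS = [
    Task(1001, "Update Database Schema", "Medium", "Development", 12346, "In Progress"),
    Task(1002, "Initial Draft for UI/UX", "High", "Design", 12347, "Not Started"),
]


def list_tasks(tasks, **filters):
    """Return tasks matching every supplied filter (assumed AND semantics)."""
    unknown = set(filters) - set(FILTERABLE_FIELDS)
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    matched = [
        task for task in tasks
        if all(getattr(task, field) == value for field, value in filters.items())
    ]
    return {"tasks": [asdict(task) for task in matched]}


if __name__ == "__main__":
    print(list_tasks(SAMPLE_TASKS, priority="High"))
```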

Database design

Entities and Attributes

User

user_id (PK): Unique identifier for the user.
username: User's chosen username.
email: User's email address.
password_hash: Hashed password for security.
created_at: Timestamp of user account creation.

Task

task_id (PK): Unique identifier for the task.
title: Title of the task.
description: Detailed description of the task.
due_date: Deadline for task completion.
priority: Priority level (e.g., High, Medium, Low).
status: Current status (e.g., Not Started, In Progress, Completed).
category_id (FK): ID of the category used to organize the task (e.g., Development, Design).
creator_id (FK): User ID of the task creator.
assignee_id (FK): User ID of the task assignee.

Comment

comment_id (PK): Unique identifier for the comment.
task_id (FK): Associated task's ID.
user_id (FK): ID of the user who made the comment.
text: Text content of the comment.
created_at: Timestamp when the comment was made.

Category

category_id (PK): Unique identifier for the category.
name: Name of the category.
description: Description of what the category is used for.

Relationships

Users to Tasks (Creator and Assignee): A user can create multiple tasks (one-to-many), and a task can be assigned to a user (many-to-one).
Tasks to Comments: A task can have multiple comments (one-to-many), and a comment is associated with one task.
Users to Comments: A user can make multiple comments (one-to-many), and each comment is made by one user.
Tasks to Categories: Each task can be associated with one category, while a category can encompass multiple tasks.
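
One way the entities and relationships above might translate into tables is sketched below, using Python's built-in sqlite3 module so it runs without a server. The table names, column types, CHECK constraints, and the category_id foreign key on tasks are assumptions for illustration; the chosen production database (see the trade-offs section) would use its own types.

```python
import sqlite3

# Illustrative schema only: names, types, and constraints are assumptions.
SCHEMA = """
CREATE TABLE users (
    user_id       INTEGER PRIMARY KEY,
    username      TEXT NOT NULL UNIQUE,
    email         TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL,
    created_at    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,
    description TEXT
);
CREATE TABLE tasks (
    task_id     INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    description TEXT,
    due_date    TEXT,
    priority    TEXT NOT NULL CHECK (priority IN ('High', 'Medium', 'Low')),
    status      TEXT NOT NULL DEFAULT 'Not Started'
                CHECK (status IN ('Not Started', 'In Progress', 'Completed')),
    category_id INTEGER REFERENCES categories(category_id),
    creator_id  INTEGER NOT NULL REFERENCES users(user_id),
    assignee_id INTEGER REFERENCES users(user_id)
);
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY,
    task_id    INTEGER NOT NULL REFERENCES tasks(task_id),
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    text       TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def create_database() -> sqlite3.Connection:
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    return conn
```

With foreign keys enforced, a comment that points at a non-existent task is rejected by the database rather than by application code.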

High-level design

Moving on to the high-level design, we will define the key components required for our task management application. This will include a breakdown of the architecture and the role of each component within the system. We'll also provide a block diagram to visually represent these relationships.

Key Components of the Task Management Application

Web Frontend

Purpose: Provides the user interface for interacting with the application. Built using modern web technologies such as React or Angular for a responsive and interactive experience.
Responsibilities: Handling user inputs, rendering tasks, and providing visual feedback based on user interactions.

Mobile Frontend

Purpose: Delivers a native or hybrid mobile application interface, allowing access from smartphones and tablets.
Responsibilities: Similar to the web frontend but optimized for mobile experiences, including touch interactions and mobile-specific functionalities.

API Gateway

Purpose: Serves as the single entry point for all client requests, directing them to the appropriate backend services. It helps in request routing, composition, and protocol translation.
Responsibilities: Authentication, authorization, request routing, load balancing, and API metering.

Task Management Service

Purpose: Core backend service responsible for managing all task-related operations.
Responsibilities: Task creation, modification, deletion, and querying tasks based on various criteria.

User Management Service

Purpose: Manages user profiles and authentication details.
Responsibilities: User registration, login, profile updates, and user authentication.

Notification Service

Purpose: Handles sending notifications to users about task updates, due dates, and other alerts.
Responsibilities: Integrating with email services, push notification services, and possibly SMS for sending timely updates.

Database

Purpose: Central storage for all application data, including tasks, users, comments, and categories.
Responsibilities: Data persistence, ensuring data integrity, and supporting high-volume data access efficiently.

Search and Reporting Service

Purpose: Provides advanced search capabilities across tasks and generates reports on user activity and task statuses.
Responsibilities: Indexing task data, supporting complex search queries, and generating performance reports.

Request flows

Request Flow: Creating a Task

User Interaction: A user logs into the mobile or web frontend and decides to create a new task.

Send Request to API Gateway: The frontend constructs an HTTP POST request with the task details in the JSON format and sends it to the API Gateway.

API Gateway Processing:

Authentication: The API Gateway checks the user's authentication token to verify if the user is logged in and authorized to perform the action.
Routing: Once authenticated, the request is routed to the Task Management Service.

Task Management Service Processing:

Validation: The Task Management Service validates the request data (e.g., checks if all required fields are present and correctly formatted).
Persistence: If the validation is successful, the service interacts with the Database to store the task details.
Confirmation: Once the task is successfully saved, the service sends a confirmation response back to the API Gateway.

Notification Service Trigger:

Event Subscription: The Task Management Service publishes an event (task creation) which the Notification Service is subscribed to.
Notification Dispatch: The Notification Service processes the event and sends an email or push notification to the user (and potentially to the assigned team members), confirming the creation of the task.

Response to User:

API Gateway: Receives the success confirmation from the Task Management Service and forwards it to the appropriate frontend.
User Notification: The frontend receives the confirmation response and displays a success message to the user. Simultaneously, the user receives a notification via email or push notification.
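
The flow above can be sketched in a few classes. Everything here is a simplified stand-in: EventBus, TaskService, ApiGateway, the "task.created" topic name, the token table, and the set of required fields are all hypothetical, and the in-process bus delivers events synchronously where a real deployment would publish to a message queue.

```python
from typing import Callable

# Assumed set of required fields; the real validation rules may differ.
REQUIRED_FIELDS = ("title", "due_date", "priority")


class EventBus:
    """In-process publish/subscribe stand-in for a message broker."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = {}

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._subscribers.get(topic, []):
            handler(event)


class TaskService:
    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self._tasks: dict[int, dict] = {}   # stands in for the database
        self._next_id = 1001

    def create_task(self, user_id: int, payload: dict) -> dict:
        # Validation: every required field must be present and non-empty.
        missing = [f for f in REQUIRED_FIELDS if not payload.get(f)]
        if missing:
            return {"status": "error", "message": f"Missing fields: {', '.join(missing)}"}
        # Persistence.
        task_id = self._next_id
        self._next_id += 1
        self._tasks[task_id] = {**payload, "task_id": task_id,
                                "creator_id": user_id, "status": "Not Started"}
        # Event for the Notification Service.
        self._bus.publish("task.created", {"task_id": task_id, "creator_id": user_id})
        return {"status": "success", "task_id": task_id,
                "message": "Task created successfully."}


class ApiGateway:
    def __init__(self, task_service: TaskService, tokens: dict[str, int]) -> None:
        self._task_service = task_service
        self._tokens = tokens               # token -> user_id, stands in for real auth

    def post_tasks(self, token: str, payload: dict) -> dict:
        user_id = self._tokens.get(token)
        if user_id is None:
            return {"status": "error", "message": "Unauthorized"}
        return self._task_service.create_task(user_id, payload)
```

A subscriber registered with bus.subscribe("task.created", handler) plays the role of the Notification Service; it is only invoked when validation and persistence succeed.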

Detailed component design

1. Task Management Service

Scalability

Horizontal Scaling: The service can be scaled horizontally by adding more instances behind a load balancer to handle increased load, ensuring that multiple instances can manage read and write operations efficiently.
Caching: Implement caching strategies (e.g., Redis) for frequently accessed data like task lists and task details to reduce database load and improve response times.
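
A common way to apply the caching idea is the cache-aside pattern: read from the cache first, fall back to the database on a miss, and invalidate on writes. The sketch below uses a plain dictionary in place of Redis so it is self-contained; the TaskCache class, its TTL default, and the load_from_db callback are hypothetical names for this example.

```python
import time


class TaskCache:
    """Cache-aside with a time-to-live; a dict stands in for Redis here."""

    def __init__(self, load_from_db, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._load = load_from_db       # called on a cache miss
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}              # task_id -> (expires_at, value)

    def get(self, task_id):
        now = self._clock()
        entry = self._entries.get(task_id)
        if entry is not None and entry[0] > now:
            return entry[1]             # fresh cache hit
        value = self._load(task_id)     # miss or expired: go to the database
        self._entries[task_id] = (now + self._ttl, value)
        return value

    def invalidate(self, task_id) -> None:
        """Call after an update or delete so readers do not see stale data."""
        self._entries.pop(task_id, None)
```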

Data Structures

Trees for Hierarchical Tasks: Use tree data structures to manage tasks that have sub-tasks, providing an efficient way to traverse and manage hierarchical relationships.
Hash Tables for Quick Lookup: Utilize hash tables to store task metadata for O(1) average-time complexity on lookup operations, which is beneficial for operations like searching tasks by ID.

Algorithms

Sorting Algorithms: Implement sorting algorithms for organizing tasks by priority, due date, or custom user-defined filters. Quick sort or merge sort can be used for backend sorting due to their efficiency in handling large data sets.
Graph Algorithms for Dependency Resolution: If tasks have dependencies (e.g., Task B cannot start until Task A is completed), graph algorithms like topological sorting can be employed to determine the order of task completions.
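
For the dependency case, Python's standard library already provides topological sorting through graphlib (Python 3.9+). The sketch below assumes dependencies are given as a mapping from each task to the set of tasks that must finish first; the completion_order function name and that input shape are choices made for this example.

```python
from graphlib import CycleError, TopologicalSorter


def completion_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which tasks can be completed.

    `dependencies` maps a task to the tasks it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    # Task B cannot start until Task A is completed; Task C waits on Task B.
    print(completion_order({"B": {"A"}, "C": {"B"}}))
    try:
        completion_order({"A": {"B"}, "B": {"A"}})
    except CycleError as exc:
        print("circular dependency:", exc.args[1])
```

Detecting cycles at the time a dependency is added is usually preferable to discovering them when an order is requested.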

2. Notification Service

Scalability

Queue-Based Architecture: Implement a message queue (e.g., RabbitMQ, Kafka) for handling notification requests, which allows the service to buffer incoming notifications during peak times and process them asynchronously.
Microservice Design: Treat the notification functionality as a separate microservice, which can be independently scaled based on demand.

Algorithms

Batch Processing: Employ batch processing algorithms to aggregate multiple notifications into a single message where appropriate (e.g., daily summaries), reducing the number of messages sent and optimizing resource usage.
Priority Queue for Critical Alerts: Use a priority queue to manage and prioritize sending out notifications based on urgency, ensuring that critical alerts are sent out before others.
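
A minimal sketch of the priority queue idea, using the standard heapq module. The NotificationQueue class and the convention that a lower number means higher urgency are assumptions for this example; a counter is added so that notifications with equal priority leave in the order they arrived.

```python
import heapq
import itertools


class NotificationQueue:
    """Priority queue where a lower priority number is sent first."""

    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()   # tie-breaker keeps FIFO order per priority

    def push(self, message: str, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), message))

    def pop(self) -> str:
        _priority, _order, message = heapq.heappop(self._heap)
        return message

    def __len__(self) -> int:
        return len(self._heap)


if __name__ == "__main__":
    queue = NotificationQueue()
    queue.push("Weekly summary", priority=5)
    queue.push("Task overdue", priority=0)
    queue.push("Comment added", priority=3)
    while queue:
        print(queue.pop())
```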

Data Structures

Linked Lists for Message Queueing: Use linked lists for implementing message queues in custom queue management solutions, providing efficient queue operations.
Hash Maps for Subscriber Lists: Maintain hash maps for storing subscriber preferences and details, enabling quick access and updates.

Potential for Expansion and Improvement

Task Management Service: Implement machine learning algorithms to predict task completion times based on historical data, improving task planning and resource allocation.
Notification Service: Expand the service to include more complex event processing capabilities for personalized and context-aware notifications.

Trade offs/Tech choices

1. Microservices Architecture vs. Monolithic Architecture

Choice: Microservices Architecture

Reasons:

Scalability: Microservices allow specific components of the application, such as the Task Management Service and Notification Service, to scale independently based on demand.
Flexibility in Development: Different teams can work on different services using the best technology stack for their specific needs.
Trade-offs:
Complexity: Increased complexity in deployment and inter-service communication.
Overhead: Additional overhead for handling distributed data management and API gateway setup.

2. Using a Message Queue for Notifications

Choice: Implement a message queue (e.g., RabbitMQ, Apache Kafka) for the Notification Service.

Reasons:

Reliability: Ensures that notification messages are not lost during system failures.
Asynchronous Processing: Helps in decoupling the service that generates the event from the notification dispatch mechanism.
Trade-offs:
Latency: Introducing a message queue can add a slight delay in notification delivery compared to direct sending mechanisms.
Maintenance: Requires additional setup and maintenance efforts.

3. Database Selection (SQL vs. NoSQL)

Choice: SQL Database (e.g., PostgreSQL)

Reasons:

ACID Compliance: Ensures reliable and consistent transactions which are crucial for task and user management.
Schema Enforcement: Helps in maintaining data integrity as the application scales.
Trade-offs:
Scalability Concerns: SQL databases are typically harder to scale horizontally, particularly for writes, than NoSQL databases.
Flexibility: Less flexible in accommodating unstructured data or rapidly changing schemas.

Failure scenarios/bottlenecks

1. Database Downtime

Impact: Loss of access to task, user, and comment data, which would halt most of the application operations.
Mitigation:
High Availability Setup: Use a primary-secondary database replication setup to ensure that if the primary database fails, the secondary can take over with minimal downtime.
Regular Backups: Implement automated backups and test restore procedures regularly.

2. API Gateway Failure

Impact: A single point of failure that could prevent all client requests from reaching backend services.
Mitigation:
Redundancy: Deploy multiple instances of the API Gateway across different physical servers or cloud regions.
Health Checks and Auto-Scaling: Utilize cloud provider services for health checks and auto-scaling to replace or scale instances automatically in case of failure.

3. Service Overload

Impact: Task or Notification Service could become overwhelmed by high request volumes, leading to slow response times or timeouts.
Mitigation:
Load Balancing: Distribute incoming traffic evenly across multiple service instances.
Rate Limiting: Implement rate limiting to prevent any single user or service from consuming too many resources.
Circuit Breaker Pattern: Use the circuit breaker pattern to temporarily stop sending requests to a service once its failures reach a threshold, so that one failing service does not trigger cascading failures.
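
The circuit breaker can be sketched as a small wrapper around outgoing calls. This is a simplified illustration: the CircuitBreaker and CircuitOpenError names, the default threshold, and the reset timeout are assumptions, and production libraries add features such as per-exception policies and metrics.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock=time.monotonic) -> None:
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None          # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open on reaching the threshold, or re-open if the trial call failed.
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers fail fast instead of waiting on timeouts, which gives the overloaded service time to recover.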

4. Network Latency and Outages

Impact: High network latency or outages can severely impact the performance and user experience, especially for cloud-hosted components.
Mitigation:
Content Delivery Network (CDN): Use a CDN to serve static content closer to users to reduce latency.
Network Performance Monitoring: Implement monitoring tools to proactively detect and address network issues.

5. Concurrency Issues

Impact: Issues such as race conditions or deadlocks could occur in the Task Management Service when multiple users attempt to modify a task simultaneously.
Mitigation:
Optimistic Locking: Use optimistic locking where each transaction checks if another has modified the data before it completes its operation.
Transaction Management: Ensure robust transaction management in the database to handle deadlocks gracefully.
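
Optimistic locking is often implemented with a version column: an update only succeeds if the row still carries the version the client originally read. The sketch below uses sqlite3 for a self-contained demonstration; the version column, the simplified tasks table, and the open_store and update_status helpers are assumptions for this example.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory table with one task at version 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks ("
        " task_id INTEGER PRIMARY KEY,"
        " status TEXT NOT NULL,"
        " version INTEGER NOT NULL DEFAULT 1)"
    )
    conn.execute("INSERT INTO tasks (task_id, status) VALUES (1001, 'Not Started')")
    conn.commit()
    return conn


def update_status(conn: sqlite3.Connection, task_id: int,
                  new_status: str, expected_version: int) -> bool:
    """Apply the update only if nobody changed the row since it was read."""
    cursor = conn.execute(
        "UPDATE tasks SET status = ?, version = version + 1"
        " WHERE task_id = ? AND version = ?",
        (new_status, task_id, expected_version),
    )
    conn.commit()
    return cursor.rowcount == 1     # 0 rows means a concurrent update won


if __name__ == "__main__":
    conn = open_store()
    # Two users both read version 1, then try to write.
    print(update_status(conn, 1001, "In Progress", expected_version=1))
    print(update_status(conn, 1001, "Completed", expected_version=1))
```

When the update reports a conflict, the client would typically reload the task, show the newer state, and let the user retry.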

Future improvements

1. Artificial Intelligence and Machine Learning Integration

Purpose: Integrate AI and ML to enhance various aspects of the application, such as task prioritization, user behavior prediction, and automated task categorization.
Benefits: Improve user productivity with personalized suggestions and automate routine operations.
Implementation: Begin with simple models for predictive analytics based on user interaction data and evolve to more complex algorithms for behavior prediction and task management optimization.

2. Advanced Search Capabilities

Purpose: Implement advanced search features that allow users to query tasks using natural language processing (NLP).
Benefits: Make the search function more intuitive and robust, improving user experience.
Implementation: Use NLP libraries to parse user queries and convert them into database queries, potentially integrating with a search engine such as Elasticsearch for better indexing and search capabilities.