System requirements


Functional:

  1. Task creation - users should be able to create a task by specifying the task name, the execution time, and, if needed, a recurrence interval. A user interface, such as a web interface, could be provided for this.
  2. Task scheduling - the system should efficiently schedule tasks for execution based on the specified time and recurrence.
  3. Task execution - tasks should be executed accurately at the scheduled time.
  4. Task monitoring - users should be able to monitor the status of tasks, whether pending, completed, or failed.
  5. Task rescheduling - users should have the option to reschedule or cancel tasks that are already scheduled.
  6. Concurrency handling - the system should handle multiple tasks running concurrently without conflicts.
  7. Error handling - the system should have robust error handling mechanisms to deal with failures during task execution.
  8. Task persistence - task data should be stored persistently to ensure that scheduled tasks are not lost in case of system failure.
  9. Scalability - the system should be able to handle a large number of tasks efficiently without significant delay.
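
Requirements 4, 6, and 7 (monitoring, concurrency, and error handling) can be illustrated together. The following is only a minimal sketch under stated assumptions: tasks are plain Python callables, they run on a thread pool, and the status names follow requirement 4. The names `run_tasks` and `max_workers` are hypothetical and not part of the design above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict


def run_tasks(tasks: Dict[str, Callable[[], object]], max_workers: int = 4) -> Dict[str, str]:
    """Run tasks concurrently and return a final status per task name."""
    status = {name: "pending" for name in tasks}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn): name for name, fn in tasks.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                future.result()  # re-raises any exception raised by the task
                status[name] = "completed"
            except Exception:
                # A failing task is recorded and does not affect the others.
                status[name] = "failed"
    return status


def ok_task() -> str:
    return "done"


def failing_task() -> None:
    raise RuntimeError("simulated failure")


statuses = run_tasks({"report": ok_task, "cleanup": failing_task})
print(statuses)
```

A real deployment would more likely spread execution across worker processes or machines; the thread pool only demonstrates that one failure stays isolated from the other tasks.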


Non-Functional:

  1. Performance - The system should have low latency and be able to handle a large number of scheduled tasks efficiently. Let's say the system needs to support scheduling 10,000 tasks per minute, with each task starting no more than 1 second after its scheduled time.
  2. Reliability - The system should be highly reliable, ensuring that scheduled tasks are executed as expected without failures.
  3. Scalability - The system should be designed to scale horizontally to accommodate an increasing number of tasks over time. It should be able to scale to support up to 100,000 tasks per minute.
  4. Security - The system should have robust security measures in place to protect task data and prevent unauthorized access.
  5. Monitoring - The system should provide monitoring capabilities to track task execution, system performance, and resource utilization.
  6. Auditability - The system should have logging and auditing mechanisms to track task scheduling, task execution, and any system events.



Capacity estimation

Let's consider the following estimates for capacity and bandwidth:


  • Task Creation Frequency: Let's assume an average of 100 tasks are created per second.
  • Task Execution Frequency: Let's assume an average of 80 tasks are executed per second.
  • Task Data Size: Let's estimate each task's data to be around 1 KB in size.
  • Bandwidth: Let's assume an upper bound of about 1 MB/s of bandwidth consumption for task creation and execution combined.


Based on these estimates, we can calculate the required capacity and bandwidth for the Task Scheduler system:


  1. Capacity Estimation:
     • Task Creation Capacity: 100 tasks/second * 60 seconds = 6,000 tasks/minute
     • Task Execution Capacity: 80 tasks/second * 60 seconds = 4,800 tasks/minute
  2. Bandwidth Estimation:
     • Task Creation Bandwidth: 1 KB/task * 100 tasks/second = 100 KB/s = 0.1 MB/s
     • Task Execution Bandwidth: 1 KB/task * 80 tasks/second = 80 KB/s = 0.08 MB/s


With each task's data at about 1 KB and 6,000 tasks created per minute, the database receives roughly 6 MB of new task data per minute. It therefore needs to sustain a high volume of insert and retrieval operations and store the accumulated data efficiently.
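
The arithmetic above can be reproduced with a short script. This is just a sketch of the same estimates: it uses decimal units (1 MB = 1,000 KB), as the figures above do, and the daily storage number additionally assumes that every created task is retained, which the text does not state.

```python
TASKS_CREATED_PER_SEC = 100
TASKS_EXECUTED_PER_SEC = 80
TASK_SIZE_KB = 1
KB_PER_MB = 1000          # decimal units, matching 100 KB/s = 0.1 MB/s
SECONDS_PER_DAY = 24 * 60 * 60

created_per_min = TASKS_CREATED_PER_SEC * 60
executed_per_min = TASKS_EXECUTED_PER_SEC * 60

creation_mb_per_sec = TASKS_CREATED_PER_SEC * TASK_SIZE_KB / KB_PER_MB
execution_mb_per_sec = TASKS_EXECUTED_PER_SEC * TASK_SIZE_KB / KB_PER_MB

# Assumption: all created tasks are kept, so storage grows with creation rate.
storage_gb_per_day = (
    TASKS_CREATED_PER_SEC * TASK_SIZE_KB * SECONDS_PER_DAY / (KB_PER_MB * KB_PER_MB)
)

print(f"created/min:   {created_per_min}")
print(f"executed/min:  {executed_per_min}")
print(f"creation MB/s: {creation_mb_per_sec}")
print(f"execution MB/s: {execution_mb_per_sec}")
print(f"storage GB/day: {storage_gb_per_day:.2f}")
```

Both bandwidth figures together come to 0.18 MB/s, and 6,000 created tasks per minute sits below the 10,000 tasks per minute target named in the non-functional requirements.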



API design

createTask - creates a task

input: task name, code to be executed (either literal code passed in or a reference to a file with an entry point)

output: creation success or failure (HTTP 201 status code on success).


scheduleTask - schedules the task to run at a specific time

input: date and time at which to execute the task, name of the task to be executed

output: schedule success or failure (HTTP 201 status code on success).


executeTask - immediately executes the task by queuing it up.

input: task name

output: job id


rescheduleTask - reschedules the task for a different time to run.

input: job id, new date and time to execute the task

output: job id


listTasks - lists all tasks

input: none

output: list of tasks


updateTask - updates a task

input: task name, new code or a reference to a code file

output: update success or failure


deleteTask - deletes a task

input: task id

output: success or failure
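
To make the operations above concrete, here is a minimal in-memory sketch of the API surface. It is illustrative only, not the service implementation. Several details are assumptions: the class and method names, the 409 and 404 codes, returning the job id from `schedule_task` (so that a job can later be rescheduled), and passing the new time to `reschedule_task`.

```python
import itertools
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Tuple


@dataclass
class Task:
    task_id: int
    name: str
    code_ref: str  # literal code or a reference to a file with an entry point


@dataclass
class Job:
    job_id: int
    task_name: str
    run_at: datetime


class TaskApi:
    """In-memory stand-in for the operations listed above (hypothetical)."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Task] = {}
        self._jobs: Dict[int, Job] = {}
        self._ids = itertools.count(1)

    def create_task(self, name: str, code_ref: str) -> int:
        if name in self._tasks:
            return 409  # assumption: duplicate task names are rejected
        self._tasks[name] = Task(next(self._ids), name, code_ref)
        return 201

    def schedule_task(self, run_at: datetime, name: str) -> Tuple[int, Optional[int]]:
        if name not in self._tasks:
            return 404, None  # assumption: unknown tasks cannot be scheduled
        job = Job(next(self._ids), name, run_at)
        self._jobs[job.job_id] = job
        return 201, job.job_id  # assumption: the job id is returned as well

    def execute_task(self, name: str) -> Optional[int]:
        # "Immediately" is modelled as a job that is due right now.
        _, job_id = self.schedule_task(datetime.now(), name)
        return job_id

    def reschedule_task(self, job_id: int, run_at: datetime) -> int:
        self._jobs[job_id].run_at = run_at  # raises KeyError for unknown jobs
        return job_id

    def list_tasks(self) -> List[str]:
        return sorted(self._tasks)

    def update_task(self, name: str, code_ref: str) -> bool:
        if name not in self._tasks:
            return False
        self._tasks[name].code_ref = code_ref
        return True

    def delete_task(self, task_id: int) -> bool:
        for name, task in list(self._tasks.items()):
            if task.task_id == task_id:
                del self._tasks[name]
                # Cancel any jobs still queued for the deleted task.
                self._jobs = {j: job for j, job in self._jobs.items()
                              if job.task_name != name}
                return True
        return False


api = TaskApi()
created = api.create_task("backup", "backup.py:main")
status, job_id = api.schedule_task(datetime(2030, 1, 1, 3, 0), "backup")
moved = api.reschedule_task(job_id, datetime(2030, 1, 1, 4, 0))
print(created, status, moved == job_id, api.list_tasks())
```

In a real service each method would sit behind an HTTP endpoint and persist to the database instead of a dictionary.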


Database design


The database will store task metadata, execution logs, task status, and recurrence settings.


task metadata table

  • task name
  • task Id
  • execution time
  • code


execution logs table

  • task name
  • log
  • created


task status table

  • status (pending, in progress, completed, or failed)
  • task id
  • updated


recurring tasks table

  • task id
  • interval
  • start date
  • end date
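
The tables above could be expressed as a relational schema. The sketch below uses SQLite purely for illustration; no database engine is chosen in this design. The column types, the constraints, the `interval_seconds` unit, and the `due_tasks` helper are all assumptions, and the `failed` status value is taken from the task monitoring requirement.

```python
import sqlite3

SCHEMA = """
CREATE TABLE task_metadata (
    task_id        INTEGER PRIMARY KEY,
    task_name      TEXT NOT NULL UNIQUE,
    execution_time TEXT NOT NULL,      -- ISO 8601 timestamp
    code           TEXT NOT NULL       -- literal code or a file reference
);
CREATE TABLE execution_logs (
    task_name TEXT NOT NULL REFERENCES task_metadata(task_name),
    log       TEXT NOT NULL,
    created   TEXT NOT NULL
);
CREATE TABLE task_status (
    task_id INTEGER PRIMARY KEY REFERENCES task_metadata(task_id),
    status  TEXT NOT NULL
            CHECK (status IN ('pending', 'in progress', 'completed', 'failed')),
    updated TEXT NOT NULL
);
CREATE TABLE recurring_tasks (
    task_id          INTEGER PRIMARY KEY REFERENCES task_metadata(task_id),
    interval_seconds INTEGER NOT NULL,  -- assumption: interval kept in seconds
    start_date       TEXT NOT NULL,
    end_date         TEXT               -- NULL means no end date
);
"""


def due_tasks(conn: sqlite3.Connection, now: str) -> list:
    """Return names of pending tasks whose execution time has passed."""
    rows = conn.execute(
        """
        SELECT m.task_name
        FROM task_metadata AS m
        JOIN task_status AS s ON s.task_id = m.task_id
        WHERE s.status = 'pending' AND m.execution_time <= ?
        ORDER BY m.execution_time
        """,
        (now,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO task_metadata VALUES (1, 'backup', '2030-01-01T03:00:00', 'backup.py:main')"
)
conn.execute("INSERT INTO task_status VALUES (1, 'pending', '2029-12-31T00:00:00')")

print(due_tasks(conn, "2030-01-01T03:00:00"))
```

Storing timestamps as ISO 8601 strings in one consistent format lets plain string comparison order them correctly, which keeps the due-task query simple.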



High-level design



database

monitoring service

notification service

task scheduler service
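
One possible core for the task scheduler service is a min-heap ordered by run time. The sketch below is a hedged illustration, not the chosen design: `ScheduledTask`, `TaskScheduler`, and `run_due` are hypothetical names, the `notify` callback merely stands in for the notification service, and recurring intervals are assumed to be positive numbers of seconds.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(order=True)
class ScheduledTask:
    run_at: float  # seconds on some shared clock; the only field compared
    name: str = field(compare=False)
    action: Callable[[], object] = field(compare=False)
    interval: Optional[float] = field(default=None, compare=False)  # None = one-off


class TaskScheduler:
    def __init__(self, notify: Callable[[str], None]) -> None:
        self._heap: List[ScheduledTask] = []
        self._notify = notify  # stands in for the notification service

    def add(self, task: ScheduledTask) -> None:
        heapq.heappush(self._heap, task)

    def run_due(self, now: float) -> List[str]:
        """Run every task that is due at `now`; return the names that succeeded."""
        ran: List[str] = []
        requeue: List[ScheduledTask] = []
        while self._heap and self._heap[0].run_at <= now:
            task = heapq.heappop(self._heap)
            try:
                task.action()
                ran.append(task.name)
            except Exception as exc:
                self._notify(f"{task.name} failed: {exc}")
            if task.interval is not None:
                # Recurring tasks come back one interval later.
                task.run_at += task.interval
                requeue.append(task)
        # Re-queue after the loop so a task runs at most once per call.
        for task in requeue:
            heapq.heappush(self._heap, task)
        return ran


alerts: List[str] = []
scheduler = TaskScheduler(notify=alerts.append)
scheduler.add(ScheduledTask(10.0, "one-off", lambda: None))
scheduler.add(ScheduledTask(5.0, "heartbeat", lambda: None, interval=5.0))
scheduler.add(ScheduledTask(7.0, "broken", lambda: 1 / 0))

first = scheduler.run_due(now=10.0)
second = scheduler.run_due(now=10.0)
print(first, second, alerts)
```

The list returned by `run_due` is the kind of signal the monitoring service could consume. In the full system the heap would be rebuilt from the database after a restart so that scheduled tasks are not lost.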





Request flows







Detailed component design







Trade offs/Tech choices







Failure scenarios/bottlenecks







Future improvements
