System requirements
Functional:
MVP*
Users should be able to submit tasks
Users should be able to query the status of their tasks
tasks scheduling can be one-time off or recurring
Others*
We can assign priority to each task. High priority task should be processed with higher priority
Non-Functional:
Durable, once a job is submitted, our system ensures the task will never get lost
Reliable, the job must be executed at least once
Scalable, we should be able to scale up our system
Available, our system should be highly available, so users can submit tasks at any time.
For ex. 99.99% Availability (1 min downtime per week)
Capacity estimation
1 million DAU, each user submits 5 tasks per day
5 000 000 tasks / day,
86400 sec => ~ 1000 00 sec / day
QPS => 50 tasks / sec, peak QPS -> 5 * QPS => 250 peak QPS
Each task, 1kb record per task,
data size: 5 GB / day
API design
submit_task(user_id, schedule, code_location). Schedule can use cron schedule
query_task_status(task_id)
list_user_tasks(user_id)
delete_task(task_id)
Database design
task_metadata_table
task_id, user_id, creation_timestamp, schedule, code_location
partitioned by user_id
task_execution_table
execution_id, task_id, next_scheduled_runtime, creation_timestamp, last_update_timestamp, status, worker_id.
partitioned by next_scheduled_runtime
status can be "NOT STARTED", "ENQUEUED", "STARTED", "SUCCESS", "FAILED"
We can use NoSQL DB for better scaling
How to use these tables
Our system will use the task_execution_table to get the list of jobs that should be executed now, it can send a query like
select task_id from task_execution_table where next_scheduled_runtime > TIMESTAMP_SUB(current_timestamp(), INTERVAL 5 MIN) and status = "NOT STARTED"
list_user_tasks API can query task_metadata_table to get user_tasks relationships
submit_task will create entries in task_metadata_table and task_execution_table
query_task_status API can query task_execution_table to get the execution status about the job
delete_task API will update the status column execution_table and put a CANCELED status. But this can only be done if the job has not been executed yet.
High-level design
User uses web client to call submit_task API that submit tasks to our system.
We will store the task metadata and execution metadata in the DB
We have a scheduling service that regularly scan the DB and fetch the list of jobs that need to be executed
The scheduler in scheduling service will assign query ranges (bucket_ids) to workers, so each worker knows how to query the task execution table. (basically search those tasks with "NOT_STARTED" status, fall in the search range and bucket)
The worker will put the task into a queue, and update the task status to "ENQUEUED"
On the other side, we have execution service that polls messages from the queue.
It will fetch the code from remote repo, update the task status to "STARTED" and then start execute it
We have a monitoring service that regularly send heartbeat check to those workers. The service also maintains a mapping between execution job and worker_id
if any worker died unexpectedly, failed in 5 consecutive pings for ex, the monitoring service will swap the instance with a new one, then assign those tasks from old task to new task, so it can start execution
If any task failed to execute after 5 attempts, we will move the execution into a DLQ
When we work on the message from the queue, we will remove them from the queue, so no other workers will work on the same task. It also means that we must either finish the task or put them into DLQ once we polled them from queue
If the task is successfully executed, we will update the task execution table and mark the status of success
For scaling, since we used a queue to decouple the execution workers and scheduler workers, so we can scale them separately. For the message in the queue, we can partition them by execution id, so tasks will be evenly distributed
For high priority tasks, we can maintain a dedicated queue/resources for processing high priority tasks.
For message queues, we can use SQS/PubSub or Kafka
For DB, we can use NoSQL DB like Cassandra, ScyllaDB
Request flows
Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?