System requirements
Functional:
- Task creation - create/delete
- Task execution - start/stop
- Tasks orgnization - one task can be executed after the completion of previous task
- Task prioritization - critical tasks can be executed ahead of less critical tasks
- User notification - completion, failure, retrial, etc.
- Latency configurable - User can define an acceptable latencies for the task execution.
Non-Functional:
- Fault Tolerance. Implement mechanisms for seamless failover to handle master or worker node failures without service disruption
- Scalibity. System should be able handle thousands of tasks being executed concurrently.
- Reliability. System should be highly reliable to make the tasks executed as expected.
Capacity estimation
- Task creation 1000/s
- Task execution 10000/s
- Concurrent tasks assume each task take 5 mins, in total 3 million tasks raunning concurrently
- Average size of data that each task created is 1 kb, so it would be 10 MB data created per second and 10 * 60 * 60 * 24 = 86400MB = 864 GB per day
- If we store data for 2 years, it would be 864 * 365 * 2 = 631 TB
- Bandwidth usage: 10 MB * 8 * 3 = 240 Mbps
API design
Login
Post /api/v1/user/login
Params userName/email, password
Return status, accessToken, refreshToken
Task creation
Post /api/v1/task/create
Params accessToken, taskName, taskDesc, maxRetries, userId, priority, associations (upstream or downstream of current task), payload, code to be executed (passed through page, could be a sql query or a code snippet)
Task execution
Get /api/v1/task/execute
Params accessToken, taskId, dateTime
Task cancelation
Get /api/v1/task/cancel
Params accessToken, taskId
Database design
Task metadata table
taskId, taskName, payload, maxRetries, priority, associations, taskDesc, codeSnippet, createTime.
Task execution table
taskId, startTime, stopTime, duration, status, errorMessage
Based on our requirements, there would be 3 million tasks executing concurrently, so I would prefer Non-relational database (Cassandra or MongoDB) over Relational database for our design
To futher speed up the processing time and ensure minimal latency, we can store the data into Cache such as Redis or Memcache.
We also introduced Elastic Search to basically storing the log data in terms of visualizing through Kibana, so we can easily check the logs and find out the issues.
High-level design
Show as diagram
Auth Service - Handling token generation
Scheduling Service - Receive and process task creation and execution requests, persist the metadata into database and publish the task to the Message Queue
Execution Service - Consume the tasks from Message queue, process it based on priorities and dependencies.
Notification Service - Notify users with exception logs If the task failed
Sync Service - Managing the real-time updates between database and cache
Monitor Service - Gather system metrics, performance metrics and reports to Grafana
Request flows
Authentication flow
User made a request -> Rate limiter check the limits -> forward to Auth Service if passed -> generate token and return
Task creation flow
User created a task -> Rate limiter check the limits -> Load balancer routes to Scheduling service if passed -> token is valid, persist the metadata into Database
Task execution flow
User started a task -> Rate limiter check the limits -> Load balancer routes to Scheduling service if passed -> token is valid, push the task to the queue, update the status to executing
Executor service take the task and put into a internal priority queue -> poll out the priority one and check dependencies -> depends on other task, check the status of the upper task -> start processing if upper task finished, otherwise make this task on hold and process the other one.
Detailed component design
1.At most once
We should set up the message queue to make the message consumed at most once, that means one same task would be executed only once by a same consumer.
2.Manage priorities
In one node: once message consumed by the Executor service, it would be sent into a priority queue. If available service would poll the message out and process it based on its priorities.
Multiple nodes: each service does not aware of the priorities for others, so their would be the possibilities that low priority tasks executed before the high priority one.
We can leverage Redis ZSet to make the tasks ordered by its priorities, Execution service would fetch the status from Redis before starting processing the tasks.
3.Manage dependencies
Before processing a task, Executor service would validate the dependencies first, and check whether the dependencies finished or not. If haven't yet finished, make the current task on hold, otherwise process the task.
Multiple Execution service would ensure the sequence of the execution by fetching the data from Redis
Trade offs/Tech choices
Consistency or Availability
Make sure a task executed successfully even if some error happens is more important than strong consistency, it's ok that some users might see the updates of tasks a bit later, so we prefer availability over consistency.
Failure scenarios/bottlenecks
1.Task execution failure
Due to consumer crashes - Failed tasks would be redelievered to the message queue such that other consumers can take it and consume again.
Due to logic error - Failed tasks would be retried certain times based on the configuration, if still failed, update the status and notify the user.
Future improvements
1.For any potential issues users can investigate through Kibana
2.We can build a powerful monitor service to be able to visualize the system metrics and aware of the potential issue in adavance