Codemia | Master System Design Interviews Through Active Practice

My Solution for Design a Fitness Tracking App with Score: 8/10

by nebula_jubilee499

System requirements

The assumption here is that users enter their data manually into the application to record their diet, physical activity, and other relevant metrics.

Functional:

Register users with service
Allow users to define and enable notifications/reminders for daily events tied to goals
Allow users to record/track goals for their personal benefit
Physical activity goals should be based on some frequency within a defined interval, intensity and duration per session
Diet goals should be based on consumption of nutrients and related metrics (calories, protein, etc.)
Historical data must be retained so that users can track goal progress and overall fitness and review this information at a later date
Likely a mobile application developed for iOS and Android
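
To make the physical activity requirement concrete, here is a minimal sketch of how a goal based on frequency, intensity, and duration could be checked against logged sessions. The class names, field names, and units are assumptions for illustration, not part of the original design.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class ActivityGoal:
    sessions_per_interval: int   # required frequency within the interval
    interval_days: int           # interval length, e.g. 7 for a weekly goal
    min_intensity: float         # hypothetical unit, e.g. km/h for running
    min_duration_minutes: int    # minimum duration per session


@dataclass
class Session:
    day: date
    intensity: float
    duration_minutes: int


def goal_met(goal: ActivityGoal, sessions: List[Session], interval_start: date) -> bool:
    """Return True if enough qualifying sessions fall inside the interval."""
    interval_end = interval_start + timedelta(days=goal.interval_days)  # exclusive
    qualifying = [
        s for s in sessions
        if interval_start <= s.day < interval_end
        and s.intensity >= goal.min_intensity
        and s.duration_minutes >= goal.min_duration_minutes
    ]
    return len(qualifying) >= goal.sessions_per_interval
```

A session counts only if it meets both the intensity and duration thresholds, which is one possible reading of the requirement; a real implementation might track partial credit instead.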

Non-Functional:

Scalability

Fitness apps should be usable by anyone, anywhere, with data tracked on the device and sent to our backend over the internet when a connection is available. The service will therefore require horizontal scaling to accommodate a growing user base across many countries.

We can assume the application has between 50 and 100 million daily users.

Availability

Our data should be readily available for users to query, especially when they review their goals or historical metrics.

Possible Error cases to consider:

Network failures
Datacenter outages

Capacity estimation

As mentioned in the previous segment, the expected number of daily users for this application can be between 50 - 100 million users.

The number of requests handled per second, however, covers a variety of events: goal setting, physical activity reports, user registrations, requests for historical data, updates to specific goals for progress or shifting goalposts, and so on.

We estimate an average of about 20 million requests per second.

Incoming requests will typically be small, carrying common fields such as the user_id, goal_id, and date range, which sum to roughly 50 bytes per request for the largest requests.

This means the average amount of incoming data will be roughly 1 GB per second (20 million requests × 50 bytes).
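
As a quick check of the arithmetic, the sketch below takes the 20 million requests per second and 50 bytes per request stated above at face value and ignores protocol overhead such as HTTP headers. The constant and function names are illustrative only.

```python
# Back-of-envelope ingress estimate using the figures stated in the text.
REQUESTS_PER_SECOND = 20_000_000   # average request rate assumed above
BYTES_PER_REQUEST = 50             # payload size of the largest requests
DAILY_USERS_HIGH = 100_000_000     # upper end of the daily-user range
SECONDS_PER_DAY = 86_400


def ingress_bytes_per_second(rps: int, request_bytes: int) -> int:
    """Payload bytes arriving per second, ignoring protocol overhead."""
    return rps * request_bytes


def requests_per_user_per_day(rps: int, daily_users: int) -> int:
    """Average number of requests each daily user would generate."""
    return rps * SECONDS_PER_DAY // daily_users


per_second = ingress_bytes_per_second(REQUESTS_PER_SECOND, BYTES_PER_REQUEST)
per_day = per_second * SECONDS_PER_DAY
per_user = requests_per_user_per_day(REQUESTS_PER_SECOND, DAILY_USERS_HIGH)

print(f"{per_second / 1e9:.1f} GB/s, {per_day / 1e12:.1f} TB/day, "
      f"{per_user} requests per user per day")
```

Under these figures each daily user would average 17,280 requests per day, so the request-rate estimate may be worth sanity-checking against expected usage.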

API design

On the API side, WebSockets are an option but are not strictly required: small sets of data such as current goals can be stored on the user's device and used to generate notifications locally, rather than adding traffic on our end. REST APIs are therefore perfectly serviceable for our needs. The following calls will be used:

register_user(username, password (hashed), date_of_creation, device_type) (device_type is for special cases such as betas in the future)

set_fitness_goal(user_id, activity_type, interval_frequency, intensity, duration, notify)

Note: the last three items can be grouped into a nested JSON object for simpler handling

set_dietary_goal(user_id, dietGoalJSON) (dietGoalJSON can contain information such as frequency of meals, intake of nutrients, time range, and so on)

start_workout(user_id, exercise_type, start_time) -> Returns some id that will be used to mark the end of a session

end_workout(session_id, end_time)

update_workout_goal(goal_id, fitnessGoalJSON)

update_dietary_goal(goal_id, dietGoalJSON)

progress_dietary_goal(goal_id

send_workout_progress(session_id, statsJSON)

statsJSON contains details such as the current date, current time, intensity, and duration, depending on the workout type. This data is forwarded to the daily info table, where it is aggregated into daily data and passed to a data lake for storage and future review.
Note that this call is made automatically at set intervals from the registered device rather than triggered manually.

get_historical_workout_data(user_id, start_date, end_date, workout_type(optional))

get_historical_dietary_data(user_id, start_date, end_date, data_type(optional))
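
As an illustration of the nested-JSON note on set_fitness_goal, the sketch below builds a request body in which the last three parameters (intensity, duration, notify) are grouped into one nested object. The class names, the "details" key, and the units are hypothetical; the original only lists the parameter names.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class GoalDetails:
    # The three parameters the note suggests nesting together.
    intensity: str
    duration_minutes: int   # unit is an assumption
    notify: bool


@dataclass
class SetFitnessGoalRequest:
    user_id: str
    activity_type: str
    interval_frequency: str
    details: GoalDetails    # hypothetical key for the nested object


def to_request_body(request: SetFitnessGoalRequest) -> str:
    """Serialize the request, including the nested object, to a JSON string."""
    return json.dumps(asdict(request), sort_keys=True)


body = to_request_body(
    SetFitnessGoalRequest(
        user_id="u-123",
        activity_type="running",
        interval_frequency="3_per_week",
        details=GoalDetails(intensity="moderate", duration_minutes=30, notify=True),
    )
)
```

Grouping the optional, goal-specific fields keeps the top-level call signature stable if more per-session settings are added later.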

Database design

Consistency is less of a concern here, since network failures and other error states can already leave data unreported or incorrect for some period until the error is resolved or the user corrects it manually. Availability of the data for quick reads, however, is critical. In addition, given the high volume and variety of incoming data, the store should tolerate partitions and scale out to absorb a surge of new users.

MongoDB would therefore be a good option, splitting our data into documents that can be read quickly.

The following documents will be used:

User Document:

user_id
username
password (hashed)
creation_date
device_type(optional)

Physical Fitness Goal Document

goal_id
user_id
fitnessGoalJSON
intensity
duration
interval_frequency
activity_type

Dietary Goal Document

goal_id
user_id
dietaryGoalJSON (smaller object groupings inside)
goal_weight
nutrient_type
intake_per_day

Fitness Session Summary Document

session_id
user_id
session_type
intensity (speed/weight lifted)
start_time
end_time

Fitness Session Progress Document

session_id
session_type
timestamp
current intensity/progress
event_type
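
The relationship between the last two documents can be sketched as a roll-up: the progress events for one session are reduced to a single summary document. The dictionary keys follow the field lists above, while the use of epoch-second timestamps and a mean for the summary intensity are assumptions made for the example.

```python
from statistics import mean
from typing import Dict, List


def summarize_session(user_id: str, events: List[Dict]) -> Dict:
    """Reduce one session's progress events to a session summary document."""
    if not events:
        raise ValueError("a session needs at least one progress event")
    if len({e["session_id"] for e in events}) != 1:
        raise ValueError("all events must belong to the same session")

    # Events may arrive out of order (e.g. after an offline period), so sort first.
    ordered = sorted(events, key=lambda e: e["timestamp"])
    return {
        "session_id": ordered[0]["session_id"],
        "user_id": user_id,
        "session_type": ordered[0]["session_type"],
        "intensity": mean(e["intensity"] for e in ordered),  # assumed aggregate
        "start_time": ordered[0]["timestamp"],
        "end_time": ordered[-1]["timestamp"],
    }
```

The progress document does not carry a user_id in the design above, so the sketch takes it as a separate argument.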

High-level design

Request flows

See sequence diagram

Detailed component design

The API design for setting and updating goals, whether fitness or dietary, deserves further detail.

As mentioned in the API design segment, the structure for an example dietary goal would look roughly as follows:

{
  user_id: 111111122222223333,
  dietGoal: {
    proteinGoal: {
      intake: "100g",
      frequency: "daily"
    },
    weight_goal: "150 lbs"
  }
}

As can be seen, the object holds many minor subgoals for specific nutrients while also keeping overarching goals such as weight for reference. Because the schema is inconsistent as a result, MongoDB's document-and-collection structure serves as a simpler way of storing such information.
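
To show why a flexible schema is convenient here, the following sketch reads nutrient subgoals out of goal documents that do not all share the same fields. It assumes each nutrient subgoal is a nested object with an "intake" field and that overarching goals such as weight_goal sit beside them as plain values; the function name and the second sample document are invented for illustration.

```python
from typing import Dict


def nutrient_targets(goal_doc: Dict) -> Dict[str, str]:
    """Collect the intake target of every nutrient subgoal in a diet goal document."""
    targets = {}
    for name, value in goal_doc.get("dietGoal", {}).items():
        # Nutrient subgoals are nested objects; plain values such as
        # weight_goal are overarching goals and are skipped.
        if isinstance(value, dict) and "intake" in value:
            targets[name] = value["intake"]
    return targets


protein_only = {
    "user_id": 111111122222223333,
    "dietGoal": {
        "proteinGoal": {"intake": "100g", "frequency": "daily"},
        "weight_goal": "150 lbs",
    },
}
# A second user tracks different nutrients and no weight goal at all.
carbs_and_fiber = {
    "user_id": 42,
    "dietGoal": {
        "carbGoal": {"intake": "250g", "frequency": "daily"},
        "fiberGoal": {"intake": "30g", "frequency": "daily"},
    },
}
```

Both documents can live in the same collection without a schema migration, which is the property the paragraph above relies on.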

Trade offs/Tech choices

One of the biggest trade-offs made in this design was the use of MongoDB in place of a traditional RDBMS.

MongoDB was chosen for its high availability and partition tolerance. Given that the daily logs for a given consumer can consist of thousands of session logs alone, storage consumption adds up and would force a traditional RDBMS to be repartitioned more and more frequently over its lifetime. MongoDB, by contrast, can distribute data across its shards and redundant replicas while keeping it easily accessible.

This comes at the cost of consistency. When reads are served from replicas, MongoDB is eventually consistent, so an incoming event is not immediately readable everywhere. Historical lookups therefore need a buffer period: apart from session data logged on the user's own device, historical data will not be visible right up to the present until that buffer has elapsed.
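
One way to apply such a buffer, sketched under assumptions, is to clamp the end of every historical query to "now minus the buffer" so that only data old enough to have settled is requested. The five-minute value and the function name are hypothetical; the original does not specify a buffer length.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

READ_BUFFER = timedelta(minutes=5)  # hypothetical buffer length


def queryable_range(
    start: datetime,
    end: datetime,
    now: datetime,
    buffer: timedelta = READ_BUFFER,
) -> Optional[Tuple[datetime, datetime]]:
    """Clamp a requested date range so it never reaches into the buffer period.

    Returns None when the whole range is still inside the buffer.
    """
    cutoff = now - buffer
    clamped_end = min(end, cutoff)
    if clamped_end <= start:
        return None
    return (start, clamped_end)
```

Data newer than the cutoff would then come from the session log held on the device, as described above.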

Failure scenarios/bottlenecks

The first major failure scenario is a loss of network connectivity on a device that is tracking an ongoing session. This tends to happen during hikes and similar activities, so intermediate data and events need to be held in an on-device queue for some period and sent for processing once connectivity is regained.
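
The on-device queue could look roughly like the sketch below: events are buffered in order, and a flush stops at the first failed send so that nothing is lost or reordered. The class name, the bounded capacity, and the send callback are assumptions; a production client would also persist the queue to disk.

```python
from collections import deque
from typing import Callable, Deque, Dict

Event = Dict[str, object]


class OfflineEventQueue:
    """Buffers session events on the device while the network is unavailable."""

    def __init__(self, max_events: int = 10_000) -> None:
        # A bounded deque silently drops the oldest event once it is full.
        self._events: Deque[Event] = deque(maxlen=max_events)

    def enqueue(self, event: Event) -> None:
        self._events.append(event)

    def flush(self, send: Callable[[Event], bool]) -> int:
        """Send queued events in order; stop at the first failure.

        `send` returns True when the backend accepted the event.
        Returns the number of events delivered.
        """
        sent = 0
        while self._events:
            if not send(self._events[0]):
                break  # still offline; keep this and later events queued
            self._events.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._events)
```

An event is removed only after it has been sent successfully, so a connection that drops mid-flush leaves the remaining events in place for the next attempt.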

Future improvements

There is room in the application to leverage machine learning models that provide users with personalized suggestions for goal setting, flag possible outliers in historical data, and surface other analytics about the user's health. ML can also speed up data entry for dietary goals. For example, if a user scans the ingredients or nutritional information list of the foods in a given meal, that information can be loaded into a form for quick review before a progress update is sent.

Another major improvement is integration with third-party devices such as Apple Watches and Fitbits, removing manual input except for the bare minimum, such as setting goals or updating user information. All other updates and logging would happen automatically.