System requirements


Functional:

List functional requirements for the system (Ask the chat bot for hints if stuck.)...

  1. Users publish messages; followers fetch the messages of the publishers they follow.
  2. Users can favorite (like) a message.



Non-Functional:

List non-functional requirements for the system...

  1. Performance
  2. Scalability
  3. Availability




Capacity estimation

Estimate the scale of the system you are going to design...

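  1. 1 million users, each publishing 10 messages per hour: about 10 million messages per hour, or roughly 2,800 writes per second. The 200k QPS figure therefore has to be read as peak total traffic (fetches and fanout deliveries), not as the write rate.
  2. 10 KB per message: about 100 GB per hour. Sustained around the clock this is roughly 876 TB per year, so the 600 TB per year figure only holds if users do not publish at the full rate for all 24 hours.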
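
The arithmetic behind these estimates can be checked with a short script. The inputs (1 million users, 10 messages per user per hour, 10 KB per message) come from the list above; the constant publishing rate around the clock and the decimal units (1 KB = 1,000 bytes) are assumptions of this sketch.

```python
USERS = 1_000_000
MESSAGES_PER_USER_PER_HOUR = 10
MESSAGE_SIZE_BYTES = 10_000          # 10 KB, decimal units (assumption)
HOURS_PER_YEAR = 24 * 365            # constant rate all year (assumption)


def estimate():
    """Return back-of-the-envelope write rate and storage growth."""
    messages_per_hour = USERS * MESSAGES_PER_USER_PER_HOUR
    writes_per_second = messages_per_hour / 3600
    bytes_per_year = messages_per_hour * MESSAGE_SIZE_BYTES * HOURS_PER_YEAR
    return {
        "messages_per_hour": messages_per_hour,
        "writes_per_second": writes_per_second,
        "terabytes_per_year": bytes_per_year / 1e12,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the write rate is far below 200k QPS, which suggests that reads and fanout deliveries, not writes, dominate the load.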





API design

Define what APIs are expected from the system...

  1. tweet: http://XXX/send, args: text: String, userID: id
  2. fetch: http://XXX/fetch, args: userID: id, length: int
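
A minimal in-memory sketch of these two calls may help pin down their semantics. The class and method names (`MessageApi`, `send`, `fetch`) are hypothetical. It also assumes that `userID` in fetch identifies the publisher whose messages are requested and that `length` caps the number of messages returned, newest first; the design above does not state either point.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Message:
    message_id: int
    user_id: int
    text: str


class MessageApi:
    """In-memory stand-in for the send and fetch endpoints."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._by_user = {}  # user_id -> list of Message, oldest first

    def send(self, text, user_id):
        """Store a new message and return its generated ID."""
        if not text:
            raise ValueError("text must not be empty")
        message = Message(next(self._next_id), user_id, text)
        self._by_user.setdefault(user_id, []).append(message)
        return message.message_id

    def fetch(self, user_id, length):
        """Return up to `length` messages of one publisher, newest first."""
        if length < 0:
            raise ValueError("length must be non-negative")
        if length == 0:
            return []
        messages = self._by_user.get(user_id, [])
        return list(reversed(messages[-length:]))
```

For example, after sending "a", "b", and "c" as user 1, `fetch(1, 2)` returns the messages "c" and "b" in that order.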





Database design

Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...

  1. UserDB: {userID: int, userInfo: UserInfo}
  2. FollowerDB (graph DB): nodes are userIDs; a directed edge means "follows".
  3. MessageDB: {messageID: int, text: String}




High-level design

You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design. If you are unfamiliar with the tool, you can simply describe your design to the chat bot and ask it to generate a starter diagram for you to modify...







Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...

  1. The client connects to the server using WebSocket.
  2. Send message
    1. The message goes to the publish service, which stores it in the database and the cache.
    2. The message goes to the fanout service, which puts it on an MQ. Based on the follower graph, each message is delivered to each follower's WebSocketQueue and then pushed to that follower's client.
    3. For offline clients, the message is also delivered through the notification service.
  3. Fetch message
    1. Clients that have just logged in, or that want historical messages, read them from the cache and the database.
  4. Like operations use a separate DB that stores each message's like count; this information is fetched and sent to clients during fanout.
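
The send path above can be sketched in a few lines. This is an illustration under stated assumptions, not the actual service: the MQ, the per-follower WebSocket queues, and the notification service are all modelled as in-memory containers. The class name `FanoutService` and its methods are hypothetical, and routing offline followers only to notifications is one possible reading of the flow.

```python
from collections import defaultdict, deque


class FanoutService:
    """Toy fanout: MQ -> follower graph lookup -> per-follower delivery."""

    def __init__(self, followers):
        # followers: publisher ID -> set of follower IDs (the follower graph)
        self.followers = followers
        self.mq = deque()                           # stand-in for the MQ
        self.websocket_queues = defaultdict(deque)  # online followers
        self.notifications = defaultdict(list)      # offline followers
        self.online = set()                         # currently connected users

    def publish(self, publisher_id, text):
        """Enqueue a message; delivery happens later in drain()."""
        self.mq.append((publisher_id, text))

    def drain(self):
        """Deliver every queued message and return the delivery count."""
        delivered = 0
        while self.mq:
            publisher_id, text = self.mq.popleft()
            for follower_id in sorted(self.followers.get(publisher_id, ())):
                if follower_id in self.online:
                    self.websocket_queues[follower_id].append((publisher_id, text))
                else:
                    self.notifications[follower_id].append((publisher_id, text))
                delivered += 1
        return delivered
```

Separating `publish` from `drain` mirrors the decoupling the MQ provides: the publisher returns as soon as the message is queued, and delivery cost scales with the follower count.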



Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...

  1. In the fanout service, incoming messages are sent to the MQ to decouple the components.
    1. The WebSocketQueue receives messages from the publishers the client follows and sends them to the client.
  2. On login, the client also fetches the latest messages from the MQ.
  3. For historical data, if the messages in the MQ cannot fulfill the client's request, we check the cache and the DB for older messages.
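
The fallback order for historical reads (MQ first, then cache, then DB) could look roughly like this. The function name `fetch_with_fallback` is hypothetical. The sketch assumes each tier is a list of (message ID, text) pairs ordered newest first, and that the same message may appear in more than one tier, so results are deduplicated by ID.

```python
def fetch_with_fallback(count, mq_messages, cache_messages, db_messages):
    """Collect up to `count` messages, consulting cheaper tiers first."""
    result = []
    seen = set()
    for tier in (mq_messages, cache_messages, db_messages):
        for message_id, text in tier:
            if len(result) == count:
                return result  # request satisfied, skip the slower tiers
            if message_id not in seen:
                seen.add(message_id)
                result.append((message_id, text))
    return result
```

A real implementation would query the cache and DB lazily instead of receiving full lists, so that the slower tiers are only contacted when the faster ones fall short.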





Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...

  1. For the DB, we choose NoSQL for better performance, availability, and scalability.
    1. It shards requests with a hash ring (consistent hashing), so as load grows we can add more servers to hold the data and serve the requests.
  2. For client connections, we use WebSocket, so that the server can push updates to the client over a persistent connection instead of the client polling for them.
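
To illustrate why a hash ring makes adding servers cheap, here is a minimal consistent hashing sketch. It is not the sharding code of any particular NoSQL store. The node names, the use of MD5 as the ring hash, and the count of 100 virtual nodes per server are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        # First 8 bytes of MD5, used only for placement, not for security.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        """Place `vnodes` points for this node on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key):
        """Return the first node clockwise from the key's hash."""
        if not self._ring:
            raise LookupError("ring is empty")
        index = bisect.bisect_left(self._ring, (self._hash(key), ""))
        return self._ring[index % len(self._ring)][1]
```

When a server is added, only the keys that fall into the new server's ring segments move, and they all move to the new server; keys on the other servers stay where they were.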



Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.

  1. The publish service, cache, database, fanout service, MQ, follower graph, and notification service can each suffer server failures; running multiple instances of each lets the remaining instances keep serving requests.
  2. For hotspot users (for example, publishers with very many followers), we may skip fanout and let followers actively fetch their updates instead.
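
The hotspot mitigation amounts to a hybrid of push and pull. The sketch below shows one way it could work, assuming a "hot" publisher is defined by follower count. The threshold value, the class name `HybridTimeline`, and the global sequence number used for ordering are hypothetical.

```python
HOT_FOLLOWER_THRESHOLD = 3  # hypothetical cutoff, kept tiny for the example


class HybridTimeline:
    """Push messages for normal publishers, pull them for hot publishers."""

    def __init__(self, followers):
        self.followers = followers  # publisher ID -> set of follower IDs
        self.inboxes = {}           # follower ID -> entries pushed at publish time
        self.outboxes = {}          # hot publisher ID -> entries pulled at read time
        self._seq = 0               # global ordering of published messages

    def is_hot(self, publisher_id):
        return len(self.followers.get(publisher_id, ())) >= HOT_FOLLOWER_THRESHOLD

    def publish(self, publisher_id, text):
        self._seq += 1
        entry = (self._seq, publisher_id, text)
        if self.is_hot(publisher_id):
            # No fanout: one write, regardless of follower count.
            self.outboxes.setdefault(publisher_id, []).append(entry)
        else:
            for follower_id in self.followers.get(publisher_id, ()):
                self.inboxes.setdefault(follower_id, []).append(entry)

    def timeline(self, user_id):
        """Merge pushed entries with entries pulled from hot publishers."""
        entries = list(self.inboxes.get(user_id, []))
        for publisher_id, posts in self.outboxes.items():
            if user_id in self.followers.get(publisher_id, ()):
                entries.extend(posts)
        return [(p, t) for _, p, t in sorted(entries, reverse=True)]
```

The trade-off is that reads become more expensive for anyone following a hot publisher, in exchange for bounding the write amplification at publish time.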




Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?

  1. We can support messages with different media types.
  2. The client can cache messages together with the timestamp of the newest one, so that a fetch only asks the server for messages newer than that timestamp.
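
The client-side cache idea can be sketched as follows. The `fetch_since` callback stands in for a server call that returns only messages newer than a given timestamp. That call is hypothetical and is not part of the API defined earlier, which takes only a userID and a length.

```python
class CachingClient:
    """Keeps fetched messages locally and requests only newer ones."""

    def __init__(self, fetch_since):
        # fetch_since(timestamp) -> list of (timestamp, text) newer than
        # `timestamp`; a hypothetical server call supplied by the caller.
        self._fetch_since = fetch_since
        self.messages = []       # local cache, oldest first
        self.last_timestamp = 0  # timestamp of the newest cached message

    def refresh(self):
        """Fetch messages newer than the cache and return just those."""
        new_messages = sorted(self._fetch_since(self.last_timestamp))
        if new_messages:
            self.messages.extend(new_messages)
            self.last_timestamp = new_messages[-1][0]
        return new_messages
```

Each refresh transfers only the delta since the last one, which reduces both response size and the amount of data the server has to read.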