Design Twitter - System Design

System requirements

A user creates an account, and the account information is stored in a Users table in MySQL.
A user posts a tweet. The post is saved to a Posts DB, which can be MySQL since the post text has a bounded size. The post must then be delivered to all of the user's followers. To achieve this, we set up change data capture (CDC) on the Posts DB, which pushes each change to a Kafka topic. A Flink service consumes posts from this topic.
When a user follows another user, the relationship is stored in a Following DB (MySQL). CDC on this DB publishes each change to a second Kafka topic, which the same Flink service consumes. Flink then holds both the followers and the posts of each user, and uses them to update the news feed of every follower when a new post arrives. On average a user has 100 followers, so each post triggers about 100 news feed updates.
When a user adds a tweet to favorites, it is stored in a Favorites DB (MySQL).
For users with millions of followers, we do not update the follower news feed caches on each post. Instead, the client device polls to fetch these users' posts as news feed items.
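The write path described above can be sketched in a few lines. This is a minimal in-memory illustration, not the Flink job itself: the class name FeedFanout, the method names, the feed size, and the follower threshold are all assumptions made for the example. Posts from ordinary users are pushed into each follower's cached feed, while posts from users above the threshold are kept aside and pulled at read time.

```python
from collections import defaultdict, deque

# Assumed values for illustration only; the design does not fix them.
CELEBRITY_THRESHOLD = 1_000_000  # followers at or above this: no push
FEED_SIZE = 100                  # cached items kept per news feed


class FeedFanout:
    """In-memory stand-in for the state the stream job would keep."""

    def __init__(self, celebrity_threshold=CELEBRITY_THRESHOLD):
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)   # followee id -> follower ids
        self.following = defaultdict(set)   # follower id -> followee ids
        # user id -> cached post ids, newest first, bounded in size
        self.feeds = defaultdict(lambda: deque(maxlen=FEED_SIZE))
        self.celebrity_posts = defaultdict(list)  # author id -> post ids

    def on_follow(self, follower_id, followee_id):
        """Handle one change event from the Following DB."""
        self.followers[followee_id].add(follower_id)
        self.following[follower_id].add(followee_id)

    def on_post(self, author_id, post_id):
        """Handle one change event from the Posts DB.

        Returns the number of news feeds that were updated.
        """
        fans = self.followers[author_id]
        if len(fans) >= self.celebrity_threshold:
            # Too many followers to push to: store once, pull on read.
            self.celebrity_posts[author_id].append(post_id)
            return 0
        for follower_id in fans:
            self.feeds[follower_id].appendleft(post_id)
        return len(fans)

    def read_feed(self, user_id):
        """Return (pushed, pulled) post ids for a user.

        pushed comes from the cache; pulled is what a polling client
        would fetch for the high-follower accounts the user follows.
        """
        pushed = list(self.feeds[user_id])
        pulled = [
            post_id
            for followee_id in self.following[user_id]
            for post_id in self.celebrity_posts.get(followee_id, [])
        ]
        return pushed, pulled
```

With an average of 100 followers, on_post performs about 100 cache writes per tweet, which is why the push path is skipped for accounts with millions of followers.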

Kafka topics are partitioned by user ID so that a single Flink task receives all the data for a user, including new posts and follower information.
Each DB is partitioned, and each partition has replicas. Consistent hashing spreads keys evenly across partitions and limits data movement when nodes are added or removed, while replication provides fault tolerance and durability.
We scale web servers and application servers horizontally as the user base grows.
Since the news feed is precomputed and cached, reads are served with low latency.
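To make the partitioning point concrete, here is a minimal sketch of a consistent hash ring with virtual nodes. The class name HashRing, the node names, the choice of SHA-256, and the number of virtual nodes are assumptions for illustration; a production setup would more likely rely on the partitioning built into the database or proxy layer.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent hash ring; each node owns several virtual points."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Virtual nodes smooth out the key distribution across nodes.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get_node(self, key):
        """Return the node owning the first point at or after the key."""
        if not self._ring:
            raise ValueError("ring is empty")
        # (position,) sorts before any (position, name) at the same position.
        idx = bisect.bisect_left(self._ring, (_hash(key),))
        return self._ring[idx % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["db-0", "db-1", "db-2"])
    print(ring.get_node("user:42"))
```

When a node is removed, only the keys that node owned are reassigned; every other key keeps its owner, which is what keeps rebalancing cheap as partitions are added or lost.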

Explain any trade offs you have made and why you made certain tech choices...

Try to discuss as many failure scenarios/bottlenecks as possible.

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?