Requirements
Functional Requirements:
- Users can see available shows.
- Users can view a seating map to pick seats.
- Users can select seats and make payment to book those seats.
- Users are notified on booking confirmation via SMS, email, and WhatsApp.
- Users can view booking history for up to 1 year.
- The Admin persona can manage cinemas (add/remove/update).
- The Cinema Owner persona can manage shows (add/remove/update).
Non-Functional Requirements:
- The system should be reliable.
- The system should be fault tolerant.
- High availability is expected.
Capacity Estimation
Assume a country with 100 cities, where each city has 100 cinemas on average and each cinema runs 10 shows a day.
Shows per day = 100 * 100 * 10 = 0.1M.
Assume that on average 50% of shows are booked, each show has 500 seats, and each booking covers 2 seats. Then
bookings per day = 0.1M * 0.5 * 500 / 2 = 12.5M.
Assume that for each booking the user browses 5 theaters and 5 movies (about 10 read queries) before the final transaction. Total queries = 12.5M * 10 = 125M per day. A day has 86,400 seconds (roughly 0.1M), so the average load is about 125M / 0.1M = 1,250 QPS (about 1,450 QPS with the exact 86,400).
Network bandwidth -> if a request carries 1 KB of data, incoming traffic = 1,250 * 1 KB = 1.25 MB/s.
If each response is 10 KB, outgoing traffic = 1,250 * 10 KB = 12.5 MB/s.
How many cores are required to support 1,250 QPS?
Let's say 1 core can handle 10,000 requests per second and we want to keep a 25% buffer. Then
required cores = (1,250 / 10,000) * 1.25 = ~0.16, so a single core covers the average load. Actual provisioning would be driven by peak traffic and redundancy rather than by this average.
Storage - if 1 transaction holds 10 KB of data, then the storage required for one year is
12.5M * 10 KB * 365 = 125 GB/day * 365 = ~45.6 TB.
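As a cross-check, the estimate can be recomputed from its stated inputs. The sketch below is only back-of-the-envelope arithmetic: the constant names are mine, the 10 queries per booking follows the "5 theaters, 5 movies" assumption, and it divides by the exact 86,400 seconds in a day rather than the rounded 0.1M.

```python
# Back-of-the-envelope capacity estimate from the stated assumptions.
CITIES = 100
CINEMAS_PER_CITY = 100
SHOWS_PER_CINEMA = 10
SEATS_PER_SHOW = 500
BOOKED_FRACTION = 0.5        # 50% of shows booked
SEATS_PER_BOOKING = 2
QUERIES_PER_BOOKING = 10     # browsing before the final transaction
SECONDS_PER_DAY = 24 * 60 * 60
REQUEST_KB = 1
RESPONSE_KB = 10
RECORD_KB = 10               # stored size of one booking transaction

shows_per_day = CITIES * CINEMAS_PER_CITY * SHOWS_PER_CINEMA
bookings_per_day = shows_per_day * SEATS_PER_SHOW * BOOKED_FRACTION / SEATS_PER_BOOKING
queries_per_day = bookings_per_day * QUERIES_PER_BOOKING
avg_qps = queries_per_day / SECONDS_PER_DAY

ingress_mb_s = avg_qps * REQUEST_KB / 1000      # decimal units: 1 MB = 1000 KB
egress_mb_s = avg_qps * RESPONSE_KB / 1000
storage_tb_year = bookings_per_day * RECORD_KB * 365 / 1e9  # 1 TB = 1e9 KB

print(f"shows/day     : {shows_per_day:,}")
print(f"bookings/day  : {bookings_per_day:,.0f}")
print(f"average QPS   : {avg_qps:,.0f}")
print(f"ingress MB/s  : {ingress_mb_s:.2f}")
print(f"egress MB/s   : {egress_mb_s:.2f}")
print(f"storage TB/yr : {storage_tb_year:.1f}")
```

These are daily averages; evening and weekend peaks would be several times higher, so the averages are a floor for sizing, not a target.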
API Design
To list all cities:
GET /api/v1/city
To list cinemas in a city:
GET /api/v1/city={}/cinema
To list shows at a cinema:
GET /api/v1/cinema={}/show
To confirm a booking:
POST /api/v1/city={}/cinema={}/show={}
Params - {seat id : []}
To view a past booking:
GET /api/v1/bookingID={}
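To make the write path concrete, here is a minimal sketch of how a client might assemble the booking call. The `BookingRequest` class, its field names, and the `seat_ids` JSON key are hypothetical; the design itself only specifies a list of seat IDs as the parameter.

```python
import json
from dataclasses import dataclass, field


@dataclass
class BookingRequest:
    """Hypothetical client-side model of the booking confirmation call."""
    city_id: str
    cinema_id: str
    show_id: str
    seat_ids: list = field(default_factory=list)

    def path(self) -> str:
        # Mirrors the POST route from the API list.
        return (f"/api/v1/city={self.city_id}"
                f"/cinema={self.cinema_id}/show={self.show_id}")

    def body(self) -> str:
        # Reject obviously invalid selections before calling the service.
        if not self.seat_ids:
            raise ValueError("at least one seat must be selected")
        if len(set(self.seat_ids)) != len(self.seat_ids):
            raise ValueError("duplicate seat ids in selection")
        return json.dumps({"seat_ids": self.seat_ids})


request = BookingRequest("c1", "x9", "s42", ["A1", "A2"])
print("POST", request.path())
print(request.body())
```

Validating on the client only saves a round trip; the booking service would still have to repeat these checks.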
High-Level Design
Admins and cinema owners maintain cinema and show data in the Cinema DB.
Users view available shows from the Show DB.
A cache is placed in front of hot and blockbuster movies.
After choosing a show and seats, the user books the seats via the booking service.
The payment service handles payments via payment service providers (PSPs).
A Kafka queue is used to send notifications asynchronously.
The Booking DB has a read replica. Users view their booking details by opening a URL, and all of those reads are served from the replica.
Database Design
The Cinema DB, Show DB and Booking DB would be SQL databases.
The Cinema and Show DBs would see read-heavy traffic,
while the Booking DB would record the data for every booking transaction.
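One possible shape for these tables is sketched below with SQLite, purely for illustration. All table and column names are assumptions, and everything lives in a single in-memory database here even though the design keeps three separate databases. The point of the sketch is the primary key on (show_id, seat_id), which lets the database itself reject a double booking.

```python
import sqlite3

SCHEMA = """
CREATE TABLE cinemas (
    cinema_id INTEGER PRIMARY KEY,
    city      TEXT NOT NULL,
    name      TEXT NOT NULL
);
CREATE TABLE shows (
    show_id   INTEGER PRIMARY KEY,
    cinema_id INTEGER NOT NULL,
    movie     TEXT NOT NULL,
    starts_at TEXT NOT NULL
);
CREATE TABLE bookings (
    booking_id INTEGER PRIMARY KEY,
    show_id    INTEGER NOT NULL,
    user_id    TEXT NOT NULL,
    created_at TEXT NOT NULL
);
CREATE TABLE booked_seats (
    show_id    INTEGER NOT NULL,
    seat_id    TEXT NOT NULL,
    booking_id INTEGER NOT NULL,
    PRIMARY KEY (show_id, seat_id)   -- one booking per seat per show
);
"""


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def book(conn, booking_id, show_id, user_id, seat_ids) -> bool:
    """Insert a booking and its seats atomically; False if any seat is taken."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "INSERT INTO bookings VALUES (?, ?, ?, datetime('now'))",
                (booking_id, show_id, user_id),
            )
            conn.executemany(
                "INSERT INTO booked_seats VALUES (?, ?, ?)",
                [(show_id, seat, booking_id) for seat in seat_ids],
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_db()
print(book(conn, 1, 1, "alice", ["A1", "A2"]))  # first booking succeeds
print(book(conn, 2, 1, "bob", ["A2", "A3"]))    # overlaps on A2, rejected whole
```

Because the second booking is rolled back as a unit, seat A3 stays free even though only A2 conflicted.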
Detailed Component Design
The Show DB would be sharded on a hash of the show ID so that load is distributed evenly.
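A minimal sketch of hash-based shard selection follows. The shard count of 16 and the function name `shard_for_show` are hypothetical. A stable hash (SHA-256 here) is used rather than Python's built-in `hash()`, which is randomized per process for strings.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 16  # hypothetical shard count


def shard_for_show(show_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a show ID to a shard index using a stable hash."""
    digest = hashlib.sha256(show_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then take the modulus.
    return int.from_bytes(digest[:8], "big") % num_shards


# Rough check that the keys spread evenly across shards.
counts = Counter(shard_for_show(f"show-{i}") for i in range(10_000))
print(sorted(counts.items()))
```

Plain modulo placement remaps most keys when the shard count changes; consistent hashing is a common alternative if shards are expected to be added often.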
Cross-shard queries would be needed only for analytics, so I don't expect high load there, and plain cross-shard queries should be enough. If load grows, an async service can be introduced to pull data across shards in batches and store it in an analytics DB such as Snowflake or Vertica.
If one show becomes popular, we can maintain more replicas of that hot shard.
A read replica would absorb the read load on booked transactions.
The list and booking services would run on multi-node clusters with automated failover, so they would be highly available.
A Kafka topic would carry the non-critical, asynchronous work of notifying users.
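The decoupling can be illustrated without a broker. In the sketch below, a standard-library `queue.Queue` stands in for the Kafka topic, and `BookingConfirmed`, `confirm_booking`, and `drain_notifications` are hypothetical names. The booking path only enqueues an event and returns; a separate consumer fans it out to the channels.

```python
import queue
from dataclasses import dataclass


@dataclass
class BookingConfirmed:
    booking_id: str
    user_id: str
    channels: tuple = ("sms", "email", "whatsapp")


# Stand-in for the Kafka topic; a real deployment would use a Kafka client.
notifications: queue.Queue = queue.Queue()


def confirm_booking(booking_id: str, user_id: str) -> str:
    # The booking is assumed to be committed already; notifying the user is
    # not on the critical path, so only an event is enqueued here.
    notifications.put(BookingConfirmed(booking_id, user_id))
    return "CONFIRMED"


def drain_notifications(send) -> int:
    """Consumer side: deliver every queued event on each of its channels."""
    sent = 0
    while True:
        try:
            event = notifications.get_nowait()
        except queue.Empty:
            return sent
        for channel in event.channels:
            send(channel, event)
            sent += 1


print(confirm_booking("b1", "u1"))
print(drain_notifications(lambda ch, ev: print(f"{ch}: booking {ev.booking_id}")))
```

If a channel provider is down, the booking still succeeds; with a real topic the consumer could retry the event later.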
Concurrency Issue - When two or more users select the same seat, whichever user confirms the selection first wins, based on a timestamp-ordering protocol, and the seats are locked for that user until a fixed transaction timeout. If the transaction is cancelled or fails for that user, the seats are released. If the same user tries to lock seats repeatedly without completing the transaction, it would be treated as a DoS attempt and rate limiting would be applied to that user.
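The hold-and-release behaviour can be sketched as a first-come seat hold with an expiry. This is a simplified, single-process illustration, not a full timestamp-ordering protocol; the `SeatHolds` class, the 300-second timeout, and the injectable clock are all assumptions. In a real deployment the holds would live in a store shared by every booking service instance.

```python
import threading
import time


class SeatHolds:
    """First-come seat holds that expire after a fixed timeout."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._holds = {}  # (show_id, seat_id) -> (user_id, expires_at)

    def try_hold(self, show_id, seat_ids, user_id) -> bool:
        """Hold all requested seats, or none if any is held by someone else."""
        keys = [(show_id, seat) for seat in seat_ids]
        with self._lock:
            now = self._clock()
            for key in keys:
                held = self._holds.get(key)
                if held and held[1] > now and held[0] != user_id:
                    return False  # an unexpired hold by another user exists
            for key in keys:
                self._holds[key] = (user_id, now + self._ttl)
            return True

    def release(self, show_id, seat_ids, user_id) -> None:
        """Free seats after a cancelled or failed transaction."""
        with self._lock:
            for seat in seat_ids:
                key = (show_id, seat)
                held = self._holds.get(key)
                if held and held[0] == user_id:
                    del self._holds[key]


holds = SeatHolds()
print(holds.try_hold("show-1", ["A1", "A2"], "alice"))  # alice gets both seats
print(holds.try_hold("show-1", ["A2", "A3"], "bob"))    # rejected: A2 is held
```

The all-or-nothing check means a rejected request leaves no partial holds behind, so A3 stays available to other users after bob's attempt fails.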
Peak time handling - The list service and booking service would run with auto scaling enabled, so more instances of the service pods become available to serve customers under load.
High availability - A second instance of each DB would be available for failover. The two instances of the Booking DB would be kept strongly consistent, whereas eventual consistency is acceptable for the Cinema and Show DB instances.