Design Ticketmaster - System Design

System requirements

Functional:

MVP

Search for events

View an event and seats

Book a seat for an event

Non-Functional:

MVP

Scalability - Event views far outnumber bookings (reads >> writes), so this is a read-heavy system. We also need to handle traffic surges for popular events

Consistency - no double booking

Availability - for search and view events

Latency - low latency for search

Durability - persist booked seats and historical data

Capacity estimation

2 million booked tickets per day

2,000,000 / 100,000 = 20 QPS for bookings (a day has 86,400 seconds, rounded up to 100,000 for simplicity)

10:1 read-to-book ratio

200 QPS for viewing events

1 billion tickets in total

500 bytes for the metadata per ticket

500 bytes * 1,000,000,000 = 500 GB

Event and performer data should be much smaller than the ticket data
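The arithmetic above can be replayed in a few lines. This is only a sketch of the estimate using the figures already stated; the variable names are illustrative.

```python
# Back-of-the-envelope capacity estimate using the figures stated above.
SECONDS_PER_DAY_APPROX = 100_000  # 86,400 rounded up to keep the math simple

booked_per_day = 2_000_000
read_to_write_ratio = 10          # 10 event views per booking
total_tickets = 1_000_000_000
bytes_per_ticket = 500

write_qps = booked_per_day // SECONDS_PER_DAY_APPROX
read_qps = write_qps * read_to_write_ratio
ticket_storage_gb = total_tickets * bytes_per_ticket / 10**9

print(write_qps, read_qps, ticket_storage_gb)
```

These are averages. Peak traffic for a popular on-sale is likely far higher, which is why the surge requirement is listed separately.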

API design

Search_Events(search_term, location, date)

search_term can be event_name, performer_name, ....

GET /search?search_term={term}&location={location}&date={date}

returns a paginated list of events

View_event(event_id)

GET /event/:event_id

returns event_name/location/performer/dates/list of tickets

reserve_ticket(event_id, ticket_id, user_id)

POST /booking/reserve

header: JWT | sessionToken (includes user_id)

body: {ticket_id}

returns 200 with a reservation_id. The reservation_id is used to track the reservation status and to make the following API calls idempotent
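One way to picture the idempotency of the reserve call is sketched here with an in-memory store. The class name, the 409 status for a conflicting reservation, and the response shape are assumptions made for illustration, not part of the API above.

```python
import uuid


class ReservationStore:
    """In-memory stand-in for the booking service's reservation state."""

    def __init__(self):
        self._by_ticket = {}  # ticket_id -> (user_id, reservation_id)

    def reserve(self, user_id, ticket_id):
        existing = self._by_ticket.get(ticket_id)
        if existing is not None:
            holder, reservation_id = existing
            if holder == user_id:
                # Retry by the same user: return the same reservation_id.
                return 200, {"reservation_id": reservation_id}
            # Assumed behavior: reject a second user with a conflict status.
            return 409, {"error": "ticket already reserved"}
        reservation_id = str(uuid.uuid4())
        self._by_ticket[ticket_id] = (user_id, reservation_id)
        return 200, {"reservation_id": reservation_id}
```

A client that times out and retries the same request gets the same reservation_id back, so a retry does not create a second reservation.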

confirm_ticket(reservation_id, ticket_id, payment_info)

PUT /booking/confirm (this updates an existing reservation rather than creating a new one, so PUT fits better than POST)

header: JWT | sessionToken (includes user_id)

body: {reservation_id, ticket_id, payment_details (Stripe)}

returns 200

Database design

Events table:

event_id

performer_id

location_id

date

Ticket table

event_id

ticket_id

seat_id

date

status: available/booked

Reservation table

reservation_id

ticket_id

event_id

date

payment_info

For the database choice, a SQL database like Postgres is a good fit: the system has clearly structured data models (events, tickets, performers, etc.) with well-defined relationships between them, and a SQL database provides ACID transactions and strong consistency guarantees, which we need to prevent double booking.
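The three tables above could be sketched as follows. SQLite from the Python standard library is used here only so the example runs anywhere; it stands in for Postgres. The column types and the UNIQUE (event_id, seat_id) constraint are assumptions added for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE events (
    event_id     INTEGER PRIMARY KEY,
    performer_id INTEGER NOT NULL,
    location_id  INTEGER NOT NULL,
    date         TEXT NOT NULL
);
CREATE TABLE tickets (
    ticket_id INTEGER PRIMARY KEY,
    event_id  INTEGER NOT NULL REFERENCES events(event_id),
    seat_id   INTEGER NOT NULL,
    date      TEXT NOT NULL,
    status    TEXT NOT NULL DEFAULT 'available'
              CHECK (status IN ('available', 'booked')),
    -- Assumption: one ticket per seat per event.
    UNIQUE (event_id, seat_id)
);
CREATE TABLE reservations (
    reservation_id TEXT PRIMARY KEY,
    ticket_id      INTEGER NOT NULL REFERENCES tickets(ticket_id),
    event_id       INTEGER NOT NULL REFERENCES events(event_id),
    date           TEXT NOT NULL,
    payment_info   TEXT
);
"""


def create_db():
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

The uniqueness constraint lets the database itself reject a second ticket for the same seat, rather than relying only on application code.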

High-level design

Search events

The user types keywords about the event. The request goes to the events service, which looks up the events table to find matching events and returns a paginated list of events.

To speed up search, we can use Elasticsearch, which provides inverted indexing, and put static event metadata in a cache.

To keep Elasticsearch in sync with the events table, we can either do double writes at the application layer or use CDC (change data capture) to apply database updates to Elasticsearch.
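To make the idea of inverted indexing concrete, here is a toy sketch in plain Python. It is not how Elasticsearch is implemented or queried; it only illustrates mapping each token to the set of events that contain it. The function names and sample data are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(events):
    """Map each lowercase token to the set of event_ids whose text contains it."""
    index = defaultdict(set)
    for event_id, text in events.items():
        for token in text.lower().split():
            index[token].add(event_id)
    return index


def search(index, query):
    """Return the event_ids that contain every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


events = {
    1: "Taylor Swift Eras Tour",
    2: "Swift developer conference",
    3: "Jazz night",
}
index = build_inverted_index(events)
print(search(index, "taylor swift"))
```

A lookup touches only the posting sets of the query tokens instead of scanning every row, which is why this structure suits keyword search better than a table scan.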

View event tickets

After the user selects an event from the returned list, the booking service returns the details of that event along with its tickets. Ticket info is fetched from the ticket table. Each ticket is associated with a seat, and for each ticket we also return its status (booked or available).

Book a ticket

Booking a ticket is a two-phase process. The user first clicks a ticket to start the reservation. The booking service returns a reservation_id that helps track the reservation and makes the following calls idempotent.

To avoid double booking, we reserve the ticket for the user for 10 minutes. During this period the ticket is unavailable to other users. If the user has not paid by the end of it, we release the ticket.

To release the ticket, we have two options:

1) Introduce a ticket status such as pending, together with a last_updated_timestamp. A cron job then regularly scans the table and resets the status of any ticket that has been pending for more than 10 minutes.

2) Use a lock cache in Redis. After a ticket is reserved, we put it into the cache with a TTL. When returning the available tickets, we look up the cache and filter out the tickets that are present in it.

With the cron job, ticket statuses can be stale between two scans. The Redis lock cache avoids that, but if the cache crashes we lose the pending reservation data, which may cause double booking. To guard against this, every reservation request should also check the ticket status in the database, so that the same ticket is never booked by two different users.
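Option 2 can be sketched with an in-memory stand-in for the Redis lock cache. In real Redis the lock would typically be taken with a single SET key value NX EX 600 command; the class below only mimics that behavior, and its name, method names, and injectable clock are assumptions made so the example is runnable and testable.

```python
import time


class TicketLockCache:
    """In-memory stand-in for a Redis lock cache with per-key TTL."""

    def __init__(self, ttl_seconds=600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._locks = {}  # ticket_id -> (user_id, expires_at)

    def try_lock(self, ticket_id, user_id):
        """Take the lock only if no unexpired lock exists (like SET NX EX)."""
        now = self._clock()
        holder = self._locks.get(ticket_id)
        if holder is not None and holder[1] > now:
            return False
        self._locks[ticket_id] = (user_id, now + self._ttl)
        return True

    def is_locked(self, ticket_id):
        holder = self._locks.get(ticket_id)
        return holder is not None and holder[1] > self._clock()

    def filter_available(self, ticket_ids):
        """Drop tickets that are currently reserved by someone."""
        return [t for t in ticket_ids if not self.is_locked(t)]
```

Because expiry is checked on read, a ticket becomes available again as soon as its TTL passes, with no scan job needed. The database status check described above is still required as the source of truth.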

Confirm a reservation

After the user provides payment info, we send it to a third-party processor such as Stripe. If the payment succeeds, we update the ticket status in the ticket table to booked and then store the reservation in the confirmed reservation table.
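The confirm step can be sketched as a conditional update inside one transaction, so that only one request can flip a ticket to booked. SQLite stands in for Postgres here, the charge callable is a hypothetical placeholder for the payment provider call, and the simplified table layout is an assumption for the demo.

```python
import sqlite3


def make_demo_db():
    """Create a tiny in-memory database with one pending ticket."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tickets (ticket_id INTEGER PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE reservations (
            reservation_id TEXT PRIMARY KEY, ticket_id INTEGER,
            event_id INTEGER, date TEXT, payment_info TEXT);
        INSERT INTO tickets VALUES (1, 'pending');
    """)
    return conn


def confirm_reservation(conn, reservation_id, ticket_id, event_id, date,
                        payment_info, charge):
    """Charge the user, then book the ticket only if it is not booked yet."""
    if not charge(payment_info):  # hypothetical payment-provider call
        return False
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "UPDATE tickets SET status = 'booked' "
            "WHERE ticket_id = ? AND status <> 'booked'",
            (ticket_id,),
        )
        if cur.rowcount != 1:
            # Someone else booked it first; a real system would refund here.
            return False
        conn.execute(
            "INSERT INTO reservations VALUES (?, ?, ?, ?, ?)",
            (reservation_id, ticket_id, event_id, date, payment_info),
        )
    return True
```

The WHERE clause is what prevents double booking: a second confirmation for the same ticket updates zero rows and is rejected.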

For a popular event that may attract millions of booking requests, we can put a waiting queue in front of the booking service. The queue can be implemented with a Redis sorted set scored by arrival time, so it respects a first-come, first-served policy. This lets us control how many reservations are in progress at once.
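The waiting queue can be sketched in memory as follows. With Redis, joining would roughly correspond to ZADD with the arrival timestamp as the score and admitting to ZPOPMIN; the heap-based class below only imitates that ordering, and its names are hypothetical.

```python
import heapq
import itertools


class WaitingQueue:
    """In-memory stand-in for a sorted set ordered by arrival time."""

    def __init__(self):
        self._heap = []                # (arrived_at, seq, user_id)
        self._members = set()          # users currently waiting
        self._seq = itertools.count()  # tie-breaker for equal timestamps

    def join(self, user_id, arrived_at):
        """Add a user once; a repeat join keeps the original position."""
        if user_id in self._members:
            return False
        self._members.add(user_id)
        heapq.heappush(self._heap, (arrived_at, next(self._seq), user_id))
        return True

    def admit(self, batch_size):
        """Release up to batch_size users, earliest arrival first."""
        admitted = []
        while self._heap and len(admitted) < batch_size:
            _, _, user_id = heapq.heappop(self._heap)
            self._members.discard(user_id)
            admitted.append(user_id)
        return admitted
```

Calling admit with a fixed batch size at a fixed interval caps the rate at which users reach the booking service.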

In the general case, we can use SSE (Server-Sent Events) to push real-time booking updates to the client, so the event info shown on the client side stays up to date.
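For reference, an SSE message on the wire is plain text made of event and data fields, terminated by a blank line. The helper below is a minimal sketch of formatting one update; the event name and payload fields are assumptions.

```python
import json


def format_sse(event, payload):
    """Serialize one Server-Sent Events message with a JSON data field."""
    data = json.dumps(payload, separators=(",", ":"))
    return f"event: {event}\ndata: {data}\n\n"


message = format_sse("ticket_status", {"ticket_id": 42, "status": "booked"})
print(message, end="")
```

The server would write such messages to a long-lived HTTP response with Content-Type text/event-stream, and the browser's EventSource API would deliver them to the page.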
