System requirements


Functional:

  • Provide a service that prompts users for their car type (compact, regular, etc.) and issues them a ticket if they wish to use the parking lot
  • Support multiple customers simultaneously and never admit more cars than there are spots, as this would be an extremely bad user experience
  • Allow users to make advance reservations
  • Allow users to exit the parking lot and settle their bill at that time


Non-Functional:

  • The system must be highly available; otherwise customers could be left unable to enter or exit the premises
  • We need a secure payment option so we can protect users' credit card and personal details
  • We assume there will be on the order of 10 kiosks, and we should be able to process requests from all of them simultaneously




Capacity estimation

  • Assume the parking lot has 1000 spaces and we have 24000 entries/exits per day. Assume half of the spaces are allocated for reservations (500 spaces) and the other half (500) are for in-person walk-ins via kiosks. Assume the average stay is 1 hr (this will be a city lot). Furthermore, assume 400 compact spaces, 400 normal-size and 200 extra-large, with half of each going to online reservations and the other half to the kiosks.
  • The online reservation system should support 96k requests per day (roughly one quarter result in actual purchases). 96k / (24 * 60 * 60) = 1.1 requests per second. We allow for about two orders of magnitude of headroom, thus around 100 requests per second. Requests beyond that will be rate limited.
  • Our online reservation system will maintain the state of the parking lot in a distributed cache (Redis), with a record for each space: current status (used/unused) (2 bytes), driver's license plate (20 bytes), entry time (4 bytes), planned exit time (only valid for reservations) (4 bytes). Thus, we need around 30 bytes * 1000, which is 30 KB. We will replicate this data.
  • For online reservations, we will store user information. While we use SSO for authentication, we store each user's profile and their reservations. We will support around 20k users, though this could grow to 100k. If we assume around 500 bytes per user, we need another 10 MB (20k * 500 bytes), growing to 50 MB at 100k users.
  • For bandwidth requirements, we assume 100k browsing sessions a day and about 100 KB of data served per session, so around 10 GB per day, thus around 115 KB per second.
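
The estimates above can be reproduced with a few lines of arithmetic. This is only a back-of-envelope sketch: the variable names are made up for illustration, and decimal units (1 KB = 1000 bytes) are assumed throughout.

```python
# Back-of-envelope capacity figures, using the assumptions stated in the bullets above.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Online reservation traffic
requests_per_day = 96_000
avg_rps = requests_per_day / SECONDS_PER_DAY           # about 1.1 requests/s
peak_rps_budget = 100                                  # rate-limit ceiling
headroom = peak_rps_budget / avg_rps                   # about 90x the average rate

# Redis state: one record per space
bytes_per_space = 2 + 20 + 4 + 4                       # status + plate + entry + planned exit
cache_bytes = bytes_per_space * 1_000                  # 30,000 bytes = 30 KB

# User profiles at 500 bytes each
profile_bytes = 500
profile_storage_now = profile_bytes * 20_000           # 10,000,000 bytes = 10 MB
profile_storage_future = profile_bytes * 100_000       # 50,000,000 bytes = 50 MB

# Bandwidth
sessions_per_day = 100_000
bytes_per_session = 100_000                            # 100 KB
bytes_per_day = sessions_per_day * bytes_per_session   # 10 GB
bytes_per_second = bytes_per_day / SECONDS_PER_DAY     # about 115.7 KB/s
```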




API design

  • reserve_space(car_type, time, user_id): This API is called when a user wants to reserve a space online. It checks availability for the requested time.
  • obtain_walk_in_space(car_type): This API is called for those wanting to enter the parking lot. It returns a reservation_id, which gets printed onto a ticket that the customer uses when exiting.
  • obtain_remaining_walk_in_spaces(): This API returns the number of spaces available for each category and is used for displays in kiosks and on signs for cars driving up. This makes for a friendlier user experience by setting expectations in the area (seeing low numbers in advance might reduce the shock if the spaces get used up).
  • exit_parking_lot(reservation_id): For walk-ins, this returns a redirection to the payment mechanism, which requires the user to make a payment. For reservations, it checks that payment has already been settled. A token confirming payment for the reservation is sent to the user.
  • open_exit_parking_gate(reservation_id, reservation_payment_token_confirmation): This is required for exiting the lot. Upon success, the number of available spots (for walk-ins) will increase.
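
As a rough illustration, the API surface above could be written down as a typed interface. The method and parameter names follow the list; the CarType values, the ExitResult shape and all return types are assumptions made for this sketch, not part of the design as stated.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Dict, Optional, Protocol


class CarType(Enum):
    # Assumed categories, mirroring the compact / normal / extra-large split.
    COMPACT = "compact"
    REGULAR = "regular"
    EXTRA_LARGE = "extra_large"


@dataclass(frozen=True)
class ExitResult:
    # Hypothetical response shape: walk-ins get a payment redirect,
    # reservations get a token confirming the payment already settled.
    payment_redirect_url: Optional[str]
    payment_token: Optional[str]


class ParkingLotApi(Protocol):
    def reserve_space(self, car_type: CarType, time: datetime, user_id: str) -> str:
        """Reserve a space online; assumed to return a reservation id."""
        ...

    def obtain_walk_in_space(self, car_type: CarType) -> str:
        """Claim a walk-in space; returns the reservation id printed on the ticket."""
        ...

    def obtain_remaining_walk_in_spaces(self) -> Dict[CarType, int]:
        """Number of free walk-in spaces per category, for kiosks and signs."""
        ...

    def exit_parking_lot(self, reservation_id: str) -> ExitResult:
        """Start the exit flow: payment redirect or payment confirmation token."""
        ...

    def open_exit_parking_gate(
        self, reservation_id: str, reservation_payment_token_confirmation: str
    ) -> bool:
        """Lift the gate; assumed to return True on success."""
        ...
```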



Database design

  • Strong consistency is needed and we are dealing with fairly small dataset sizes, therefore we will use a SQL store. To improve availability, we will use a replicated store. Reads that do not need strong consistency (for example, checking how many spaces are available for the dashboards) will be routed to one of the replicas. Reads that require consistency will use quorum reads. We will shard the DB by parking_lot_uid and user_uid (each table is sharded).
  • Table1 will store resources: parking_lot_type, parking_lot_uid, is_reservable.
    • parking_lot_uid is the primary key, is_reservable configures whether the space can be reserved, and parking_lot_type indicates the size of the space.
  • Table2 will contain users:
    • user_uid, user_name, etc.
  • Table3 will contain the reservation info (shard by parking_lot_uid):
    • user_uid, start_time, end_time, parking_lot_uid
  • We will cache the status of the parking lot in a distributed store at all times. We don't need to store this in the DB itself. We will use Redis with persistence enabled, in case of failures. Cars entering and exiting will update the cache; users checking the status will read from it.
  • Users entering the premises without an account won't use the reservation system and will only update the cache.
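
The three tables can be sketched as DDL. The example below uses an in-memory SQLite database purely as a stand-in for the replicated, sharded SQL store. The table names, column types, the reservation_uid surrogate key and the overlapping_reservations helper are all assumptions added for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE resources (
    parking_lot_uid  TEXT PRIMARY KEY,
    parking_lot_type TEXT NOT NULL,      -- size of the space
    is_reservable    INTEGER NOT NULL    -- 1 if the space can be reserved online
);
CREATE TABLE users (
    user_uid  TEXT PRIMARY KEY,
    user_name TEXT NOT NULL
);
CREATE TABLE reservations (
    reservation_uid INTEGER PRIMARY KEY, -- hypothetical surrogate key
    user_uid        TEXT NOT NULL REFERENCES users(user_uid),
    parking_lot_uid TEXT NOT NULL REFERENCES resources(parking_lot_uid),
    start_time      INTEGER NOT NULL,    -- epoch seconds (assumed encoding)
    end_time        INTEGER NOT NULL,
    CHECK (start_time < end_time)
);
"""


def overlapping_reservations(conn, parking_lot_uid, start_time, end_time):
    """Count reservations on one space that overlap [start_time, end_time)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM reservations "
        "WHERE parking_lot_uid = ? AND start_time < ? AND end_time > ?",
        (parking_lot_uid, end_time, start_time),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO resources VALUES ('spot-1', 'compact', 1)")
conn.execute("INSERT INTO users VALUES ('u-1', 'alice')")
conn.execute(
    "INSERT INTO reservations (user_uid, parking_lot_uid, start_time, end_time) "
    "VALUES ('u-1', 'spot-1', 100, 200)"
)
```

A reservation request for a window would first check that overlapping_reservations returns 0 for the candidate space, inside the same transaction that inserts the new row.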



High-level design

  • We will have a load balancer routing to the front-end servers
  • Front-end servers will route traffic to: (1) Redis for checking current availability for walk-ins, (2) the booking system for walk-ins, which includes payment services, (3) Redis for updating the current state of the parking lot (i.e. the count of cars per car type), (4) the booking reservation system
    • Front-end requests to the booking reservation system will first redirect users to the SSO identity provider for authentication
    • Front-end servers will be stateless and replicated across availability zones and regions
  • Redis will be used as a distributed, replicated cache with persistence enabled, keeping response times low for those entering and exiting
  • The booking system for walk-ins will take a distributed lock, using Redis, while a request is in progress. While holding the lock, it decrements the counter for the requested space type, thereby preventing multiple clients from claiming the same spot (we don't want to allow cars in the lot without a guaranteed space). The booking system will have a 2-minute timeout and will exit gracefully if the reservation cannot be made in time.
  • The payment service will be implemented using a third-party solution, e.g. Stripe or PayPal. If a payment fails to go through, we can ask the customer to try paying again.
  • The reservation system will also make use of the real-time status in Redis. However, it will also use SQL tables to persist and check reservations. We require transactional support on all reservation requests, so we use the sharded, replicated SQL DB. We will also use a lock while users are making reservations.
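
The lock-then-decrement step for walk-ins can be sketched in-process. The example below is only a stand-in: it uses threading.Lock where the design calls for a Redis-based distributed lock, and the WalkInCounters class and its method names are hypothetical. The point is the invariant that concurrent kiosks can never claim more spots than exist.

```python
import threading
from typing import Dict, List

LOCK_TIMEOUT_SECONDS = 120  # the 2-minute booking timeout from the design


class WalkInCounters:
    """In-memory stand-in for the per-type walk-in counters kept in Redis."""

    def __init__(self, available: Dict[str, int]) -> None:
        self._available = dict(available)
        # One lock per car type, standing in for one distributed lock per counter.
        self._locks = {car_type: threading.Lock() for car_type in available}

    def try_claim(self, car_type: str, timeout: float = LOCK_TIMEOUT_SECONDS) -> bool:
        """Decrement the counter under the lock; False if full or timed out."""
        lock = self._locks[car_type]
        if not lock.acquire(timeout=timeout):
            return False  # could not get the lock in time: exit gracefully
        try:
            if self._available[car_type] <= 0:
                return False
            self._available[car_type] -= 1
            return True
        finally:
            lock.release()

    def release(self, car_type: str) -> None:
        """Give a spot back, e.g. on exit or when a driver backs out."""
        with self._locks[car_type]:
            self._available[car_type] += 1

    def remaining(self, car_type: str) -> int:
        return self._available[car_type]


# Ten kiosks race for three compact spots; exactly three should succeed.
counters = WalkInCounters({"compact": 3})
results: List[bool] = []


def kiosk() -> None:
    results.append(counters.try_claim("compact"))


threads = [threading.Thread(target=kiosk) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```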




Request flows

  • Parking lot status requests go to the LBs, which forward them to the front-end servers, which route them to the parking lot status servers, which read from the Redis cache.
  • Walk-in requests go to the LBs, which forward them to the front-end servers, which route them to the walk-in booking servers. While an entry is in progress, these servers take a distributed lock and decrement the counter tracking available walk-in spaces (i.e. in the Redis cache). The walk-in booking servers then generate a unique session id, print it on the user's ticket, persist it to Redis and allow the user to enter the premises. If there are any issues, or the user backs out without entering the lot, the counter is incremented again.
  • When walk-in users want to pay for their stay, this unique identifier is presented and looked up in the cache to determine the start time and calculate the total duration and cost. The payment is initiated and the user is prompted for a credit card. An external payment system such as Stripe or PayPal is used. Upon payment, the cache is updated with an exit-by-time attribute, by which the user must exit the premises; otherwise additional payment is required. The cache also records that payment was completed.
  • When walk-in users want to exit, they present their unique session id (this can be part of the previous flow). If payment was successful and the current time is less than the exit-by-time, the gate is lifted, the user exits and the counter is incremented.
  • For reservations, users are authenticated before they can query available spots to reserve. When they want to make a reservation, a SQL transaction is initiated, which decrements the counters and availability for that time frame. There will be an expiry time on the pending reservation. We will persist these pending reservations with their expiry times in a standalone table and will delete them after payments are confirmed. A background cleanup task will check these entries and, if the expiry time is exceeded, free the held resources.
    • For now, let's not allow reservations to be changed.
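
The walk-in ticket, payment and exit checks described above can be sketched as follows. A plain dict stands in for the Redis cache, timestamps are epoch seconds passed in explicitly, and the hourly tariff, the grace period and all function names are hypothetical values chosen only to make the example concrete.

```python
import math
import uuid
from dataclasses import dataclass
from typing import Dict, Optional

HOURLY_RATE_CENTS = 300        # hypothetical tariff
EXIT_GRACE_SECONDS = 15 * 60   # hypothetical time allowed to leave after paying


@dataclass
class Session:
    entry_time: int
    paid: bool = False
    exit_by_time: Optional[int] = None


sessions: Dict[str, Session] = {}  # stand-in for the Redis cache


def issue_ticket(now: int) -> str:
    """Create the unique session id that is printed on the ticket."""
    session_id = uuid.uuid4().hex
    sessions[session_id] = Session(entry_time=now)
    return session_id


def amount_due_cents(session_id: str, now: int) -> int:
    """Charge per started hour, with a minimum of one hour."""
    elapsed = now - sessions[session_id].entry_time
    started_hours = max(1, math.ceil(elapsed / 3600))
    return started_hours * HOURLY_RATE_CENTS


def record_payment(session_id: str, now: int) -> None:
    """Mark the session paid and set the exit-by-time."""
    session = sessions[session_id]
    session.paid = True
    session.exit_by_time = now + EXIT_GRACE_SECONDS


def may_open_gate(session_id: str, now: int) -> bool:
    """The gate lifts only if payment succeeded and now < exit-by-time."""
    session = sessions.get(session_id)
    if session is None or not session.paid or session.exit_by_time is None:
        return False
    return now < session.exit_by_time
```

If may_open_gate returns False after the exit-by-time has passed, the flow would loop back to payment for the additional time.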





Detailed component design

  • The Redis cache is an important part of the design, ensuring fast response times on parking lot status. Because it is replicated and has persistence enabled, it is highly available, reliable, performant and scalable. The number of parking spots in a single parking lot is small, so this is a viable approach.
  • The reservation system uses SQL for transactional support, as we don't want to overallocate. We pre-emptively hold resources while a reservation is in progress, so that users don't have a bad experience due to race conditions when multiple users reserve simultaneously. Our background task performs the necessary cleanup, including after catastrophic failures.
  • All servers are stateless and replicated, ensuring we have the ability to scale.
  • Security:
    • All server-to-server traffic requires mTLS. Externally, we require SSO to authenticate to the reservation system. On the kiosks, we don't require user authentication.
    • We don't store any sensitive user information, though we still encrypt all data at rest. For example, the SQL server has a master encryption key.





Trade offs/Tech choices

  • Choosing a SQL DB gives us transactional requests. We require strong consistency at the cost of performance. To mitigate this, we can shard. For performance, we cache data.
  • We decided not to implement our own authentication or payment services, as these can be expensive to build.
  • We use a Redis cache for fast access, configured with persistence and replication for reliability.
  • We are pessimistic with our parking lot status counters, so that users aren't misled into believing there is availability up until the last minute, only to find out that there isn't. This requires locks on resources, potentially slowing performance down a bit.





Failure scenarios/bottlenecks

  • Failure to pay will leave resources allocated that need to be released. We factor this in and have a background cleanup task so that we can recover in some of these error scenarios.
  • Locks on counters can be expensive, but we are dealing with a fairly small scale and can absorb this cost. To scale to larger loads, we would partition our counters to avoid bottlenecks that would degrade the user experience.
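
One pass of the background cleanup task could look like the sketch below. It assumes holds are kept as in-memory records with an expiry timestamp; in the design they would live in the standalone SQL table, and the PendingHold type and release_expired_holds function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PendingHold:
    car_type: str
    expires_at: int  # epoch seconds


def release_expired_holds(
    pending: Dict[str, PendingHold], available: Dict[str, int], now: int
) -> List[str]:
    """Free every hold whose expiry has passed and return the released ids."""
    expired = [hold_id for hold_id, hold in pending.items() if hold.expires_at <= now]
    for hold_id in expired:
        hold = pending.pop(hold_id)
        available[hold.car_type] += 1  # give the spot back to the pool
    return expired
```

A scheduler would call this periodically. In a real deployment the pop and the increment would need to happen in one transaction so that a crash between them cannot leak or double-free a spot.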



Future improvements

  • To improve/scale, we can shard our counters and reduce lock times.
  • We can add load balancers in different geographic zones too, for additional redundancy.
  • We can also add support for dynamically changing the split between walk-ins and reservations. Our design pre-allocates this, but in the future we can add logic that allows more walk-ins when there are no online reservations.
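
Sharding the counters, as suggested in the first item, might look like the sketch below. Each shard has its own lock, so requests that land on different shards do not contend. This is an in-process illustration with threading.Lock standing in for per-shard distributed locks, and the ShardedCounter class is a hypothetical name.

```python
import threading


class ShardedCounter:
    """Splits one availability counter into independently locked shards."""

    def __init__(self, total: int, shards: int) -> None:
        base, extra = divmod(total, shards)
        # Spread the total as evenly as possible across the shards.
        self._counts = [base + (1 if i < extra else 0) for i in range(shards)]
        self._locks = [threading.Lock() for _ in range(shards)]

    def try_decrement(self, preferred_shard: int) -> bool:
        """Take a spot from the preferred shard, falling back to the others."""
        n = len(self._counts)
        for offset in range(n):
            i = (preferred_shard + offset) % n
            with self._locks[i]:
                if self._counts[i] > 0:
                    self._counts[i] -= 1
                    return True
        return False

    def increment(self, shard: int) -> None:
        """Return a spot to the given shard."""
        i = shard % len(self._counts)
        with self._locks[i]:
            self._counts[i] += 1

    def total(self) -> int:
        """Sum of all shards; only approximate while updates are in flight."""
        return sum(self._counts)
```

Each kiosk could use its own index as the preferred shard, so kiosks only contend with each other once their own shard runs dry.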