System requirements


Functional:

  1. Design an internal payment system for an e-commerce application that integrates with a PSP (Payment Service Provider).
  2. Supported payment methods: credit card, debit card, wallet, QR.
  3. Do we need to support international transactions? Yes.
  4. How many transactions per day? 100M.
  5. Anything else? Yes, reconciliation.
  6. Both pay-in and pay-out functionality.


Non-Functional:

  1. Reliable and consistent
  2. Low latency
  3. Fault tolerant


Capacity estimation

  1. Writes: 100M per day = 100*10^6/10^5 = ~1,000 per sec (rounding a day to 10^5 seconds; the exact 86,400 seconds gives about 1,157 per sec)
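
As a quick sanity check on the write rate, the arithmetic can be worked through in a few lines. This is only a back-of-the-envelope sketch: it assumes traffic is spread uniformly over the day, which real payment traffic is not, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60        # 86,400
TRANSACTIONS_PER_DAY = 100_000_000    # 100M, from the functional requirements

# Exact average, and the common shortcut of rounding a day to 10^5 seconds.
exact_tps = TRANSACTIONS_PER_DAY / SECONDS_PER_DAY
rough_tps = TRANSACTIONS_PER_DAY / 10**5

print(round(exact_tps), int(rough_tps))  # 1157 1000
```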


API design

  1. /v1/payments/event - POST

Req: checkout_id, user_info, payment_details

Resp: status - success/failure

  1. /v1/orders/{order_id} - POST

Req: order_id, payment_details, amount

Resp: status

  1. /v1/ledger - POST

Req: user_id, amount, transact_type: credit/debit

Resp :

  1. /v1/wallet - POST

Req: seller_id, amount, transact_type: credit/debit

Resp: status
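
To make the request shapes above concrete, here is a minimal sketch of two of the request bodies as Python dataclasses. The field names come from the API list; the field types, the use of minor units (cents) for `amount`, and the validation rules are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict

VALID_TRANSACT_TYPES = {"credit", "debit"}


@dataclass
class PaymentEventRequest:
    """Body of POST /v1/payments/event."""
    checkout_id: str
    user_info: dict        # assumed shape, e.g. {"user_id": "u-1"}
    payment_details: dict  # assumed shape, e.g. {"method": "credit_card"}


@dataclass
class LedgerRequest:
    """Body of POST /v1/ledger."""
    user_id: str
    amount: int            # assumed to be in minor units (cents)
    transact_type: str     # "credit" or "debit"

    def __post_init__(self):
        # Reject malformed requests before they reach the ledger.
        if self.transact_type not in VALID_TRANSACT_TYPES:
            raise ValueError("transact_type must be 'credit' or 'debit'")
        if self.amount <= 0:
            raise ValueError("amount must be positive")


req = LedgerRequest(user_id="u-1", amount=100, transact_type="debit")
print(asdict(req))
```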


Database design

  1. The primary databases in this design are payment_event, payment_order, ledger, and wallet.

Payment_event:

  1. checkout_id
  2. user_info
  3. payment_details
  4. amount
  5. is_payment_done
  6. is_ledger_updated
  7. is_wallet_updated

Payment_order:

  1. order_id
  2. user_info
  3. payment_info
  4. amount
  5. seller_id

Ledger:

  1. ledger_id
  2. user_id
  3. amount
  4. transact_type

Wallet:

  1. wallet_id
  2. seller_id
  3. bal_amount
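
The four tables above could be declared roughly as follows. This is a sketch using an in-memory SQLite database purely so it runs anywhere; the column types, keys, defaults, and the CHECK constraint are assumptions, since the notes list only field names.

```python
import sqlite3

SCHEMA = """
CREATE TABLE payment_event (
    checkout_id       TEXT PRIMARY KEY,
    user_info         TEXT,
    payment_details   TEXT,
    amount            INTEGER NOT NULL,            -- assumed: minor units (cents)
    is_payment_done   INTEGER NOT NULL DEFAULT 0,
    is_ledger_updated INTEGER NOT NULL DEFAULT 0,
    is_wallet_updated INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE payment_order (
    order_id     TEXT PRIMARY KEY,
    user_info    TEXT,
    payment_info TEXT,
    amount       INTEGER NOT NULL,
    seller_id    TEXT NOT NULL
);
CREATE TABLE ledger (
    ledger_id     INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id       TEXT NOT NULL,
    amount        INTEGER NOT NULL,
    transact_type TEXT NOT NULL CHECK (transact_type IN ('credit', 'debit'))
);
CREATE TABLE wallet (
    wallet_id  INTEGER PRIMARY KEY AUTOINCREMENT,
    seller_id  TEXT NOT NULL UNIQUE,
    bal_amount INTEGER NOT NULL DEFAULT 0
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
```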


High-level design

  1. The core services of the payment system are:
    1. Payment Service
    2. Order Service
    3. Ledger Service
    4. Wallet Service
    5. PSP (third-party service)
  2. Supporting services include:
    1. Reconciliation service
    2. Analytics
    3. Billing service, etc.


Request flows

Defined in the diagram.





Detailed component design

  1. Payment Service
  2. Order Service
  3. Ledger Service
  4. Wallet Service
  5. PSP (third-party service)
  6. Reconciliation service
  7. Analytics
  8. Billing service, etc.

Due to time constraints, I will cover the core services first and touch on the other services only at a high level.

Payment service:

  1. This service receives the payment event from the checkout page.
  2. A single checkout request can contain multiple orders, each belonging to a different seller.
  3. The split of an event into multiple orders takes place in this service.
  4. It first logs the payment event in events_db, then sends the individual orders to the Order Service.
  5. This service orchestrates the whole payment flow and is responsible for updating the payment status (is_payment_done) in events_db once all the orders have completed successfully.
  6. It also talks to the Ledger Service and the Wallet Service.
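
The split and the completion check described above can be sketched as follows. The item shape (`seller_id`, `amount`), the order-id format, and the `"success"` status string are hypothetical choices made for this example, not part of the design.

```python
from collections import defaultdict


def split_checkout(checkout_id, items):
    """Group checkout line items into one order per seller."""
    totals = defaultdict(int)
    for item in items:
        totals[item["seller_id"]] += item["amount"]
    # Order ids derived from the checkout id are an assumption for illustration.
    return [
        {"order_id": f"{checkout_id}-{n}", "seller_id": seller_id, "amount": amount}
        for n, (seller_id, amount) in enumerate(totals.items(), start=1)
    ]


def all_orders_succeeded(order_statuses):
    """The event is marked done only when every order has succeeded."""
    return bool(order_statuses) and all(s == "success" for s in order_statuses)


orders = split_checkout("c-42", [
    {"seller_id": "s-1", "amount": 500},
    {"seller_id": "s-2", "amount": 300},
    {"seller_id": "s-1", "amount": 200},
])
print(orders)
```

Here the three line items collapse into two orders, one per seller, and the payment event would be flagged as done only after both report success.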

Order Service:

  1. This service talks to the PSP and makes sure each order completes successfully.
  2. It logs the individual order in orders_db before sending the request to the PSP.

Ledger Service:

  1. The Ledger Service implements double-entry bookkeeping: for every transaction it holds immutable entries for both the debit and the credit.

Eg: User A - Debit $1

User B - Credit $1

  1. At the end, the credits and debits in this table must net to zero; if they do not, an audit has to be raised.

This service talks to ledger_db.
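
A minimal in-memory sketch of the double-entry rule follows. The `LedgerEntry` type and the helper names are hypothetical; amounts are assumed to be integers in minor units (cents) to avoid floating-point rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: entries are immutable once written
class LedgerEntry:
    user_id: str
    amount: int
    transact_type: str  # "debit" or "credit"


def record_transfer(ledger, payer, payee, amount):
    """Append a matching debit and credit for a single transaction."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    ledger.append(LedgerEntry(payer, amount, "debit"))
    ledger.append(LedgerEntry(payee, amount, "credit"))


def net_total(ledger):
    """Credits minus debits; anything other than zero should trigger an audit."""
    return sum(e.amount if e.transact_type == "credit" else -e.amount for e in ledger)


ledger = []
record_transfer(ledger, "user_a", "user_b", 100)  # User A pays User B $1.00
print(net_total(ledger))  # 0
```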

Wallet Service:

  1. This service maintains each seller's account balance. Sellers are not paid immediately when the order is placed; they are paid once the product is delivered.
  2. The platform fee is deducted and the rest is paid to the seller, for example on a monthly basis or when the seller raises a pay-out request.
  3. This service talks to wallet_db.
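
The balance handling can be sketched like this. The `Wallet` class, the fee expressed in basis points (`fee_bps`), and the method names are assumptions for illustration; the notes do not specify how the platform fee is calculated.

```python
class Wallet:
    """In-memory stand-in for one row of the wallet table."""

    def __init__(self, seller_id):
        self.seller_id = seller_id
        self.bal_amount = 0  # assumed: minor units (cents)

    def credit_on_delivery(self, order_amount, fee_bps):
        """Credit the seller once the product is delivered, net of the platform fee."""
        fee = order_amount * fee_bps // 10_000  # 100 bps = 1%
        self.bal_amount += order_amount - fee
        return fee

    def pay_out(self, amount):
        """Pay out on a schedule or on seller request; never overdraw."""
        if amount <= 0 or amount > self.bal_amount:
            raise ValueError("invalid pay-out amount")
        self.bal_amount -= amount


wallet = Wallet("s-1")
fee = wallet.credit_on_delivery(order_amount=10_000, fee_bps=250)  # 2.5% fee
print(fee, wallet.bal_amount)  # 250 9750
```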


Trade offs/Tech choices

A relational database would be the best option, since payment data needs ACID transactions and strong consistency.





Failure scenarios/bottlenecks

Ensuring Scalability:

  1. To avoid tight coupling and reduce latency, communication between the services is asynchronous: the order message is placed in a queue, and a worker service processes the messages in the queue and sends them to the PSP.
  2. Later, the PSP replies to the webhook URL agreed upon during setup.
  3. A few PSPs also provide a polling mechanism to retrieve the status. Polling adds load on the PSP's system, but it can be used as a fallback to check whether a transaction succeeded.
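
The queue-plus-webhook flow above can be sketched with Python's standard `queue` module standing in for a real message broker. All function names, the status strings, and the webhook payload shape are hypothetical; a real PSP defines its own callback format.

```python
import queue

order_queue = queue.Queue()
order_status = {}  # stand-in for the status column in orders_db


def submit_order(order):
    """Called by the producer: record the order and return immediately."""
    order_status[order["order_id"]] = "pending"
    order_queue.put(order)


def drain_queue(send_to_psp):
    """Worker loop body: forward every queued order to the PSP."""
    while True:
        try:
            order = order_queue.get_nowait()
        except queue.Empty:
            break
        send_to_psp(order)
        order_status[order["order_id"]] = "sent_to_psp"
        order_queue.task_done()


def handle_webhook(payload):
    """Called when the PSP posts the final result to the webhook URL."""
    order_status[payload["order_id"]] = payload["status"]


sent = []
submit_order({"order_id": "o-1", "amount": 700})
drain_queue(sent.append)  # sent.append plays the role of the PSP client
handle_webhook({"order_id": "o-1", "status": "success"})
print(order_status)
```

The producer never waits on the PSP: it only enqueues, and the final status arrives later through the webhook handler.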

Avoid Double Payment and ensuring reliability:

  1. In a payment system, ensuring exactly-once delivery of a payment is of utmost importance; it is the core requirement of the design. This is achieved using retries together with idempotency.
  2. Exactly-once payment combines two guarantees:
    1. At-least-once, through the retry mechanism
    2. At-most-once, through an idempotency key
Available retry mechanisms are:

  1. Fixed-interval retry
  2. Incremental-interval retry
  3. Exponential backoff - the best one

Idempotency:

  1. To achieve idempotency, the checkout_id is sent to the PSP and a token is received for it, even before the actual payment request is made.
  2. When the user double-clicks the pay button, or the PSP's response does not reach the client and the same request is re-sent, the same token is sent from the checkout page, so the PSP can identify the earlier request and respond accordingly.
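
Exponential backoff, the retry option favoured above, can be sketched as follows. The base delay, multiplier, cap, attempt count, and the choice of `ConnectionError` as the retryable failure are all assumptions; production systems usually also add random jitter to the delays. The wrapped call is expected to carry the same token on every attempt so that the PSP can deduplicate it.

```python
import time


def backoff_delays(base=0.5, factor=2.0, max_delay=30.0, attempts=5):
    """Delay before each retry: base, base*factor, base*factor^2, ... capped."""
    return [min(base * factor**n, max_delay) for n in range(attempts)]


def call_with_retry(fn, attempts=5, base=0.5, sleep=time.sleep):
    """Call fn until it succeeds, waiting exponentially longer between tries."""
    delays = backoff_delays(base=base, attempts=attempts)
    for n, delay in enumerate(delays):
        try:
            return fn()
        except ConnectionError:
            if n == attempts - 1:
                raise  # out of attempts: surface the failure
            sleep(delay)


print(backoff_delays())  # [0.5, 1.0, 2.0, 4.0, 8.0]
```

Passing `sleep` as a parameter is a design choice that keeps the helper testable, since a test can record the delays instead of actually waiting.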


Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?