My Solution for Design a Public Transportation System

by nectar4678

System requirements


Functional:

Route Planning

  • Provide optimal routes for various types of public transport (buses, trams, subways).
  • Support dynamic route adjustments based on real-time traffic data.

Scheduling

  • Manage and display timetables for all transportation modes.
  • Allow for real-time updates to schedules based on delays or disruptions.

Fare Collection

  • Implement a digital fare collection system supporting multiple payment methods.
  • Ensure secure transactions and data protection.

Real-time Vehicle Tracking

  • Track vehicle locations using GPS and display real-time updates to passengers.
  • Integrate with passenger information displays at stops and stations.

Passenger Information Displays

  • Show arrival and departure times, delays, and route changes.
  • Provide multi-lingual support.

Multi-modal Integration

  • Enable seamless transfers between different modes of transportation.
  • Provide unified ticketing and scheduling for buses, trains, bikes, etc.

Accessibility

  • Ensure the system is accessible to people with disabilities.
  • Include features like audio announcements, tactile maps, and mobile app accessibility options.

Environmental Impact

  • Minimize emissions through efficient route planning and scheduling.
  • Promote the use of electric or hybrid vehicles where possible.


Non-Functional:

Scalability

  • Handle high volumes of passengers and data without performance degradation.
  • Support future expansion in terms of both users and geographical coverage.

Reliability

  • Ensure high availability and fault tolerance.
  • Implement robust disaster recovery mechanisms.

Performance

  • Provide fast response times for route planning and real-time updates.
  • Optimize backend processes to handle peak loads efficiently.

Security

  • Ensure data security and privacy for all transactions and user data.
  • Implement strong access control and encryption mechanisms.

Usability

  • Design user-friendly interfaces for both web and mobile applications.
  • Ensure the system is intuitive and easy to navigate for all users.

Maintainability

  • Use modular architecture to facilitate easy updates and maintenance.
  • Document the system thoroughly for future developers and operators.

Compliance

  • Adhere to local and international regulations regarding public transportation.
  • Ensure accessibility standards are met.





Capacity estimation

Assumptions

  • Urban Area Population: The system is designed for a city with a population of 1 million people.
  • Daily Users: Approximately 10% of the population uses public transportation daily, resulting in 100,000 daily users.
  • Peak Hours: 20% of daily users commute during peak hours (2 hours in the morning and 2 hours in the evening).
  • Vehicle Fleet:
      • 500 buses
      • 100 trams
      • 50 subways
  • Average Journey Time:
      • Buses: 30 minutes
      • Trams: 20 minutes
      • Subways: 15 minutes
  • Frequency of Updates: Vehicle locations are updated every 10 seconds.

Calculations

  1. Peak Hour Users = 100,000 × 0.2 = 20,000 users
  2. Users per Peak Hour = 20,000 / 4 = 5,000 users per hour
  3. Requests per Second During Peak Hours:
      • Assuming each user makes 2 requests per journey (one for departure and one for arrival).
      • Total Requests per Peak Hour = 5,000 × 2 = 10,000 requests per hour
      • Requests per Second = 10,000 / 3,600 ≈ 2.78 requests per second
  4. Real-time Vehicle Updates:
      • Total vehicles: 500 + 100 + 50 = 650 vehicles
      • Updates per Second = 650 / 10 = 65 updates per second
  5. Storage Requirements:
      • Assuming each user journey record is 1 KB and each daily user makes 2 journeys.
      • Daily User Journeys = 100,000 × 2 = 200,000 journeys
      • Daily Data Storage = 200,000 × 1 KB = 200,000 KB = 200 MB
      • Yearly Data Storage (assuming 365 days) = 200 MB × 365 = 73,000 MB ≈ 71 GB
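
The arithmetic above can be reproduced with a short script, which makes it easy to re-run the estimate when an assumption changes. This is only a sketch: the constant names are made up for illustration, and the unit conversions mirror the ones used in the calculations (1,000 KB per MB, 1,024 MB per GB).

```python
# Back-of-the-envelope capacity estimate using the assumptions listed above.
DAILY_USERS = 100_000          # 10% of a population of 1 million
PEAK_SHARE = 0.20              # share of daily users travelling in peak hours
PEAK_HOURS = 4                 # 2 hours in the morning + 2 in the evening
REQUESTS_PER_JOURNEY = 2       # one at departure, one at arrival
VEHICLES = 500 + 100 + 50      # buses + trams + subways
UPDATE_INTERVAL_S = 10         # each vehicle reports its location every 10 s
JOURNEYS_PER_USER = 2          # implied by 100,000 users -> 200,000 journeys
JOURNEY_RECORD_KB = 1          # size of one stored journey record

peak_users = DAILY_USERS * PEAK_SHARE
users_per_peak_hour = peak_users / PEAK_HOURS
requests_per_second = users_per_peak_hour * REQUESTS_PER_JOURNEY / 3600
vehicle_updates_per_second = VEHICLES / UPDATE_INTERVAL_S
daily_storage_mb = DAILY_USERS * JOURNEYS_PER_USER * JOURNEY_RECORD_KB / 1000
yearly_storage_gb = daily_storage_mb * 365 / 1024

print(f"Requests per second (peak): {requests_per_second:.2f}")
print(f"Vehicle updates per second: {vehicle_updates_per_second:.0f}")
print(f"Daily storage: {daily_storage_mb:.0f} MB")
print(f"Yearly storage: {yearly_storage_gb:.0f} GB")
```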

Summary

  • Users per Peak Hour: 5,000 (20,000 users across the 4 peak hours)
  • Requests per Second: ~2.78
  • Vehicle Updates per Second: 65
  • Daily Data Storage: 200 MB
  • Yearly Data Storage: 71 GB
  • Database Throughput: ~1.94 reads/second, ~0.84 writes/second (the ~2.78 requests per second above, assuming a roughly 70/30 read/write split)

These estimates give a baseline for provisioning the system at peak load. Vehicle location updates (65 per second) outweigh passenger requests (~2.78 per second) by a wide margin.





API design

Route Planning API

Endpoint: /api/routes
Method: GET
Description: Get optimal routes for public transportation.

Request Parameters:
  • start_location (string): Starting point of the journey.
  • end_location (string): Destination point of the journey.
  • mode (string): Mode of transportation (bus, tram, subway).

Example Request:
GET /api/routes?start_location=Downtown&end_location=Airport&mode=bus

Response Body:
{
  "routes": [
    {
      "route_id": "1",
      "mode": "bus",
      "duration": "30 mins",
      "stops": ["Downtown", "Central Station", "Airport"]
    }
  ]
}
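
Because this endpoint uses GET, one way a client might send the three parameters is as a URL query string. The sketch below builds such a URL with Python's standard library. The host `transit.example.com` and the helper name `build_routes_url` are placeholders, not part of the design.

```python
from urllib.parse import urlencode

BASE_URL = "https://transit.example.com"  # placeholder host, not a real service


def build_routes_url(start_location: str, end_location: str, mode: str) -> str:
    """Encode the Route Planning API parameters as a query string."""
    query = urlencode({
        "start_location": start_location,
        "end_location": end_location,
        "mode": mode,
    })
    return f"{BASE_URL}/api/routes?{query}"


url = build_routes_url("Downtown", "Airport", "bus")
print(url)
```

`urlencode` also escapes spaces and other reserved characters, so a stop name such as "Central Station" is safe to pass through.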


Scheduling API

Endpoint: /api/schedules
Method: GET
Description: Retrieve the schedule for a specific route.

Request Parameters:
  • route_id (string): ID of the route.

Example Request:
GET /api/schedules?route_id=1

Response Body:
{
  "schedule": [
    {
      "stop": "Downtown",
      "arrival_time": "08:00 AM",
      "departure_time": "08:05 AM"
    },
    {
      "stop": "Central Station",
      "arrival_time": "08:20 AM",
      "departure_time": "08:25 AM"
    },
    {
      "stop": "Airport",
      "arrival_time": "08:45 AM"
    }
  ]
}


Fare Collection API

Endpoint: /api/fares
Method: POST
Description: Process fare collection for a journey.

Request Parameters:
  • user_id (string): ID of the user.
  • route_id (string): ID of the route.
  • payment_method (string): Payment method (credit card, mobile payment, etc.).

Request Body:
{
  "user_id": "12345",
  "route_id": "1",
  "payment_method": "credit_card"
}

Response Body:
{
  "status": "success",
  "transaction_id": "67890",
  "amount": "2.50"
}


Real-time Vehicle Tracking API

Endpoint: /api/vehicles
Method: GET
Description: Get real-time location of vehicles.

Request Parameters:
  • vehicle_id (string): ID of the vehicle.

Example Request:
GET /api/vehicles?vehicle_id=bus_123

Response Body:
{
  "vehicle_id": "bus_123",
  "location": {
    "latitude": "40.712776",
    "longitude": "-74.005974"
  },
  "status": "on_route",
  "next_stop": "Central Station",
  "eta": "5 mins"
}


Passenger Information Displays API

Endpoint: /api/displays
Method: GET
Description: Get information for passenger displays.

Request Parameters:
  • stop_id (string): ID of the stop or station.

Example Request:
GET /api/displays?stop_id=central_station

Response Body:
{
  "stop_id": "central_station",
  "arrivals": [
    {
      "route_id": "1",
      "mode": "bus",
      "arrival_time": "08:05 AM",
      "status": "on_time"
    },
    {
      "route_id": "2",
      "mode": "tram",
      "arrival_time": "08:10 AM",
      "status": "delayed"
    }
  ]
}


Multi-modal Integration API

Endpoint: /api/multimodal
Method: GET
Description: Get integrated routes across multiple transportation modes.

Request Parameters:
  • start_location (string): Starting point of the journey.
  • end_location (string): Destination point of the journey.

Example Request:
GET /api/multimodal?start_location=Downtown&end_location=Airport

Response Body:
{
  "routes": [
    {
      "steps": [
        {
          "mode": "bus",
          "route_id": "1",
          "start_location": "Downtown",
          "end_location": "Central Station",
          "duration": "15 mins"
        },
        {
          "mode": "subway",
          "route_id": "2",
          "start_location": "Central Station",
          "end_location": "Airport",
          "duration": "20 mins"
        }
      ]
    }
  ]
}




Database design

To address the functional requirements, we'll design the database to handle route planning, scheduling, fare collection, real-time vehicle tracking, passenger information displays, and multi-modal integration. We'll use a relational database and create an Entity-Relationship (ER) diagram to illustrate the relationships between different entities.


Entities and Attributes


User

  • user_id (Primary Key)
  • name
  • email
  • password_hash (a salted hash, never the plaintext password)
  • role (admin, passenger)


Vehicle

  • vehicle_id (Primary Key)
  • type (bus, tram, subway)
  • capacity
  • current_location (latitude, longitude)


Route

  • route_id (Primary Key)
  • mode (bus, tram, subway)
  • start_location
  • end_location
  • duration


Stop

  • stop_id (Primary Key)
  • name
  • latitude
  • longitude


Schedule

  • schedule_id (Primary Key)
  • route_id (Foreign Key)
  • stop_id (Foreign Key)
  • arrival_time
  • departure_time


Fare

  • fare_id (Primary Key)
  • user_id (Foreign Key)
  • route_id (Foreign Key)
  • amount
  • payment_method
  • transaction_id


Vehicle_Tracking

  • tracking_id (Primary Key)
  • vehicle_id (Foreign Key)
  • timestamp
  • location (latitude, longitude)
  • status (on_route, delayed, etc.)
  • next_stop (Foreign Key)


Display

  • display_id (Primary Key)
  • stop_id (Foreign Key)
  • info (arrival/departure times, delays, etc.)
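
To make the relationships concrete, here is a minimal sketch of three of the entities above (Route, Stop, Schedule) as relational tables, using Python's built-in `sqlite3` module with an in-memory database. The column types, the `duration_min` column name, and the sample rows (including the coordinates) are illustrative assumptions. A production system would more likely run on a server database such as PostgreSQL.

```python
import sqlite3

SCHEMA = """
CREATE TABLE route (
    route_id       TEXT PRIMARY KEY,
    mode           TEXT NOT NULL CHECK (mode IN ('bus', 'tram', 'subway')),
    start_location TEXT NOT NULL,
    end_location   TEXT NOT NULL,
    duration_min   INTEGER NOT NULL
);
CREATE TABLE stop (
    stop_id   TEXT PRIMARY KEY,
    name      TEXT NOT NULL,
    latitude  REAL NOT NULL,
    longitude REAL NOT NULL
);
CREATE TABLE schedule (
    schedule_id    INTEGER PRIMARY KEY,
    route_id       TEXT NOT NULL REFERENCES route(route_id),
    stop_id        TEXT NOT NULL REFERENCES stop(stop_id),
    arrival_time   TEXT,
    departure_time TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(SCHEMA)

# Illustrative sample data.
conn.execute("INSERT INTO route VALUES ('1', 'bus', 'Downtown', 'Airport', 30)")
conn.executemany(
    "INSERT INTO stop VALUES (?, ?, ?, ?)",
    [("downtown", "Downtown", 40.7128, -74.0060),
     ("airport", "Airport", 40.6413, -73.7781)],
)
conn.executemany(
    "INSERT INTO schedule (route_id, stop_id, arrival_time, departure_time) "
    "VALUES (?, ?, ?, ?)",
    [("1", "downtown", "08:00", "08:05"), ("1", "airport", "08:45", None)],
)

# The timetable for a route is a join between Schedule and Stop.
timetable = conn.execute(
    """SELECT s.name, sc.arrival_time, sc.departure_time
       FROM schedule sc JOIN stop s ON s.stop_id = sc.stop_id
       WHERE sc.route_id = ?
       ORDER BY sc.arrival_time""",
    ("1",),
).fetchall()
print(timetable)
```

The foreign keys on `schedule` mean a timetable row cannot reference a route or stop that does not exist.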





High-level design

  • Client Applications:
      • Mobile App: For passengers to check schedules, plan routes, and make payments.
      • Web App: For passengers and administrators to access the system's features.
  • API Gateway: Manages and routes incoming API requests to appropriate services.
  • Route Planning Service: Calculates optimal routes based on real-time data.
  • Scheduling Service: Manages timetables and updates schedules.
  • Fare Collection Service: Handles fare transactions and payment processing.
  • Real-time Vehicle Tracking Service: Tracks vehicle locations and updates statuses.
  • Passenger Information Display Service: Provides real-time information for displays at stops and stations.
  • Multi-modal Integration Service: Integrates routes and schedules across different transportation modes.
  • Database: Stores all relevant data for users, vehicles, routes, schedules, fares, and tracking.
  • Notification Service: Sends alerts and notifications to passengers regarding delays, changes, etc.
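
As a rough illustration of the API Gateway's routing role, the sketch below maps the endpoint prefixes from the API design section to backend services. The service names are invented for the example, and a real gateway would also handle authentication, rate limiting, and load balancing.

```python
from typing import Optional

# Endpoint prefix -> backend service. The service names are hypothetical.
SERVICE_ROUTES = {
    "/api/routes": "route-planning-service",
    "/api/schedules": "scheduling-service",
    "/api/fares": "fare-collection-service",
    "/api/vehicles": "vehicle-tracking-service",
    "/api/displays": "display-service",
    "/api/multimodal": "multimodal-service",
}


def resolve_service(path: str) -> Optional[str]:
    """Return the backend service for a request path, or None if unknown."""
    for prefix, service in SERVICE_ROUTES.items():
        # Match the prefix itself or any sub-path beneath it.
        if path == prefix or path.startswith(prefix + "/"):
            return service
    return None


print(resolve_service("/api/fares"))
```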






Request flows

Route Planning Request Flow


Scheduling Request Flow



Fare Collection Request Flow



Real-time Vehicle Tracking Request Flow


Passenger Information Display Request Flow




Detailed component design

We'll delve deeper into three key components: Route Planning Service, Real-time Vehicle Tracking Service, and Fare Collection Service. Each component will be detailed in terms of architecture, data flow, and scalability.


Route Planning Service

Description: This service calculates optimal routes for users based on their start and end locations, considering real-time traffic and vehicle data.

Route Planning Service Architecture

Data Flow

  1. Route Request Handler: Receives route requests from the API Gateway.
  2. Route Optimization Engine: Calculates the optimal route using the current traffic and vehicle data.
  3. Traffic Data Integrator: Fetches real-time traffic data.
  4. Vehicle Availability Checker: Checks the availability and status of vehicles.
  5. Database: Stores and retrieves route information.

Route Planning Algorithm

  • Dijkstra’s Algorithm: Used for finding the shortest path between nodes in a graph.
  • Real-time Data Integration: Adjusts routes dynamically based on live traffic updates and vehicle statuses.
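
A minimal sketch of Dijkstra's algorithm over a stop graph is shown below. Edge weights are travel times in minutes; updating a weight from live traffic data and re-running the search is one simple way to get the dynamic adjustment described above. The network (including the "Harbor" stop) and its travel times are made up for illustration.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm. graph maps a stop to a list of (neighbor, minutes)."""
    best = {source: 0}          # best known travel time to each stop
    previous = {}               # predecessor of each stop on its best path
    queue = [(0, source)]       # min-heap of (travel time, stop)
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue            # stale queue entry, a shorter path was found
        for neighbor, minutes in graph.get(node, []):
            new_cost = cost + minutes
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return None, []         # target is unreachable
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


# Hypothetical network with travel times in minutes.
network = {
    "Downtown": [("Central Station", 15), ("Harbor", 10)],
    "Harbor": [("Airport", 35)],
    "Central Station": [("Airport", 20)],
}

minutes, stops = shortest_path(network, "Downtown", "Airport")
print(minutes, stops)
```

Dijkstra's algorithm requires non-negative edge weights, which holds for travel times.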

Scalability

  • Microservices Architecture: Each component can scale independently.
  • Caching: Frequently requested routes and traffic data can be cached to reduce computation time.
  • Load Balancing: Distribute incoming requests across multiple instances of the Route Planning Service.
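
The caching point can be sketched as a small in-process cache with a time-to-live, so that cached routes expire once traffic conditions are likely to have changed. The class name, the 30-second TTL, and the injectable clock are assumptions for the example. A shared cache such as Redis would be the more likely choice once the service runs on several instances.

```python
import time


class TTLCache:
    """Cache values for a fixed number of seconds, then recompute them."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._entries = {}           # key -> (expiry time, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]            # still fresh, skip the computation
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value


cache = TTLCache(ttl_seconds=30)
route = cache.get_or_compute(
    ("Downtown", "Airport", "bus"),
    lambda: ["Downtown", "Central Station", "Airport"],  # stands in for the optimizer
)
print(route)
```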

Real-time Vehicle Tracking Service

Description: This service tracks the real-time location of vehicles and updates their status.

Real-time Vehicle Tracking Service Architecture


Data Flow

  1. Vehicle Location Receiver: Receives GPS data from vehicles.
  2. Location Processor: Processes the GPS data to determine the vehicle’s current location.
  3. Status Updater: Updates the vehicle's status and next stop information in the database.
  4. Database: Stores and retrieves vehicle location data.
  5. Notification Engine: Sends alerts if there are significant deviations from the schedule.
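
The Location Processor and Notification Engine steps might look roughly like the sketch below: estimate the remaining distance to the next stop with the haversine formula, turn it into an ETA, and flag the vehicle when the projected delay exceeds a threshold. The 20 km/h average speed and the 5-minute alert threshold are assumptions, not values from the design.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


@dataclass
class Ping:
    """One GPS update received from a vehicle."""
    vehicle_id: str
    latitude: float
    longitude: float
    timestamp_s: float


def estimate_delay_s(ping, stop_lat, stop_lon, scheduled_arrival_s, avg_speed_kmh=20.0):
    """Projected arrival time at the next stop minus the scheduled arrival time."""
    distance_km = haversine_km(ping.latitude, ping.longitude, stop_lat, stop_lon)
    eta_s = ping.timestamp_s + distance_km / avg_speed_kmh * 3600
    return eta_s - scheduled_arrival_s


def needs_alert(delay_s, threshold_s=300):
    """True when the projected delay exceeds the alert threshold."""
    return delay_s > threshold_s
```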

Scalability

  • Stream Processing: Handle incoming GPS data streams using tools like Apache Kafka.
  • Partitioning: Divide the processing load by vehicle type or geographical area.
  • High Availability: Ensure redundancy and failover mechanisms for continuous tracking.
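
One way to partition the load is to hash each vehicle ID to a fixed partition, so all updates for a vehicle are handled by the same worker and stay in order. The sketch below is independent of any particular broker and is not how Kafka's own partitioner is implemented; the function name and partition count are illustrative.

```python
import hashlib


def partition_for(vehicle_id: str, num_partitions: int) -> int:
    """Map a vehicle ID to a stable partition number.

    A cryptographic digest is used instead of the built-in hash(), which is
    randomized per process for strings and would not be stable across workers.
    """
    digest = hashlib.sha256(vehicle_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


print(partition_for("bus_123", 8))
```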


Fare Collection Service

Description: This service processes fare payments, manages transactions, and ensures secure payment handling.


Data Flow

  1. Payment Request Handler: Receives payment requests from the API Gateway.
  2. Transaction Processor: Processes the payment and generates a transaction record.
  3. Payment Gateway Integrator: Communicates with external payment gateways to complete the transaction.
  4. Database: Stores transaction records and fare details.
  5. Security Module: Ensures secure handling of payment data, including encryption and fraud detection.
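
The flow above can be sketched as a small processor that forwards a tokenized payment to a gateway and records the result. The idempotency key is an addition not mentioned in the design: it is a common way to keep a retried request from charging the passenger twice. The `charge` callable stands in for a third-party gateway client, and the in-memory dictionary stands in for the database.

```python
import uuid
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    transaction_id: str
    user_id: str
    route_id: str
    amount: Decimal
    status: str


class FareProcessor:
    def __init__(self, charge):
        # charge(payment_token, amount) -> bool; hypothetical gateway client
        self._charge = charge
        self._by_key = {}  # idempotency key -> Transaction (stands in for the DB)

    def pay(self, idempotency_key, user_id, route_id, amount, payment_token):
        """Charge once per idempotency key and return the stored transaction."""
        if idempotency_key in self._by_key:
            return self._by_key[idempotency_key]  # retry: do not charge again
        ok = self._charge(payment_token, amount)
        tx = Transaction(
            transaction_id=str(uuid.uuid4()),
            user_id=user_id,
            route_id=route_id,
            amount=amount,
            status="success" if ok else "failed",
        )
        self._by_key[idempotency_key] = tx
        return tx
```

Only a payment token reaches this service; the raw card details stay with the payment gateway.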

Scalability

  • Stateless Processing: Handle each payment request independently to enable horizontal scaling.
  • Secure Scaling: Use tokenization and encryption so that additional service instances never handle raw payment data.
  • Payment Gateway Load Balancing: Distribute payment requests across multiple payment gateways to avoid overloads.


Route Planning Service Sequence Diagram



Real-time Vehicle Tracking Service Sequence Diagram


Fare Collection Service Sequence Diagram


Trade offs/Tech choices

Database Choice: SQL vs. NoSQL

  • SQL Databases (e.g., PostgreSQL, MySQL):
      • Pros: ACID compliance, strong consistency, powerful query capabilities.
      • Cons: May become a bottleneck at scale, complex schema management.
  • NoSQL Databases (e.g., MongoDB, Cassandra):
      • Pros: High scalability, flexible schema, better performance for large datasets.
      • Cons: Weaker consistency, limited query capabilities.

Choice: SQL database for structured, transactional data (routes, schedules, fares) and NoSQL database for high-volume, write-heavy data (real-time vehicle tracking).


Real-time Data Processing: Stream vs. Batch

  • Stream Processing (e.g., Apache Kafka, Apache Flink):
      • Pros: Low latency, real-time insights, suitable for continuous data flow.
      • Cons: More complex to implement, higher operational overhead.
  • Batch Processing (e.g., Apache Hadoop, Apache Spark):
      • Pros: Simpler to implement, suitable for periodic analysis.
      • Cons: Higher latency, not suitable for real-time requirements.

Choice: Stream processing for real-time vehicle tracking and notifications.


Payment Processing: In-house vs. Third-party

  • In-house Payment Processing:
      • Pros: Full control over transactions, potential cost savings.
      • Cons: High complexity, security risks, regulatory compliance.
  • Third-party Payment Processing (e.g., Stripe, PayPal):
      • Pros: Simplified implementation, built-in security and compliance.
      • Cons: Transaction fees, dependency on external service.

Choice: Third-party payment processing for simplicity and security.


User Interface: Native vs. Web Apps

  • Native Apps:
      • Pros: Better performance, access to device features, offline capabilities.
      • Cons: Higher development and maintenance costs, platform-specific development.
  • Web Apps:
      • Pros: Cross-platform compatibility, easier to update and maintain.
      • Cons: Limited access to device features, may require internet connection.

Choice: Combination of both native and web apps to provide a seamless user experience across platforms.




Failure scenarios/bottlenecks

Real-time Data Processing Delays

  • Scenario: Delays in processing real-time data (e.g., vehicle location updates) due to high volume or processing bottlenecks.
  • Impact: Inaccurate or outdated information displayed to users, affecting route planning and vehicle tracking.
  • Mitigation:
      • Optimize data processing pipelines for low latency.
      • Use distributed stream processing frameworks (e.g., Apache Kafka, Apache Flink).
      • Implement horizontal scaling for data processing components.


Service Overload

  • Scenario: One or more services (e.g., Route Planning Service, Real-time Vehicle Tracking Service) are overwhelmed by a sudden surge in requests.
  • Impact: Increased response times or timeouts, leading to degraded user experience.
  • Mitigation:
      • Implement auto-scaling to dynamically adjust the number of service instances based on demand.
      • Use rate limiting and throttling to prevent abuse and ensure fair usage.
      • Monitor service health and implement circuit breakers to gracefully handle overloads.
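
Rate limiting is often implemented as a token bucket: each client gets a bucket that refills at a steady rate and allows short bursts up to its capacity. The sketch below is one possible version. The rate and capacity values are placeholders, and the clock is injectable so the behaviour can be tested without waiting.

```python
import time


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate_per_s`."""

    def __init__(self, rate_per_s, capacity, clock=time.monotonic):
        self._rate = rate_per_s
        self._capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self):
        now = self._clock()
        # Refill in proportion to the time elapsed, capped at capacity.
        elapsed = now - self._last
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False                     # caller should reject, e.g. with HTTP 429


bucket = TokenBucket(rate_per_s=5, capacity=10)
print(bucket.allow())
```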


Database Bottlenecks

  • Scenario: The database experiences high latency or downtime due to heavy traffic or hardware failure.
  • Impact: Slower response times for all services that rely on database queries, leading to degraded user experience.
  • Mitigation:
      • Use database replication and clustering to distribute load.
      • Implement read/write separation with read replicas.
      • Use caching layers (e.g., Redis) to reduce the load on the database.
      • Regularly monitor and optimize database performance.
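
Read/write separation can be sketched as a small router that sends SELECT statements to read replicas in round-robin order and everything else to the primary. The class and the connection names are hypothetical, and the string check is deliberately naive. Reads that must see a write made moments earlier should still go to the primary, because replicas can lag behind it.

```python
import itertools


class ConnectionRouter:
    """Choose a database target based on whether a statement is a read."""

    def __init__(self, primary, replicas):
        self._primary = primary
        # Cycle through replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self._primary


router = ConnectionRouter("primary", ["replica-1", "replica-2"])
print(router.target_for("SELECT * FROM schedule WHERE route_id = '1'"))
```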


API Gateway Failure

  • Scenario: The API Gateway becomes unresponsive or crashes.
  • Impact: All client applications are unable to communicate with backend services.
  • Mitigation:
      • Implement load balancing and failover strategies.
      • Use multiple instances of the API Gateway.
      • Monitor the health of the API Gateway and implement automatic restarts.
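
At its simplest, failover across several gateway instances means trying them in priority order and using the first one that passes a health check. In the sketch below, `is_healthy` stands in for a real health probe (for example, an HTTP request to a status endpoint), and the instance names are made up.

```python
def pick_gateway(instances, is_healthy):
    """Return the first healthy gateway instance, in priority order."""
    for instance in instances:
        if is_healthy(instance):
            return instance
    raise RuntimeError("no healthy API Gateway instance available")


# Example with a hard-coded health table in place of real probes.
health = {"gateway-a": False, "gateway-b": True}
print(pick_gateway(["gateway-a", "gateway-b"], lambda name: health[name]))
```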





Future improvements

Expanded Geographic Coverage

  • Description: Extend the transportation network to cover more urban and suburban areas.
  • Benefit: Provide better service coverage and accessibility to more passengers.
  • Implementation: Plan and execute expansion projects based on demand and urban development plans.


Enhanced Accessibility Features

  • Description: Improve accessibility for passengers with disabilities by adding more features like real-time audio announcements, personalized navigation assistance, and better-designed interfaces.
  • Benefit: Make public transportation more inclusive and user-friendly for all passengers.
  • Implementation: Collaborate with accessibility experts and implement features based on user feedback and standards.


Dynamic Pricing Models

  • Description: Implement dynamic pricing based on demand, time of day, and distance traveled.
  • Benefit: Balance passenger load, encourage off-peak travel, and increase revenue.
  • Implementation: Use algorithms to adjust prices in real-time and update fare collection systems accordingly.
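
A very simple form of dynamic pricing combines a base fare, a per-kilometre charge, and a time-of-day multiplier. Every number in the sketch below is a placeholder except the 2.50 base fare, which reuses the example amount from the Fare Collection API. A demand-based component would need live ridership data and is left out.

```python
from decimal import Decimal, ROUND_HALF_UP

BASE_FARE = Decimal("2.50")            # example amount from the Fare Collection API
PER_KM = Decimal("0.10")               # hypothetical distance charge
PEAK_MULTIPLIER = Decimal("1.25")      # hypothetical surcharge in peak hours
OFF_PEAK_MULTIPLIER = Decimal("0.90")  # hypothetical discount outside peak hours
PEAK_HOURS = {7, 8, 17, 18}            # hypothetical peak hours (24-hour clock)


def fare_for(distance_km, hour):
    """Fare for a journey of distance_km starting at the given hour."""
    multiplier = PEAK_MULTIPLIER if hour in PEAK_HOURS else OFF_PEAK_MULTIPLIER
    raw = (BASE_FARE + PER_KM * Decimal(str(distance_km))) * multiplier
    # Decimal avoids binary floating-point rounding errors in money amounts.
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(fare_for(10, 8), fare_for(10, 12))
```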


Enhanced Multi-modal Integration

  • Description: Further integrate with additional modes of transport such as bike-sharing, ride-hailing services, and electric scooters.
  • Benefit: Provide seamless end-to-end journeys for passengers, encouraging the use of public transport over private cars.
  • Implementation: Develop partnerships with various service providers and integrate their APIs into the system for unified planning and payment.