Differences between GSI and table
Introduction
Databases are an integral part of modern applications, providing organized ways to store, manage, and retrieve data. In NoSQL databases like Amazon DynamoDB, two concepts come up constantly: "tables" and "Global Secondary Indexes (GSIs)." Understanding how they differ, what each one does, and when to use each can significantly improve data management and retrieval.
Key Concepts: Tables vs. GSI
Before delving into the differences, let's clarify the primary functions and definitions of a Table and a GSI.
Tables
A table in a database is a collection of related data entries. In relational databases, a table is defined by columns representing attributes of the data and rows representing individual records. In NoSQL databases like DynamoDB, tables are schema-less apart from the primary key: each item can have different attributes, but every item must include the table's primary key attributes (a partition key and, if the table defines one, a sort key).
Global Secondary Index (GSI)
A Global Secondary Index in DynamoDB supports alternate query patterns. A GSI lets you query efficiently on attributes other than the table's primary key. In effect, it overlays an additional, differently keyed view on the same dataset, which can be crucial for query flexibility and application performance.
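The idea of a "differently keyed view" can be illustrated with a toy in-memory model. This is only a sketch of the concept, not how DynamoDB is implemented; the attribute names (`UserId`, `Email`, `Name`) and the `build_index` helper are assumptions made for illustration.

```python
from collections import defaultdict

# Toy stand-in for a table: items keyed by the table's partition key.
# Attribute names are illustrative assumptions.
table = {
    "u1": {"UserId": "u1", "Email": "ann@example.com", "Name": "Ann"},
    "u2": {"UserId": "u2", "Email": "bob@example.com", "Name": "Bob"},
    "u3": {"UserId": "u3", "Name": "Cy"},  # this item has no Email attribute
}


def build_index(items, index_key):
    """Return a view of the same items keyed by a different attribute.

    Items that lack the index key are skipped, mirroring how a GSI only
    contains items that carry the index key attribute. Values are lists
    because index keys, unlike a table's primary key, need not be unique.
    """
    index = defaultdict(list)
    for item in items.values():
        if index_key in item:
            index[item[index_key]].append(item)
    return dict(index)


email_index = build_index(table, "Email")
print(email_index["bob@example.com"][0]["UserId"])  # lookup by Email, not UserId
print(len(email_index))  # counts only the items that have an Email
```

The table remains the source of truth; the index is derived from it and can be rebuilt at any time.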
Technical Differences
Below are the technical distinctions between tables and GSIs when using a system like DynamoDB:
| Aspect | Table | Global Secondary Index (GSI) |
| --- | --- | --- |
| Definition | Main storage structure holding the items themselves | Secondary structure for querying the same data by attributes other than the table's primary key |
| Primary Key | Partition key, optionally combined with a sort key | Own partition key and optional sort key, defined separately; can differ from the table's primary key |
| Schema Flexibility | Schema-less apart from the key attributes; items can differ in other attributes | Inherits that flexibility, but only items that contain the index key attributes appear in the index |
| Usage | CRUD operations on the main data entries | Read-only queries and scans using alternate keys; all writes go through the table |
| Performance | Dependent on partition key distribution | Dependent on index key distribution; in provisioned mode, an under-provisioned GSI can throttle writes to the table |
| Capacity & Billing | Charged for read/write capacity and storage | Additional charges for reads from the index, writes propagated to it, and its storage |
| Data Consistency | Supports both strongly consistent and eventually consistent reads | Eventually consistent reads only, because index updates are propagated asynchronously |
| Operational Complexity | Simple; manages the primary data | Adds complexity; query patterns need careful planning |
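The Data Consistency row has a practical consequence: a query against the table may ask for a strongly consistent read, while a query against a GSI may not. The sketch below builds dictionaries shaped like DynamoDB Query requests and rejects the unsupported combination up front. `build_query` is a hypothetical helper, not part of any AWS SDK, and the table, attribute, and index names are placeholders.

```python
def build_query(table_name, key_condition, names, values,
                index_name=None, consistent_read=False):
    """Build keyword arguments shaped like a DynamoDB Query request.

    Hypothetical helper for illustration; it does not call AWS.
    """
    if index_name is not None and consistent_read:
        # A GSI serves eventually consistent reads only.
        raise ValueError("GSI queries support eventually consistent reads only")
    params = {
        "TableName": table_name,
        "KeyConditionExpression": key_condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
        "ConsistentRead": consistent_read,
    }
    if index_name is not None:
        params["IndexName"] = index_name
    return params


# Query on the table's own key: strong consistency may be requested.
table_query = build_query(
    "Users", "#id = :id", {"#id": "UserId"}, {":id": {"S": "u1"}},
    consistent_read=True,
)

# Query on a GSI ("EmailIndex" is a placeholder name): eventual consistency only.
index_query = build_query(
    "Users", "#e = :e", {"#e": "Email"}, {":e": {"S": "ann@example.com"}},
    index_name="EmailIndex",
)
```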
Exploring Use Cases
When to Use a Table
Tables are used to manage complete datasets and are best when:
- The access pattern primarily revolves around the data's primary key.
- The application requires a primary storage structure for executing CRUD operations.
- Flexibility is needed in defining item attributes without pre-configuring column constraints.
When to Use GSIs
GSIs are beneficial when:
- There is a need to query the data by attributes other than the table's primary key.
- The application design anticipates multiple retrieval requirements not supported by the primary index.
- The goal is to serve additional access patterns efficiently without restructuring the underlying data.
Example Scenario
Assume we have a "Users" table whose items are keyed by "UserId."
We might want to query users by their "Email" attribute instead of the "UserId." A GSI that uses "Email" as its partition key supports this.
With that GSI in place, you can efficiently query users by email.
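The scenario can be sketched as request parameters in the shape that the boto3 DynamoDB client's `create_table` and `query` calls accept. The original listing is not available here, so treat this as one plausible reconstruction: the index name `EmailIndex`, the string key types, the `ALL` projection, and on-demand billing are all assumptions.

```python
# Assumed names and settings; "EmailIndex" is a placeholder, not the
# article's original index name.
TABLE_NAME = "Users"
INDEX_NAME = "EmailIndex"

create_table_params = {
    "TableName": TABLE_NAME,
    # Only attributes used in a key schema are declared up front.
    "AttributeDefinitions": [
        {"AttributeName": "UserId", "AttributeType": "S"},
        {"AttributeName": "Email", "AttributeType": "S"},
    ],
    # Table primary key: UserId as the partition (HASH) key.
    "KeySchema": [{"AttributeName": "UserId", "KeyType": "HASH"}],
    "GlobalSecondaryIndexes": [
        {
            "IndexName": INDEX_NAME,
            # Index key: Email as the partition (HASH) key.
            "KeySchema": [{"AttributeName": "Email", "KeyType": "HASH"}],
            # Copy every attribute into the index (simplest, largest).
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    # On-demand billing, so no ProvisionedThroughput blocks are required.
    "BillingMode": "PAY_PER_REQUEST",
}


def query_by_email_params(email):
    """Build Query parameters that read from the GSI instead of the table."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": INDEX_NAME,
        "KeyConditionExpression": "#e = :email",
        "ExpressionAttributeNames": {"#e": "Email"},
        "ExpressionAttributeValues": {":email": {"S": email}},
    }


# With boto3 installed and AWS credentials configured, these could be used as:
#   client = boto3.client("dynamodb")
#   client.create_table(**create_table_params)
#   client.query(**query_by_email_params("ann@example.com"))
```

Because a GSI key does not have to be unique, a query by email can return more than one item; the application has to enforce uniqueness itself if it needs it.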
Maintaining and Managing GSIs
While providing enhanced querying capabilities, managing GSIs involves thoughtful planning:
- Provisioned Throughput: In provisioned mode, each GSI has its own read and write capacity settings, separate from the table's. If a GSI's write capacity is too low, writes to the table can be throttled.
- Data Consistency: GSIs are eventually consistent, so a change written to the table may take a short time to appear in the index.
- Cost Implications: GSIs add costs for the index's storage, for reads from the index, and for the writes propagated to it, so each index should justify its budget.
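The consistency and cost points above can be seen together in a small simulation. This is a toy model under strong simplifications: in DynamoDB the propagation happens automatically and normally quickly, whereas here it is an explicit `propagate()` call so the stale window is visible. The class and its counters are hypothetical, and the model ignores updates that change or remove an indexed value.

```python
from collections import deque


class ToyTableWithIndex:
    """Toy model: table writes apply at once, index updates apply later."""

    def __init__(self, index_key):
        self.index_key = index_key
        self.items = {}         # base table, keyed by primary key
        self.index = {}         # index view: index key value -> primary key
        self.pending = deque()  # index updates not yet applied
        self.table_writes = 0
        self.index_writes = 0

    def put(self, key, item):
        """Write to the table and queue the matching index update."""
        self.items[key] = item
        self.table_writes += 1
        if self.index_key in item:
            self.pending.append((key, item))

    def propagate(self):
        """Apply queued updates; each one is an extra write on the index."""
        while self.pending:
            key, item = self.pending.popleft()
            self.index[item[self.index_key]] = key
            self.index_writes += 1

    def query_index(self, value):
        return self.index.get(value)


users = ToyTableWithIndex(index_key="Email")
users.put("u1", {"UserId": "u1", "Email": "ann@example.com"})
stale = users.query_index("ann@example.com")  # index has not caught up yet
users.propagate()
fresh = users.query_index("ann@example.com")  # index now reflects the write
print(stale, fresh, users.table_writes, users.index_writes)
```

One logical write produced two physical writes, one on the table and one on the index, which is the reason each additional GSI raises write cost.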
Conclusion
Understanding when to use tables versus GSIs can significantly influence application design and performance. Tables are the foundation for storing data, while GSIs add flexibility and efficiency to data retrieval. Used thoughtfully, with their cost and complexity in mind, GSIs let database architects and developers support more access patterns without reshaping the underlying data.

