AWS DynamoDB Query based on non-primary keys
Understanding AWS DynamoDB Query Based on Non-Primary Keys
Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. Using DynamoDB effectively depends on knowing how to query data efficiently, especially when the query targets attributes that are not part of the table's primary key. This article explores how to query on such non-key attributes through secondary indexes, with technical insights and examples.
Primary Key Basics in DynamoDB
DynamoDB tables store items with a primary key as the main identifier. The primary key can be:
- Partition Key: A single attribute (e.g., a user ID) is the sole primary key. This is a simple primary key.
- Composite Primary Key: A combination of a partition key and a sort key (e.g., user ID and timestamp).
Queries based directly on these keys are straightforward. However, in real-world applications, there may be a need to query data based on non-primary fields, which requires secondary indexes.
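As a minimal sketch of a query on the table's own keys, the function below builds the parameters for a low-level `Query` call. The table name `UserEvents` and the attributes `UserID` and `CreatedAt` are hypothetical, chosen to mirror the user ID and timestamp example above, and both are assumed to be stored as strings.

```python
def build_primary_key_query(user_id, since=None):
    """Build parameters for a DynamoDB Query on the table's primary key."""
    params = {
        "TableName": "UserEvents",  # hypothetical table name
        # The partition key must be matched with an equality condition.
        "KeyConditionExpression": "UserID = :uid",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
    }
    if since is not None:
        # The sort key condition is optional and may use a range operator.
        params["KeyConditionExpression"] += " AND CreatedAt >= :since"
        params["ExpressionAttributeValues"][":since"] = {"S": since}
    return params


def run_query(params):
    """Send the query with boto3; needs AWS credentials and an existing table."""
    import boto3  # imported here so the builder above runs without boto3

    client = boto3.client("dynamodb")
    return client.query(**params)["Items"]
```

A key condition can only reference the partition key and, optionally, the sort key. Filtering on any other attribute needs either a filter expression applied after the read or a secondary index.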
Secondary Indexes
Secondary indexes allow you to query data on attributes other than the primary key. There are two types of secondary indexes in DynamoDB:
- Local Secondary Index (LSI):
  - Uses the same partition key as the base table but a different sort key.
  - Useful for querying items that share a partition key in a different sort order.
  - Limited to 5 per table.
  - Must be defined when the table is created; it cannot be added, modified, or removed afterward.
- Global Secondary Index (GSI):
  - Uses its own partition key and an optional sort key, either of which can differ from the table's.
  - Allows querying on any top-level attribute of type string, number, or binary.
  - More flexible than an LSI and can be created or deleted at any time.
  - Limited to 20 per table by default.
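To make the difference between the two index types concrete, here is a hedged sketch of `create_table` parameters that declare one LSI and one GSI. Everything named here is hypothetical: the `UserEvents` table, its `UserID`, `CreatedAt`, and `EventType` string attributes, and both index names. On-demand billing is assumed so that no capacity figures are needed.

```python
def build_create_table_params():
    """Build create_table parameters with one LSI and one GSI (illustrative)."""
    return {
        "TableName": "UserEvents",  # hypothetical table name
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
        # Only attributes used in some key schema are declared here.
        "AttributeDefinitions": [
            {"AttributeName": "UserID", "AttributeType": "S"},
            {"AttributeName": "CreatedAt", "AttributeType": "S"},
            {"AttributeName": "EventType", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "UserID", "KeyType": "HASH"},
            {"AttributeName": "CreatedAt", "KeyType": "RANGE"},
        ],
        # LSI: same partition key as the table, different sort key.
        "LocalSecondaryIndexes": [
            {
                "IndexName": "UserEventsByType",  # hypothetical index name
                "KeySchema": [
                    {"AttributeName": "UserID", "KeyType": "HASH"},
                    {"AttributeName": "EventType", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
        # GSI: a partition key that differs from the table's.
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "EventsByType",  # hypothetical index name
                "KeySchema": [
                    {"AttributeName": "EventType", "KeyType": "HASH"},
                    {"AttributeName": "CreatedAt", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
    }


def create_table(params):
    """Create the table with boto3; needs AWS credentials."""
    import boto3  # imported here so the builder above runs without boto3

    return boto3.client("dynamodb").create_table(**params)
```

Because the LSI appears in the `create_table` request itself, it has to be planned up front, whereas the GSI could equally be added to an existing table later.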
Querying with Non-Primary Keys
When you need to query data using non-primary keys, you will typically rely on GSIs due to their flexibility.
Example Scenario
Suppose you have a "CustomerOrders" table with the following attributes:
- CustomerID (Partition Key)
- OrderID (Sort Key)
- OrderDate
- TotalAmount
- ProductCategory
You want to query all orders within a specific ProductCategory.
Creating a Global Secondary Index
First, you'll need to create a GSI. Suppose you define a GSI with the following attributes:
- Partition Key: ProductCategory
- Sort Key: OrderDate
This index allows you to query data based on the product category while optionally sorting the results by the order date.
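One way to add such an index to an existing table is an `update_table` call, sketched below. The index name `ProductCategoryIndex` is hypothetical, both key attributes are assumed to be strings (with OrderDate stored as an ISO 8601 date), and all attributes are projected for simplicity.

```python
def build_add_gsi_params(read_units=None, write_units=None):
    """Build update_table parameters that add a GSI to CustomerOrders."""
    index = {
        "IndexName": "ProductCategoryIndex",  # hypothetical index name
        "KeySchema": [
            {"AttributeName": "ProductCategory", "KeyType": "HASH"},
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},
        ],
        # Assumption: project every attribute; narrow this to cut storage.
        "Projection": {"ProjectionType": "ALL"},
    }
    if read_units is not None and write_units is not None:
        # Needed when the table uses provisioned capacity mode.
        index["ProvisionedThroughput"] = {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        }
    return {
        "TableName": "CustomerOrders",
        # Key attributes of the new index must be declared with their types.
        "AttributeDefinitions": [
            {"AttributeName": "ProductCategory", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [{"Create": index}],
    }


def add_gsi(params):
    """Apply the update with boto3; needs AWS credentials."""
    import boto3  # imported here so the builder above runs without boto3

    return boto3.client("dynamodb").update_table(**params)
```

After the call returns, DynamoDB backfills the index from existing items, and the index cannot be queried until its status becomes active.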
Query Example in Python (Boto3)
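The following is a minimal sketch of such a query using the low-level Boto3 client, not a definitive implementation. It assumes the GSI is named `ProductCategoryIndex` (a hypothetical name), that ProductCategory and OrderDate are string attributes, and that OrderDate holds ISO 8601 dates so that string ordering matches date ordering.

```python
def build_category_query(category, start_date=None, end_date=None, newest_first=True):
    """Build Query parameters that target the GSI instead of the base table."""
    params = {
        "TableName": "CustomerOrders",
        "IndexName": "ProductCategoryIndex",  # hypothetical index name
        # Equality on the index partition key is mandatory.
        "KeyConditionExpression": "ProductCategory = :cat",
        "ExpressionAttributeValues": {":cat": {"S": category}},
        # False returns items in descending order of the index sort key.
        "ScanIndexForward": not newest_first,
    }
    if start_date is not None and end_date is not None:
        # Optional range condition on the index sort key.
        params["KeyConditionExpression"] += " AND OrderDate BETWEEN :start AND :end"
        params["ExpressionAttributeValues"][":start"] = {"S": start_date}
        params["ExpressionAttributeValues"][":end"] = {"S": end_date}
    return params


def query_orders_by_category(category, **options):
    """Run the query with boto3 and return the first page of items."""
    import boto3  # imported here so the builder above runs without boto3

    client = boto3.client("dynamodb")
    response = client.query(**build_category_query(category, **options))
    return response["Items"]
```

A call such as `query_orders_by_category("Electronics", start_date="2024-01-01", end_date="2024-12-31")` would return the matching orders, newest first. Only the first page of results is returned here; larger result sets need pagination.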
Considerations and Best Practices
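- Index Provisioning: GSIs consume additional resources. In provisioned capacity mode, each GSI has its own read/write capacity settings, separate from the table's, so provision enough for your index-based queries and for the writes that propagate to the index to avoid throttling.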
- Read Consistency: Queries on GSIs are always eventually consistent; strongly consistent reads are not supported on a GSI. If you need strong consistency, query the base table or an LSI.
- Query Efficiency: Design GSI keys so that each query reads only the items it needs. Filter expressions are applied after items are read, so they do not reduce the capacity a query consumes.
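- Cost Management: Each GSI stores its own copy of the projected attributes and adds write cost whenever an indexed item changes, so create only the indexes you actually query to avoid unexpectedly high costs.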
- Attribute Usage: Only project attributes in the index that are necessary for query purposes to reduce storage costs.
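A single `Query` call returns at most 1 MB of data and signals that more is available through `LastEvaluatedKey`. The sketch below shows one way to page through results in bounded chunks rather than pulling everything at once. The helper name `iter_query_items` and the `page_size` default are illustrative choices, and `client` is assumed to be a Boto3 DynamoDB client (or any object with a compatible `query` method).

```python
def iter_query_items(client, params, page_size=100):
    """Yield items from a Query page by page instead of loading them all."""
    request = dict(params, Limit=page_size)  # copy so the caller's dict is untouched
    while True:
        response = client.query(**request)
        yield from response.get("Items", [])
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            break  # no more pages
        # Resume the next request where the previous page stopped.
        request["ExclusiveStartKey"] = last_key
```

Because this is a generator, a caller can stop iterating early and no further pages are requested. Adding a `ProjectionExpression` to `params` further limits which attributes are returned.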
Key Takeaways
To summarize, leveraging non-primary keys through secondary indexes like GSIs provides flexibility and efficiency in querying data. Here's a summary table:
| Feature | Primary Key | Local Secondary Index (LSI) | Global Secondary Index (GSI) |
| --- | --- | --- | --- |
| Partition Key | Defined by table | Same as table | Can differ from the table |
| Sort Key | Optional | Different from table | Optional; can differ from the table |
| Use Cases | Direct key lookup | Query within same partition for diverse sort orders | Query based on non-primary attributes |
| Capacity | Table's own capacity | Shares the table's capacity | Has its own capacity, provisioned separately in provisioned mode |
| Consistency | Strong or eventual | Strong or eventual | Eventual only |
| Flexibility | Fixed at table creation | Created with table, cannot be added or changed later | Can be created or deleted at any time |
| Limit | N/A | Up to 5 per table | Up to 20 per table (default quota) |
With these insights, you can optimize your DynamoDB usage for complex query scenarios while adhering to best practices. Remember that thoughtful index design and capacity planning are essential to achieving optimal database performance and cost management.

