AWS DynamoDB Query based on non-primary keys

Understanding AWS DynamoDB Query Based on Non-Primary Keys

Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. Using it effectively means knowing how to query data efficiently, especially when the query is based on attributes that are not part of the table's primary key. This article explains how to run such queries with secondary indexes and provides technical insights and examples.

Primary Key Basics in DynamoDB

DynamoDB tables store items with a primary key as the main identifier. The primary key can be:

  • Partition Key: A single attribute (e.g., a user ID) is the sole primary key. This is a simple primary key.
  • Composite Primary Key: A combination of a partition key and a sort key (e.g., user ID and timestamp).

Queries based directly on these keys are straightforward. However, a Query operation always needs a partition key value, so real-world applications that must look up data by other attributes need either a secondary index or a full-table Scan with a filter, which reads every item.
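As a rough analogy only (this is not how DynamoDB is implemented), the difference between a key lookup, a scan, and an index can be sketched with plain Python dictionaries. The attribute names `CreatedAt` and `EventType` and the sample items are hypothetical.

```python
from collections import defaultdict

# Toy data: (UserID, CreatedAt) plays the role of a composite primary key.
# "EventType" is a hypothetical non-key attribute.
ITEMS = {
    ("user-1", 1700000000): {"EventType": "login"},
    ("user-1", 1700000500): {"EventType": "purchase"},
    ("user-2", 1700000100): {"EventType": "login"},
}

def get_item(user_id, created_at):
    """Key lookup: goes straight to the item, regardless of table size."""
    return ITEMS.get((user_id, created_at))

def scan_by_event_type(event_type):
    """No index: every item has to be inspected to find the matches."""
    return sorted(k for k, item in ITEMS.items() if item["EventType"] == event_type)

def build_event_type_index():
    """A secondary index is conceptually a second mapping keyed by another attribute."""
    index = defaultdict(list)
    for key, item in ITEMS.items():
        index[item["EventType"]].append(key)
    return {event_type: sorted(keys) for event_type, keys in index.items()}
```

Looking up `build_event_type_index()["login"]` returns the same keys as the scan, but without touching unrelated items, which is the trade a secondary index makes: extra storage and write work in exchange for cheaper reads.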

Secondary Indexes

Secondary indexes allow you to query data on attributes other than the primary key. There are two types of secondary indexes in DynamoDB:

  1. Local Secondary Index (LSI):
    • LSI has the same partition key as the original table, but a different sort key.
    • Useful for querying data with the same partition key in different sort orders.
    • Limited to 5 per table.
    • Created at the table's creation and cannot be modified or added later.
  2. Global Secondary Index (GSI):
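    • Has its own partition key and an optional sort key, both of which can differ from the table's.
    • Allows querying on any top-level string, number, or binary attribute chosen as the index key.
    • More flexible than LSI and can be created or deleted at any time.
    • Limited to 20 GSIs per table by default (a quota that can be raised on request).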
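Because an LSI has to be declared when the table is created, its definition travels with the `create_table` request. The sketch below only builds the request parameters for the low-level boto3 client; the table name `UserEvents`, the attribute names, the index name, and the choice of on-demand billing are all assumptions made for illustration.

```python
def build_create_table_params(table_name="UserEvents"):
    """Build create_table parameters for a hypothetical table with one LSI."""
    return {
        "TableName": table_name,
        # Every attribute used in a key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": "UserID", "AttributeType": "S"},
            {"AttributeName": "CreatedAt", "AttributeType": "N"},
            {"AttributeName": "EventType", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "UserID", "KeyType": "HASH"},      # partition key
            {"AttributeName": "CreatedAt", "KeyType": "RANGE"},  # sort key
        ],
        "LocalSecondaryIndexes": [
            {
                "IndexName": "UserID-EventType-Index",  # hypothetical name
                "KeySchema": [
                    # Same partition key as the table, different sort key.
                    {"AttributeName": "UserID", "KeyType": "HASH"},
                    {"AttributeName": "EventType", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }

# Usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("dynamodb").create_table(**build_create_table_params())
```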

Querying with Non-Primary Keys

When you need to query data using non-primary keys, you will typically rely on GSIs due to their flexibility.

Example Scenario

Suppose you have a "CustomerOrders" table with the following attributes:

  • CustomerID (Partition Key)
  • OrderID (Sort Key)
  • OrderDate
  • TotalAmount
  • ProductCategory

You want to query all orders within a specific ProductCategory.

Creating a Global Secondary Index

First, you'll need to create a GSI. Suppose you define a GSI with the following attributes:

  • Partition Key: ProductCategory
  • Sort Key: OrderDate

This index allows you to query data based on the product category while optionally sorting the results by the order date.
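One way to add such an index to an existing table is the low-level client's `update_table` call. The following is a minimal sketch that only assembles the request parameters; it assumes `OrderDate` is stored as an ISO 8601 string, projects all attributes, and treats the `provisioned` argument as a hypothetical convenience for tables in provisioned capacity mode.

```python
def build_gsi_update_params(table_name="CustomerOrders",
                            index_name="ProductCategory-OrderDate-Index",
                            provisioned=None):
    """Build update_table parameters that create a GSI.

    provisioned: optional (read_units, write_units) tuple. Leave as None
    for an on-demand table, where no throughput is specified.
    """
    create = {
        "IndexName": index_name,
        "KeySchema": [
            {"AttributeName": "ProductCategory", "KeyType": "HASH"},  # partition key
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},       # sort key
        ],
        "Projection": {"ProjectionType": "ALL"},
    }
    if provisioned is not None:
        read_units, write_units = provisioned
        create["ProvisionedThroughput"] = {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        }
    return {
        "TableName": table_name,
        # Assumption: both index key attributes are strings.
        "AttributeDefinitions": [
            {"AttributeName": "ProductCategory", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [{"Create": create}],
    }

# Usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("dynamodb").update_table(**build_gsi_update_params())
```

The index is backfilled in the background after the call returns, so it cannot be queried until its status becomes ACTIVE.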

Query Example in Python (Boto3)

```python
import boto3
from boto3.dynamodb.conditions import Key

# Create a DynamoDB resource using the default AWS credentials and region
dynamodb = boto3.resource('dynamodb')

# Select your DynamoDB table
table = dynamodb.Table('CustomerOrders')

# Query the Global Secondary Index by its partition key
response = table.query(
    IndexName='ProductCategory-OrderDate-Index',
    KeyConditionExpression=Key('ProductCategory').eq('Electronics')
)

# A single response holds at most one page (up to 1 MB) of results
for item in response['Items']:
    print(item)
```
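Because the index uses OrderDate as its sort key, the same query can be narrowed to a date range and returned newest first. The sketch below builds parameters for the low-level client's `query` call, which takes typed attribute values; it assumes OrderDate is stored as an ISO 8601 string, and the function name and arguments are hypothetical.

```python
def build_category_date_query(category, start_date, end_date, newest_first=True):
    """Build low-level query parameters for a category within a date range."""
    return {
        "TableName": "CustomerOrders",
        "IndexName": "ProductCategory-OrderDate-Index",
        # Equality on the index partition key, range condition on its sort key.
        "KeyConditionExpression": (
            "ProductCategory = :category AND OrderDate BETWEEN :start AND :end"
        ),
        "ExpressionAttributeValues": {
            ":category": {"S": category},
            ":start": {"S": start_date},
            ":end": {"S": end_date},
        },
        # False returns results in descending sort-key order.
        "ScanIndexForward": not newest_first,
    }

# Usage (requires boto3 and AWS credentials):
#   import boto3
#   params = build_category_date_query("Electronics", "2024-01-01", "2024-01-31")
#   response = boto3.client("dynamodb").query(**params)
```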

Considerations and Best Practices

  • Index Provisioning: In provisioned capacity mode, each GSI has its own read/write capacity, separate from the table's. Provision enough to avoid throttling; an under-provisioned GSI can also throttle writes to the base table.
  • Read Consistency: Queries on GSIs are always eventually consistent; strongly consistent reads are not supported on a GSI. LSIs and the base table support both modes.
  • Query Efficiency: Choose index keys so that a query reads only the items it needs. Avoid fetching large result sets when a narrower key condition or a limit would do.
  • Cost Management: Each GSI keeps its own copy of the projected data and consumes write capacity whenever indexed items change, so remove indexes you no longer use to avoid unexpectedly high costs.
  • Attribute Usage: Only project attributes in the index that are necessary for query purposes to reduce storage costs.
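A Query returns results one page at a time and includes a `LastEvaluatedKey` when more pages remain. The helper below is a minimal sketch of following those pages; `query_all_pages` is a hypothetical name, and `table` is assumed to be any object with a boto3-style `query(**kwargs)` method, such as a `Table` resource.

```python
def query_all_pages(table, **query_kwargs):
    """Yield items from every page of a Query, one page at a time."""
    kwargs = dict(query_kwargs)  # copy so the caller's arguments are not mutated
    while True:
        response = table.query(**kwargs)
        yield from response.get("Items", [])
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            break  # no more pages
        # Resume the next request where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key

# Usage (requires boto3 and AWS credentials):
#   import boto3
#   from boto3.dynamodb.conditions import Key
#   table = boto3.resource("dynamodb").Table("CustomerOrders")
#   for item in query_all_pages(
#       table,
#       IndexName="ProductCategory-OrderDate-Index",
#       KeyConditionExpression=Key("ProductCategory").eq("Electronics"),
#       ProjectionExpression="OrderID, TotalAmount",
#   ):
#       print(item)
```

Because the helper is a generator, a caller can stop iterating early and no further pages are requested. The `ProjectionExpression` in the usage comment limits the attributes returned to the client; on a GSI it can only return attributes that are projected into the index.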

Key Takeaways

To summarize, leveraging non-primary keys through secondary indexes like GSIs provides flexibility and efficiency in querying data. Here's a summary table:

| Feature | Primary Key | Local Secondary Index (LSI) | Global Secondary Index (GSI) |
|---|---|---|---|
| Partition Key | Defined by table | Same as table | Can differ from the table |
| Sort Key | Optional | Required; different from table | Optional; can differ from the table |
| Use Cases | Direct key lookup | Query within same partition for diverse sort orders | Query based on non-primary attributes |
| Capacity | Table's own capacity | Shares the table's capacity | Own capacity, separate from the table |
| Consistency | Strong or eventual | Strong or eventual | Eventually consistent only |
| Flexibility | Fixed at table creation | Created with table, cannot be added or removed later | Can be created or deleted at any time |
| Limit | N/A | Up to 5 per table | Up to 20 per table (default quota) |

With these insights, you can optimize your DynamoDB usage for complex query scenarios while adhering to best practices. Remember that thoughtful index design and capacity planning are essential to achieving optimal database performance and cost management.

