What's the difference between BatchGetItem and Query in DynamoDB?

DynamoDB, developed by Amazon, is a fully managed, serverless, key-value NoSQL database designed to handle workloads of any size. Its API provides several operations for reading data. Among them, BatchGetItem and Query are two commonly used ways to retrieve multiple items, each with a distinct purpose. This article explains the differences between the two operations, with technical explanations and examples.

Key Differences between BatchGetItem and Query

1. Purpose and Use Case

  • BatchGetItem: Retrieves multiple items from one or more tables in a single request. It fits scenarios where you already know the exact primary key of every item you need (partition key, plus sort key if the table has one). The requested items do not need to share a partition key.
  • Query: Retrieves items that share the same partition key value, from a single table or one of its indexes. A sort key condition can narrow the range of items read, and a filter expression can further trim the results that are returned. This operation is ideal when the items you need are logically related and stored under one partition key.

2. API and Parameters

  • BatchGetItem:
    • Accepts multiple sets of primary keys to retrieve items from one or more tables.
    • The RequestItems parameter maps each table name to the set of keys to retrieve from that table.
    • Example request in the low-level JSON format:
json
    {
      "RequestItems": {
        "Table1": {
          "Keys": [
            {"PrimaryKey": {"S": "Value1"}},
            {"PrimaryKey": {"S": "Value2"}}
          ]
        },
        "Table2": {
          "Keys": [
            {"Id": {"N": "101"}},
            {"Id": {"N": "102"}}
          ]
        }
      }
    }
  • Query:
    • Designed to work on a single table at a time.
    • Requires an equality condition on the partition key; optionally accepts a sort key condition, a projection expression, and a filter expression.
    • Example request in the low-level JSON format:
json
    {
      "TableName": "Table1",
      "KeyConditionExpression": "PartitionKey = :partitionValue",
      "ExpressionAttributeValues": {
        ":partitionValue": {"S": "desiredValue"}
      }
    }
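
As a rough sketch, the two request shapes above can be assembled in Python as plain dictionaries. The helper names (`to_attr`, `batch_get_request`, `query_request`) are illustrative and not part of any SDK. The sketch only handles string and number attributes, and it uses an `ExpressionAttributeNames` placeholder (`#pk`) so that any key name can be passed in.

```python
from typing import Any, Dict, List, Union

Scalar = Union[str, int, float]


def to_attr(value: Scalar) -> Dict[str, str]:
    """Wrap a Python scalar in DynamoDB's typed attribute-value format."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings on the wire
    raise TypeError(f"unsupported type: {type(value).__name__}")


def batch_get_request(keys_by_table: Dict[str, List[Dict[str, Scalar]]]) -> Dict[str, Any]:
    """Build a BatchGetItem request: table name -> list of full primary keys."""
    return {
        "RequestItems": {
            table: {"Keys": [{k: to_attr(v) for k, v in key.items()} for key in keys]}
            for table, keys in keys_by_table.items()
        }
    }


def query_request(table: str, pk_name: str, pk_value: Scalar) -> Dict[str, Any]:
    """Build a Query request for every item under one partition key value."""
    return {
        "TableName": table,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": pk_name},
        "ExpressionAttributeValues": {":pk": to_attr(pk_value)},
    }
```

With boto3, dictionaries like these can be passed as keyword arguments, for example `client.batch_get_item(**request)` or `client.query(**request)`.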

3. Limitations and Constraints

  • BatchGetItem:
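    • Cannot read from secondary indexes (GSI or LSI); it only accepts primary keys of the base table.
    • The response size limit is 16 MB.
    • Maximum of 100 items requested in a single batch.
    • May return partial results: if the size limit or the table's provisioned throughput is exceeded, the keys that were not read come back in UnprocessedKeys and must be requested again. The request fails outright only when none of the items can be processed.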
  • Query:
    • Can target a specific partition or use an index (like a GSI or Local Secondary Index).
    • Each call returns at most 1 MB of data (measured before any filter expression is applied); further results are fetched through pagination.
    • Can only query one table per request.
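
The BatchGetItem limits above usually mean two things in client code: splitting a long key list into batches of at most 100, and re-requesting any keys the service hands back in `UnprocessedKeys`. The following is a minimal sketch of that pattern. It assumes `client` behaves like a boto3 DynamoDB client (only `batch_get_item` is called); the function names, retry count, and backoff delays are illustrative choices, not recommended values.

```python
import time
from typing import Any, Dict, Iterator, List

MAX_BATCH_KEYS = 100  # BatchGetItem accepts at most 100 keys per request


def chunked(keys: List[Dict[str, Any]], size: int = MAX_BATCH_KEYS) -> Iterator[List[Dict[str, Any]]]:
    """Split a key list into slices no longer than `size`."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def batch_get_all(client: Any, table: str, keys: List[Dict[str, Any]],
                  max_retries: int = 5, base_delay: float = 0.05) -> List[Dict[str, Any]]:
    """Fetch every key, re-requesting whatever comes back in UnprocessedKeys."""
    items: List[Dict[str, Any]] = []
    for chunk in chunked(keys):
        pending: Dict[str, Any] = {table: {"Keys": chunk}}
        attempt = 0
        while pending:
            response = client.batch_get_item(RequestItems=pending)
            items.extend(response.get("Responses", {}).get(table, []))
            # UnprocessedKeys has the same shape as RequestItems, so it can be resent as-is.
            pending = response.get("UnprocessedKeys") or {}
            if pending:
                attempt += 1
                if attempt > max_retries:
                    raise RuntimeError("keys still unprocessed after retries")
                time.sleep(base_delay * 2 ** attempt)  # exponential backoff
    return items
```

Note that BatchGetItem does not return items in the order the keys were requested, so callers that need an ordering have to match items back to keys themselves.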

4. Throughput Consumption

  • BatchGetItem:
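    • Consumes read capacity per item: each item's size is rounded up to the next 4 KB, as if it were read with its own GetItem call. Eventually consistent reads (the default) cost half as much as strongly consistent reads.
    • Saves network round trips compared with separate GetItem calls when reading multiple non-contiguous items, but not read capacity.
  • Query:
    • Consumes read capacity based on the total size of the items read, summed and then rounded up to the next 4 KB. Filter expressions and projection expressions are applied after the read, so they shrink the response but not the capacity consumed.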
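
To make the capacity difference concrete, here is a small estimator. It is a simplified sketch, assuming one read capacity unit per 4 KB for a strongly consistent read and half that for an eventually consistent read, with BatchGetItem rounding each item up individually and Query rounding the summed size once. The function names are illustrative, and edge cases such as empty results or transactional reads are ignored.

```python
import math
from typing import Iterable

RCU_BLOCK_BYTES = 4 * 1024  # one read unit covers up to 4 KB


def batch_get_rcus(item_sizes: Iterable[int], strongly_consistent: bool = False) -> float:
    """Each item is rounded up to 4 KB on its own, like an individual GetItem."""
    units = sum(math.ceil(size / RCU_BLOCK_BYTES) for size in item_sizes)
    return units if strongly_consistent else units / 2


def query_rcus(item_sizes: Iterable[int], strongly_consistent: bool = False) -> float:
    """Sizes of all items read are summed first, then rounded up to 4 KB."""
    units = math.ceil(sum(item_sizes) / RCU_BLOCK_BYTES)
    return units if strongly_consistent else units / 2


# Ten items of 500 bytes each, read with strong consistency:
sizes = [500] * 10
print(batch_get_rcus(sizes, strongly_consistent=True))  # 10 (ten separate 4 KB blocks)
print(query_rcus(sizes, strongly_consistent=True))      # 2  (5000 bytes -> two 4 KB blocks)
```

Under these assumptions, many small items that live under one partition key are cheaper to read with Query than with BatchGetItem, while the two converge as items approach or exceed 4 KB.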

Practical Example

Let's consider a scenario where we have a table Customers:

  • BatchGetItem use case: Suppose you want to fetch the details of customers with the following IDs: 101, 102, and 103. Using BatchGetItem is efficient here as you already know the exact IDs you wish to retrieve.
  • Query use case: If you want to fetch all orders placed by a customer on a particular date for analytical purposes, Query is the better fit, provided the orders are stored with customer ID as the partition key and order date as the sort key. A single request with a sort key condition then returns just that customer's orders for the date.
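
The two use cases might translate into requests like the ones below. This sketch makes several assumptions that the scenario does not spell out: customer profiles live in `Customers`, keyed only by a numeric `CustomerId`; orders live in a separate, hypothetical `Orders` table keyed by `CustomerId` (partition key) and a string `OrderDate` (sort key) stored as an ISO 8601 timestamp, so that `begins_with` on the date prefix selects a single day.

```python
from typing import Any, Dict, List


def customers_batch_request(customer_ids: List[int]) -> Dict[str, Any]:
    """BatchGetItem request for known customer IDs (hypothetical key name CustomerId)."""
    return {
        "RequestItems": {
            "Customers": {
                "Keys": [{"CustomerId": {"N": str(cid)}} for cid in customer_ids]
            }
        }
    }


def orders_on_date_request(customer_id: int, order_date: str) -> Dict[str, Any]:
    """Query request for one customer's orders whose sort key starts with the date."""
    return {
        "TableName": "Orders",  # hypothetical table name
        "KeyConditionExpression": "CustomerId = :cid AND begins_with(OrderDate, :day)",
        "ExpressionAttributeValues": {
            ":cid": {"N": str(customer_id)},
            ":day": {"S": order_date},  # e.g. "2024-05-01"
        },
    }


batch_request = customers_batch_request([101, 102, 103])
query_request = orders_on_date_request(101, "2024-05-01")
```

If the sort key held only the date with no time component, a plain equality condition (`OrderDate = :day`) would do the same job.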

Summary Table

| Feature/Aspect | BatchGetItem | Query |
|---|---|---|
| Primary Use | Retrieve multiple items across partitions | Retrieve items sharing a partition key |
| Targets | One or more tables | Single table (or one of its indexes) |
| Required Information | Full primary keys of items | Partition key (and optionally, a sort key condition) |
| Index Support | No | Yes (GSI and LSI) |
| Response Size Limit | 16 MB, max 100 items per request | 1 MB per call, then pagination |
| Read Efficiency | Best for known keys | Best for related data under one partition key |
| Use Case Scenario | Fetch non-contiguous keys | Fetch related items with a common partition key |
| Limitations | Cannot use indexes; may return partial results (UnprocessedKeys) when throttled | Cannot span multiple tables |

Additional Considerations

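  • Pagination: When a Query result set exceeds the 1 MB limit, the response includes a LastEvaluatedKey. Passing that value as ExclusiveStartKey in the next request continues from where the previous page stopped; when LastEvaluatedKey is absent, there are no more results. BatchGetItem has no equivalent cursor and instead reports leftover keys in UnprocessedKeys.
  • Cost Implications: Both operations consume read capacity, whether the table uses provisioned or on-demand mode. Choosing the appropriate read consistency (eventual vs. strong) and understanding your data access patterns is crucial for cost management.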
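
The pagination loop for Query can be sketched as a small generator. It assumes `client` behaves like a boto3 DynamoDB client (only `query` is called) and that the keyword arguments are a valid Query request; the function name `query_all_pages` is illustrative.

```python
from typing import Any, Dict, Iterator


def query_all_pages(client: Any, **query_kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every item for a Query, following LastEvaluatedKey across pages."""
    kwargs = dict(query_kwargs)  # copy so the caller's arguments are not mutated
    while True:
        page = client.query(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return  # no cursor means this was the final page
        kwargs["ExclusiveStartKey"] = last_key  # resume after the last item read
```

boto3 also ships a built-in paginator (`client.get_paginator("query")`) that wraps the same cursor handling, which may be preferable to a hand-written loop in production code.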

In conclusion, choosing between BatchGetItem and Query heavily depends on the nature of your application's data access patterns. Understanding their capabilities and limitations ensures optimal performance and efficient resource utilization in DynamoDB.

