What is the difference between Scan and Query in DynamoDB? When should you use each?
Scan vs Query in DynamoDB: Understanding the Differences and When to Use Each
Amazon DynamoDB is a fully-managed NoSQL database service designed for fast, predictable performance and seamless scalability. When working with DynamoDB, developers need to retrieve data efficiently. Two core operations available for data retrieval in DynamoDB are Scan and Query. This article provides a detailed exploration of the differences between these operations and guidance on when to use each.
Scan Operation
The Scan operation reads every item in the specified DynamoDB table. By default it returns all attributes of every item; an optional filter expression narrows which items are returned. Here are some technical aspects of how Scan behaves:
- Full Table Read: Unlike a Query, Scan reads every item in the table, which can make it slower and more resource-intensive. Thus, it's best suited for applications where performance is secondary to obtaining every item that matches the criteria.
- Filters: You can supply a filter expression to limit the data a Scan returns. The filter is applied after the items are read, so it reduces the number of items returned but not the number of items read or the read capacity consumed.
- Use Cases: Best used for analytics, reporting, or exporting entire datasets where a complete read-through is necessary. For example, obtaining a full export of data for an external backup or detailed analysis.
Example:
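A minimal sketch of a filtered Scan using the low-level DynamoDB client from boto3. The function name `scan_orders_over`, the table name `Orders`, and the attribute `order_total` are hypothetical and chosen only for illustration. The client is passed in as a parameter so the function itself has no dependency on AWS credentials.

```python
from typing import Any, Dict


def scan_orders_over(client: Any, table_name: str, min_total: int) -> Dict[str, Any]:
    """Run a single Scan call that keeps only items whose order_total exceeds min_total.

    `client` is expected to behave like boto3.client("dynamodb").
    `order_total` is a hypothetical numeric attribute.
    """
    response = client.scan(
        TableName=table_name,
        # The filter runs after the items are read: it trims the response,
        # not the amount of data DynamoDB has to read.
        FilterExpression="order_total > :min_total",
        ExpressionAttributeValues={":min_total": {"N": str(min_total)}},
    )
    return {
        "items": response["Items"],
        "returned": response["Count"],         # items that passed the filter
        "examined": response["ScannedCount"],  # items read before filtering
    }
```

With boto3 installed and credentials configured, this could be called as `scan_orders_over(boto3.client("dynamodb"), "Orders", 100)`. Comparing `examined` with `returned` shows how much of the read work was discarded by the filter. Note that this sketch issues one call only, so it covers at most the first page of results.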
Query Operation
The Query operation uses the primary key to select and retrieve data from a DynamoDB table efficiently. Here are key technical points and usage scenarios for Query:
- Key-Based Retrieval: Query fetches items by partition key (and, optionally, a condition on the sort key). DynamoDB hashes the partition key value to locate the partition that stores those items, so it reads only the items that share that key instead of the whole table. This makes it faster and more efficient than Scan.
- Indexed Searches: Query can also run against a secondary index, which lets you look up items by attributes other than the table's primary key. An equality condition on the partition key is mandatory; a condition on the sort key is optional and can narrow the result further.
- Filters: Like Scan, Query supports filter expressions. The filter is applied after the items matching the key condition have been read, so the number of items read is still bounded by the key condition rather than by the size of the table.
- Use Cases: Ideal for real-time applications where quick, efficient data retrieval is crucial. For example, fetching all orders made by a specific customer using their unique customer ID as the partition key.
Example:
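A minimal sketch of the customer-orders lookup described above, again using the low-level boto3 client interface. The function name `query_customer_orders`, the key attributes `customer_id` (partition key) and `order_date` (sort key), and the optional index name are assumptions made for illustration, not part of any real schema.

```python
from typing import Any, Dict, List, Optional


def query_customer_orders(
    client: Any,
    table_name: str,
    customer_id: str,
    since: Optional[str] = None,
    index_name: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Return the first page of orders for one customer.

    `client` is expected to behave like boto3.client("dynamodb").
    `customer_id` and `order_date` are hypothetical key attribute names.
    """
    # The partition key condition must be an equality test.
    key_condition = "customer_id = :cid"
    values: Dict[str, Any] = {":cid": {"S": customer_id}}

    # The sort key condition is optional and may be a range comparison.
    if since is not None:
        key_condition += " AND order_date >= :since"
        values[":since"] = {"S": since}

    params: Dict[str, Any] = {
        "TableName": table_name,
        "KeyConditionExpression": key_condition,
        "ExpressionAttributeValues": values,
    }
    # Target a secondary index instead of the base table when one is given.
    if index_name is not None:
        params["IndexName"] = index_name

    response = client.query(**params)
    return response["Items"]
```

A call such as `query_customer_orders(boto3.client("dynamodb"), "Orders", "c-42", since="2024-01-01")` would read only the items stored under that one partition key value. Comparing ISO-formatted date strings works here because they sort lexicographically in date order.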
Key Differences
Let's summarize the primary differences between Scan and Query operations in the table below:
| Feature | Scan | Query |
| --- | --- | --- |
| Key Requirement | No key required | Requires a partition key value (and optionally a sort key condition) |
| Performance | Slower, reads the entire table | Faster, reads only the items that match the key condition |
| Use of Index | Can run against a secondary index, but still reads every item in it | Can use a secondary index to look up items by alternate keys |
| Result Size | Potentially every item in the table | Limited to items under the specified partition key |
| Filter Usage | Filters applied after the full read | Filters applied after key-based selection |
| Primary Use Cases | Analytics, backups | Real-time application queries |
Additional Considerations
- Provisioned Throughput: Both Scan and Query consume read capacity units, but a Scan can consume dramatically more on large tables because capacity is charged for the data read, not the data returned. Adjust your capacity settings to accommodate the heavier load if you rely on Scan.
- Pagination: Both operations return paginated results. A single call reads at most 1 MB of data, and any filter is applied after that limit. If more data remains, the response includes a LastEvaluatedKey, which you pass as ExclusiveStartKey in the next call to continue. Always iterate over the pages when the dataset size is uncertain.
- Efficient Use of Query: By structuring your table schema to leverage queries instead of scans, you can minimize costs and maximize performance. Always ensure your design allows for the effective use of partition and sort keys to facilitate quick read patterns.
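The pagination behaviour described in the list above can be sketched as a small generator that works for both operations. The helper name `iterate_items` is hypothetical; it assumes `operation` is a bound method such as `client.scan` or `client.query` from a boto3 DynamoDB client, and it relies only on the `LastEvaluatedKey` response field and the `ExclusiveStartKey` request parameter.

```python
from typing import Any, Callable, Dict, Iterator


def iterate_items(
    operation: Callable[..., Dict[str, Any]], **params: Any
) -> Iterator[Dict[str, Any]]:
    """Yield every item from a paginated Scan or Query.

    `operation` is expected to behave like client.scan or client.query.
    """
    request = dict(params)  # copy so the caller's arguments are not mutated
    while True:
        page = operation(**request)
        yield from page.get("Items", [])

        # No LastEvaluatedKey means the final page has been reached.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break
        # Resume the next call where this page stopped.
        request["ExclusiveStartKey"] = last_key
```

For example, `for item in iterate_items(client.scan, TableName="Orders"): ...` would walk a hypothetical `Orders` table page by page. boto3 also ships built-in paginators (`client.get_paginator("scan")`), which may be preferable in production code; the loop above is meant only to make the mechanism visible.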
By understanding the distinct characteristics and appropriate use cases for both Scan and Query operations, you can design your DynamoDB interactions to optimize performance and cost-effectiveness.

