
I need to execute an IN query on the key attribute. Since Query doesn't provide an IN condition, I am planning to use Scan. Will a Scan on the key attribute scan the entire table?


2 Answers

6

Will SCAN on key attribute scan the entire table?

Yes, see Query and Scan in Amazon DynamoDB:

Scan

A scan operation scans the entire table. You can specify filters to apply to the results to refine the values returned to you, after the complete scan. Amazon DynamoDB puts a 1MB limit on the scan (the limit applies before the results are filtered). A scan can result in no table data meeting the filter criteria.

Specifically, the Scan API makes no distinction between key and non-key attributes: you simply provide the desired attributes by name, regardless of whether they are also part of the primary key:

AttributesToGet

Array of Attribute names. If attribute names are not specified then all attributes will be returned. If some attributes are not found, they will not appear in the result.
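To make the quoted behavior concrete, here is a minimal in-memory sketch (not the DynamoDB API) of a scan that reads every item page by page and applies the filter only afterwards. `simulated_scan`, the `page_size` parameter (standing in for the 1MB limit), and the sample table are all hypothetical names chosen for illustration.

```python
from typing import Callable, Iterator

Item = dict


def simulated_scan(
    table: list[Item], page_size: int, keep: Callable[[Item], bool]
) -> Iterator[tuple[int, list[Item]]]:
    """Yield (items_read, items_returned) per page; the filter runs after the read."""
    for start in range(0, len(table), page_size):
        page = table[start:start + page_size]  # every item in the page is read
        yield len(page), [item for item in page if keep(item)]


# Hypothetical table of 20 items keyed by "id".
table = [{"id": i} for i in range(20)]
wanted_ids = {3, 8, 15}  # the "IN" list on the key attribute

read = 0
matches: list[Item] = []
for items_read, returned in simulated_scan(
    table, page_size=5, keep=lambda item: item["id"] in wanted_ids
):
    read += items_read
    matches.extend(returned)

print(read, [m["id"] for m in matches])  # 20 [3, 8, 15]
```

All 20 items are read even though only 3 match, and it makes no difference that the filter is on the key attribute.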


2 Comments

Too bad. What would be the cost difference between Scan and Query for 100 matching records in a million-record set?
@Mani: Most likely enormous. Scan just isn't designed to be used that way and has significant cost and performance implications for huge record sets unless you account for them specifically. Calculating the difference is rather complex as well; please read through Chris Moyer's first blog post on Amazon DynamoDB for an analysis (incidentally using a million-record set as its sample) and thoughts on how to account for this problem.
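For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes 1 KB items, a read unit covering 4 KB, and eventually consistent reads costing half a unit; these figures are assumptions for illustration, not numbers from the thread, so check them against current DynamoDB pricing. It also assumes the 100 matching records share one hash key, so a single Query can fetch them.

```python
import math

READ_UNIT_KB = 4        # assumption: one read unit covers up to 4 KB
ITEM_KB = 1             # assumption: every item is 1 KB
TABLE_ITEMS = 1_000_000
MATCHING_ITEMS = 100


def read_units(total_kb: float, eventually_consistent: bool = True) -> float:
    """Estimate read units for reading total_kb of data in one operation."""
    units = math.ceil(total_kb / READ_UNIT_KB)
    # Assumption: eventually consistent reads cost half as much.
    return units / 2 if eventually_consistent else units


scan_units = read_units(TABLE_ITEMS * ITEM_KB)      # Scan reads the whole table
query_units = read_units(MATCHING_ITEMS * ITEM_KB)  # Query reads only the matches

print(scan_units, query_units, scan_units / query_units)  # 125000.0 12.5 10000.0
```

Under these assumptions the Scan consumes about 10,000 times the read capacity of the Query. The real ratio depends on item sizes and on how the matches are spread across hash keys.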
1

Wouldn't BatchGetItem work for you?

3 Comments

Actually, the entity has a key and a range attribute, so it is an IN and BETWEEN condition on the key attributes. BTW, how do you use BatchGetItem with POJO classes?
@ManiDoraisamy Given the enhanced requirement, I would suggest issuing a query per hash key value in the IN list. Write a small app that joins those query results and runs in Amazon EC2, and you should alleviate the increase in latency. Alternatively, you can hand this latter task to Amazon EMR, since Amazon DynamoDB also integrates with Amazon Elastic MapReduce.
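The query-per-hash-key idea might look roughly like the sketch below. It is not SDK code: `query_in` is a hypothetical helper, `query_one` stands in for whatever function issues a single Query with a BETWEEN condition on the range key, and `FAKE_TABLE` with its `user` and `ts` attributes is an in-memory stand-in for the real table.

```python
from typing import Callable, Iterable

Item = dict


def query_in(
    hash_keys: Iterable[str],
    range_low: int,
    range_high: int,
    query_one: Callable[[str, int, int], list[Item]],
) -> list[Item]:
    """Emulate IN + BETWEEN: one query per distinct hash key, results joined."""
    results: list[Item] = []
    for hash_key in dict.fromkeys(hash_keys):  # de-duplicate, keep order
        results.extend(query_one(hash_key, range_low, range_high))
    return results


# In-memory stand-in for the table: "user" is the hash key, "ts" the range key.
FAKE_TABLE = [
    {"user": "alice", "ts": 1},
    {"user": "alice", "ts": 5},
    {"user": "bob", "ts": 3},
    {"user": "carol", "ts": 4},
    {"user": "bob", "ts": 9},
]


def fake_query(hash_key: str, low: int, high: int) -> list[Item]:
    """Stand-in for a single Query call, returning items ordered by range key."""
    hits = [i for i in FAKE_TABLE if i["user"] == hash_key and low <= i["ts"] <= high]
    return sorted(hits, key=lambda i: i["ts"])


found = query_in(["alice", "bob", "alice"], 2, 9, fake_query)
print(found)
```

Each query touches only the items under one hash key, so the work grows with the size of the IN list rather than with the size of the table.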
@ManiDoraisamy To your second question, the answer is that you don't. A BatchGetItemRequest boils down to a list of up to 100 Keys. Each Key contains an AttributeValue for both the hash key and the range key to precisely identify a record in a table. BTW, a record is essentially a Map, and it is your responsibility to convert your POJOs in and out of those.
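As a language-neutral illustration of the two points above (shown in Python rather than Java), the sketch below converts an object to and from a key map and splits a key list into batches of at most 100. The `Event` class, its `user` and `ts` fields, and the helper names are hypothetical; the `{"S": ...}` and `{"N": ...}` shapes follow DynamoDB's typed attribute-value format, and no request is actually sent.

```python
from dataclasses import dataclass
from typing import Iterator, Sequence

MAX_KEYS_PER_BATCH = 100  # limit mentioned in the comment above


@dataclass
class Event:      # hypothetical entity
    user: str     # hash key
    ts: int       # range key


def to_key(event: Event) -> dict:
    """Convert an object into a key map of typed attribute values."""
    return {"user": {"S": event.user}, "ts": {"N": str(event.ts)}}


def from_item(item: dict) -> Event:
    """Convert a returned attribute map back into an object."""
    return Event(user=item["user"]["S"], ts=int(item["ts"]["N"]))


def chunk_keys(keys: Sequence[dict], size: int = MAX_KEYS_PER_BATCH) -> Iterator[list]:
    """Split keys into lists of at most `size`, one list per batch request."""
    for start in range(0, len(keys), size):
        yield list(keys[start:start + size])


keys = [to_key(Event("alice", ts)) for ts in range(250)]
batches = list(chunk_keys(keys))
print([len(b) for b in batches])  # [100, 100, 50]
```

Each batch would then go into its own batch-get request, and each returned map would pass through something like `from_item`.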
