1

I have to get the items from a table similar to what I have provided below where the Result and Accuracy don't exist or Null. The table is having a lot of data and it is growing daily. I tried to get the items from the dynamo DB using the scan. But it is taking a lot of time to show the result. Can anyone suggest me how to run the query or scan which will provide me the result faster?

Dynamo DB Table(Sample):

|ID(primary)|date(String)|Result(number)|Accuracy(Number)|
|   edfca   |   20180101 |    12        |     59.99      |
|    yrdfa  |  20180109  |    72        |     91.49      |
| arfead    | 20180101   |    122       |     32.12      |
|  readssda | 20180112   |              |                |
| geeddar   | 20180123   |    3         |     81.11      | 
|  rogeasa  |  20180203  |              |                |
| swiwads   | 20180205   |     32       |      13.12     |

Currently, I'm using the following

java:

ScanSpec scanSpec = new ScanSpec();
scanSpec.withFilterExpression("attribute_not_exists(accuracy)");
scanSpec.setMaxResultSize(100);
table.scan(scanSpec);

The above scan operation is very slow.

I'm new to dynamo DB and I googled it and found it's better to use a query than using the scan operation.

Can I use the query operation for getting the data I need? If yes please provide an example

2
  • Possible duplicate of How do you query for a non-existent (null) attribute in DynamoDB Commented Feb 9, 2018 at 11:44
  • Not a duplicate question. I want to solve it using the query operation instead of the Scan operation as scan operation is very slow. I'm looking for other possible ways to get the items faster from the table Commented Feb 9, 2018 at 11:48

1 Answer 1

2

When you do a query you have to specify the Partition Key. Given your table, assuming you have not defined any secondary index, you can only query for one partition key. Useless.

Should you really want to improve the performance I can imagine a solution. When you write items in your table, define a column hasNull: string (for example) and set it to 'y' for all rows with fills the condition. Let it undefined for all other items.

Later, define a Global Secondary Index named MY_GSI_NAME:

  • Partition Key: hasNull
  • Sorting Key: ID
  • Projection type (?): KEYS_ONLY (or ALL, see dynamodb docs)

In node.js you may query your table with:

const params = {
  TableName: 'MY_TABLE_NAME',
  IndexName: 'MY_GSI_NAME',
  KeyConditionExpression: '#pk = :pk',
  ExpressionAttributeNames: {
    '#pk': 'hasNull',
  },
  ExpressionAttributeValues: {
    ':pk': 'y',
  },
};

return MY_DOCUMENT_CLIENT.query(params).promise();

Doing that you'll get all the items with hasNull === 'y'.

That query has the advantage that only the items you are interested in are scanned since the index is sparse (hasNull is missing for the items you do not need).

Hope this give you a starting point to solve your problem. Ask more to know more :)

Bonne chance!

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.