38

I'm new to DynamoDB (and to noSQL in general) and am struggling a little to get my head round some of the concepts. One thing in particular is giving me some problems, which is around querying a table based on a boolean key.

I realize that I can't create a primary or secondary index on a boolean key, but I can't see how I should ideally index and query a table with the following structure;

reportId: string (uuid)
reportText: string
isActive: boolean
category: string

I would like to be able to complete the following searches:

  1. Access a specific report directly (a primary hash index of reportId)
  2. List reports of a specific category (a primary hash index on category)

These are both straightforward, but I would like to perform two other queries;

  1. List all reports that are marked as isActive = true
  2. List all reports of a specific category that are marked as isActive = true

My first approach would be to create a primary hashkey index on isActive, with a rangekey on category, but I'm only able to choose String, Number of Binary as the attribute type.

Storing isActive as a string (saved as 'true' rather than a boolean true) solves the problem, but its horrible using a string for a boolean property.

Am I missing something? Is there a simple way to query the table directly on a boolean value?

1 Answer 1

58

My project includes this particular scenario and I've followed the DynamoDB best practice of using sparse indexes on both Local and Global Secondary Indexes. Here is what I would do with your example:

Table: reportId (string, hash key) || reportText (string) || isActive (string, marked as "x") || category (string)

ActiveReportsIndex (Local Secondary Index): reportID (hash key) || isActive (range key)

ActiveReportsByCategoryIndex (Global Secondary Index): category (hash key) || isActive (range key) || reportId

The idea behind sparse indexes is that only reports marked as isActive: "x" will show up in your indexes, so they should require less storage and processing than your main table. Instead of making the isActive attribute a boolean type, which will always store a true or false value, use a string like "x" or anything else you want when the report is active and DELETE the attribute completely when the report is not active. Makes sense?

UPDATE: If you want a specific kind of sort when you query (e.g. chronological), use a number (e.g. a UNIX timestamp) instead of an "x" string.

Sign up to request clarification or add additional context in comments.

11 Comments

This is perfect, and addresses it nicely. I hadn't seen the reference about sparse indexes (nor indeed the best practices link you included) but seems to be just what I need.
this does feel like a limitation on dynamo that you cant query based on boolean values, you have to deal with your own app logic to convert them to boolean types.
How would you create a query on the ActiveReportsIndex to retrieve all active reports? Wouldn't this require a scan on that index?
@SankethKatta yes, that's right. Retrieving all rows in any index or table requires a scan.
docs.aws.amazon.com/amazondynamodb/latest/developerguide/… According to this 1 would be 2 bytes and "X" would be 1 byte @readyornot 😊
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.