
I have a Java-based backend app which issues a getItem request to my DynamoDB table based on a request from my user. Occasionally my user sends a request to my app which ends up sending a getItem request to DynamoDB which hits the maximum size limit of the key below (quote from https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.NamingRulesDataTypes.html ):

For a simple primary key, the maximum length of the first attribute value (the partition key) is 2048 bytes.

I get this error from the DynamoDB SDK when that happens: One or more parameter values were invalid: Size of hashkey has exceeded the maximum size limit of 2048 bytes

Now I need to implement a validation for this situation, so that a request which would exceed the limit is rejected before the error occurs. My question is: what is the right way to implement this validation in my app? Judging by the documentation linked above, DynamoDB seems to use UTF-8 internally, so would something like the following be fine?

boolean isPartitionKeySizeValid(String partitionKey) {
    int size = partitionKey.getBytes(StandardCharsets.UTF_8).length;
    return 1 <= size && size <= 2048;
}

My app uses the com.amazonaws:aws-java-sdk-dynamodb library to interact with DynamoDB.

  • Why are your clients generating such large partition keys? Perhaps you should address that issue. Secondly, why not simply catch this DynamoDB error when it happens and report a relevant error back to the client? Commented Apr 7, 2023 at 12:29
  • The key can come from the public internet; I cannot control what a malicious user sends. Also, it’s not easy to tell exactly what the problem was from the DynamoDB error — I don’t want to parse the error message string, for example. Commented Apr 7, 2023 at 13:33

2 Answers


Yes, simply counting the UTF-8 byte length, as in your snippet, will let you avoid hitting the 2048-byte partition key value limit.
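One boundary worth noting: the limit counts encoded bytes, not Java chars, so non-ASCII strings hit it sooner. A self-contained demonstration of the asker's method at the boundary (the class name is illustrative):

```java
import java.nio.charset.StandardCharsets;

public class PartitionKeyCheck {
    // The asker's validation, reproduced verbatim so this example is self-contained.
    static boolean isPartitionKeySizeValid(String partitionKey) {
        int size = partitionKey.getBytes(StandardCharsets.UTF_8).length;
        return 1 <= size && size <= 2048;
    }

    public static void main(String[] args) {
        System.out.println(isPartitionKeySizeValid("a".repeat(2048))); // true: exactly at the limit
        System.out.println(isPartitionKeySizeValid("a".repeat(2049))); // false: one byte over
        // Multi-byte characters count per encoded byte, not per char:
        // "é" is 2 bytes in UTF-8, so 1025 of them total 2050 bytes.
        System.out.println(isPartitionKeySizeValid("é".repeat(1025))); // false
    }
}
```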




The 2048-byte limit applies to the UTF-8-encoded value of the partition key, not to the attribute name. Note that if your table uses a composite primary key, the sort key value is subject to a separate limit of 1024 bytes.
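If it helps, here is a sketch extending the check to composite keys, using the limits from the AWS docs quoted in the question (2048 bytes for the partition key value, 1024 bytes for the sort key value); the class and method names are illustrative:

```java
import java.nio.charset.StandardCharsets;

public class KeySizeValidator {
    static final int MAX_PARTITION_KEY_BYTES = 2048;
    static final int MAX_SORT_KEY_BYTES = 1024;

    // Validates both key attribute values; pass sortKey = null for a simple primary key.
    static boolean isKeySizeValid(String partitionKey, String sortKey) {
        int pkSize = partitionKey.getBytes(StandardCharsets.UTF_8).length;
        if (pkSize < 1 || pkSize > MAX_PARTITION_KEY_BYTES) {
            return false;
        }
        if (sortKey != null) {
            int skSize = sortKey.getBytes(StandardCharsets.UTF_8).length;
            if (skSize < 1 || skSize > MAX_SORT_KEY_BYTES) {
                return false;
            }
        }
        return true;
    }
}
```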


  • IMO that kind of change "for performance's sake" is unlikely to have any major effect, especially given that getBytes(StandardCharsets.UTF_8) will be an orders-of-magnitude more expensive operation! If performance is an issue in this method (and that's a pretty big IF), then optimizing to avoid that call where possible would be better. For example, if the string length * 4 (or 6, if feeling especially pessimistic) is less than 2048, then the UTF-8-encoded form is guaranteed to fit. Or even calculate the length without doing the encoding.
  • @JoachimSauer, I agree with every statement you make. However, given the amount of information provided in the question, maximum entropy demands that size > 2048 and size < 1 be treated as equally likely. Given that the author does not check for size < 1 early, but goes straight for the expensive UTF-8 conversion, the balance tips towards size > 2048 being the more likely case, so evaluating that condition first would provide an efficiency gain. Since this is not related to the original question, though, I'll remove this.
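The "calculate the length without doing the encoding" idea from the comment above can be sketched as follows. In Java, a multiplier of 3 per char is actually a safe bound, since every UTF-16 code unit encodes to at most 3 UTF-8 bytes (supplementary characters take 4 bytes but occupy 2 code units); the class and method names here are illustrative:

```java
public class Utf8Size {
    // Exact UTF-8 byte length, computed from code points without allocating a byte array.
    static int utf8Length(String s) {
        int length = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp < 0x80) length += 1;          // ASCII
            else if (cp < 0x800) length += 2;    // e.g. Latin-1 supplement
            else if (cp < 0x10000) length += 3;  // rest of the Basic Multilingual Plane
            else length += 4;                    // supplementary planes (surrogate pairs)
            i += Character.charCount(cp);
        }
        return length;
    }

    // Fast path: strings short enough in char count are guaranteed to fit,
    // so the exact calculation only runs for borderline inputs.
    static boolean fitsIn(String s, int maxBytes) {
        if ((long) s.length() * 3 <= maxBytes) {
            return true;
        }
        return utf8Length(s) <= maxBytes;
    }
}
```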
