I have written the following Python code to fetch data from a table, but it is not fetching all the items I expect. When I check the DynamoDB page in the AWS console, I see many more entries than the script returns.

from __future__ import print_function # Python 2/3 compatibility
import boto3
import json
import decimal
from datetime import datetime
from boto3.dynamodb.conditions import Key, Attr
import sys

# Helper class to convert a DynamoDB item to JSON.
class DecimalEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, decimal.Decimal):
            if o % 1 > 0:
                return float(o)
            else:
                return int(o)
        return super(DecimalEncoder, self).default(o)

dynamodb = boto3.resource('dynamodb', aws_access_key_id = '',
        aws_secret_access_key = '',
        region_name='eu-west-1', endpoint_url="http://dynamodb.eu-west-1.amazonaws.com")

mplaceId = int(sys.argv[1])
table = dynamodb.Table('XYZ')

response = table.query(
    KeyConditionExpression=Key('mplaceId').eq(mplaceId)
)

print('Number of entries found ', len(response['Items']))

I ran the same query, by mplaceId, from the AWS console.

Any reason why this is happening?

    The DynamoDB API returns at most 1 MB of data per request. If there is more data, DynamoDB paginates it. If LastEvaluatedKey is present in the response, you will need to paginate the result set. Documentation can be found here: boto3.readthedocs.io/en/latest/reference/services/… Commented Aug 3, 2018 at 22:07

1 Answer

dynamodb.Table.query() returns at most 1 MB of data per call. From the boto3 documentation:

A single Query operation will read up to the maximum number of items set (if using the Limit parameter) or a maximum of 1 MB of data and then apply any filtering to the results using FilterExpression. If LastEvaluatedKey is present in the response, you will need to paginate the result set. For more information, see Paginating the Results in the Amazon DynamoDB Developer Guide.

This is not a boto3 limitation, but a limitation of the underlying Query API.
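
For reference, the manual approach described in the quoted documentation can be sketched roughly as follows. This is a minimal sketch, not code from the question: the helper name query_all is hypothetical, and it assumes it is handed a bound method such as table.query or table.scan from the boto3 resource API. It repeats the call, passing the previous LastEvaluatedKey as ExclusiveStartKey, until the response no longer contains a LastEvaluatedKey.

```python
def query_all(operation, **kwargs):
    """Call a DynamoDB read operation until no LastEvaluatedKey remains.

    `operation` is assumed to be a bound method such as `table.query` or
    `table.scan` from the boto3 resource API. `query_all` itself is a
    hypothetical helper name, not part of boto3.
    """
    items = []
    while True:
        response = operation(**kwargs)
        items.extend(response.get('Items', []))
        last_key = response.get('LastEvaluatedKey')
        if not last_key:
            # No LastEvaluatedKey means the final page has been read.
            return items
        # Resume the next request where the previous page stopped.
        kwargs['ExclusiveStartKey'] = last_key


# Usage with the table from the question (needs AWS credentials):
#
#   import boto3
#   from boto3.dynamodb.conditions import Key
#   table = boto3.resource('dynamodb', region_name='eu-west-1').Table('XYZ')
#   items = query_all(table.query,
#                     KeyConditionExpression=Key('mplaceId').eq(mplaceId))
#   print('Number of entries found ', len(items))
```

Because this goes through the resource API, the Key and Attr condition objects from the question can still be used unchanged. Note that all items are held in memory at once, which may not suit very large result sets.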

Instead of implementing pagination yourself, you can use boto3's built-in pagination. Here is an example that uses the paginator boto3 provides for querying DynamoDB tables:

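import sys

import boto3

# Partition key value, read from the command line as in the question.
mplaceId = int(sys.argv[1])

dynamodb_client = boto3.client('dynamodb')
paginator = dynamodb_client.get_paginator('query')
page_iterator = paginator.paginate(
    TableName='XYZ',
    KeyConditionExpression='mplaceId = :mplaceId',
    # The low-level client expects typed values; numbers are passed
    # as strings under the 'N' type, since the question's key is an int.
    ExpressionAttributeValues={':mplaceId': {'N': str(mplaceId)}}
)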

for page in page_iterator:
    print(page['Items'])
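
To get a total like the one printed in the question, the per-page item counts can be added up instead of printing each page. The following is a small sketch under that assumption; count_items is a hypothetical helper name, and it only relies on each page being a dict with an optional 'Items' list.

```python
def count_items(pages):
    """Sum the number of items over an iterable of Query/Scan response pages.

    `count_items` is a hypothetical helper; each page is assumed to be a
    dict that may contain an 'Items' list.
    """
    return sum(len(page.get('Items', [])) for page in pages)


# Usage, in place of the print loop (assumes `page_iterator` from the
# paginator example):
#
#   print('Number of entries found ', count_items(page_iterator))
```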

4 Comments

The 1 MB size limit also applies to table Scans. This was a valid question, since the common behavior for a datastore is to limit the number of records returned, ESPECIALLY when the limit number is stated explicitly. See stackoverflow.com/questions/46617575/…
The thing is, this uses a KeyConditionExpression and doesn't return all results. I'm searching for a way to return all results without a condition, or regardless of the condition. @Dunedan
As far as I can tell, the paginate method does not support the Key or Attr expression objects, so it's not a drop-in replacement and forces you to build the query string manually. Also, your example doesn't include the TableName parameter.
@AlastairMcCormack: Thanks for your comment. I edited my answer to fix the mentioned issues.
