I am very new to Python and I have an issue with pandas' read_sql_table. If I pass only the table name and the SQLAlchemy engine, it reads the data and I can print the head of the DataFrame. Adding index_col and columns also works. As soon as I add chunksize=10000, printing the head fails with the error: 'generator' object has no attribute 'head'

  • Please also post a snippet of what you have tried so far. Commented Jan 24, 2020 at 2:13

2 Answers

1

Just iterate over it:

for chunk in pd.read_sql_table(table_name, con=engine, chunksize=10000):
    print(chunk.head())

1

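A generator in Python produces its values lazily, one at a time, as you iterate over it.
When you pass the chunksize keyword argument, read_sql_table returns such an iterator of DataFrames instead of a single DataFrame, so the returned object itself has no .head() method.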
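
To make the lazy behaviour concrete, here is a minimal sketch that uses only plain Python. The helper read_in_chunks is hypothetical and merely stands in for what a chunked reader does; it is not how pandas implements it.

```python
def read_in_chunks(rows, chunksize):
    """Hypothetical helper: yield successive lists of at most `chunksize` rows."""
    for start in range(0, len(rows), chunksize):
        yield rows[start:start + chunksize]


# Nothing is read yet; calling the function only creates a generator object.
chunks = read_in_chunks(list(range(25)), chunksize=10)

# The generator itself has no .head(); only the items it yields hold data.
has_head = hasattr(chunks, "head")
print(type(chunks).__name__, has_head)

# Iterating is what actually produces the chunks.
sizes = [len(chunk) for chunk in chunks]
print(sizes)
```

With 25 rows and a chunk size of 10, iteration yields chunks of 10, 10 and 5 rows.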

Instead, iterate over the result; each item it yields is a DataFrame with up to chunksize rows.
Example:

generator_object = pd.read_sql_table('your_table',con=your_connection_string,
                                     chunksize=CHUNKSIZE)
for chunk in generator_object:
    print(chunk)

Another thing you can do is to request the first chunk of your table with next():

generator_object = pd.read_sql_table('your_table',con=your_connection_string,
                                     chunksize=CHUNKSIZE)
next(generator_object).head()

Note that next() consumes that chunk: any later iteration over generator_object resumes from the second chunk.
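
The following sketch shows that consumption on a small table. It assumes an in-memory SQLite database with a made-up table named demo, and it swaps in pd.read_sql_query because that function accepts a plain sqlite3 connection, whereas read_sql_table needs SQLAlchemy. The chunksize behaviour is the same for this purpose.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table with 25 rows, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO demo (value) VALUES (?)",
                 [(i * i,) for i in range(25)])
conn.commit()

# With chunksize set, the call returns an iterator of DataFrames.
chunks = pd.read_sql_query("SELECT * FROM demo", conn, chunksize=10)

first = next(chunks)  # pulls (and consumes) the first chunk of 10 rows
print(first.head())

# Iteration resumes from the second chunk; the first is not returned again.
remaining_sizes = [len(chunk) for chunk in chunks]
print(remaining_sizes)

conn.close()
```

If the first chunk is still needed later, keep a reference to it (as first does here) rather than expecting the iterator to yield it again.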

Further reading:
You can also get multiple chunks using itertools.islice:

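import itertools as it

CHUNKSIZE = 10
generator_object = pd.read_sql_table('your_table', con=your_connection_string,
                                     chunksize=CHUNKSIZE)
# Take only the first 3 chunks: 3 * 10 == 30 records
iterable_slice = it.islice(generator_object, 3)

for chunk in iterable_slice:
    print(chunk)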
