8

I am running the exact same query both through pandas' read_sql and through an external app (DbVisualizer).

DbVisualizer returns 206 rows, while pandas returns 178.

I have tried reading the data from pandas in chunks, based on the information provided at "How to create a large pandas dataframe from an sql query without running out of memory?", but it made no difference.

What could be the cause for this and ways to remedy it?

The query:

select *
from rainy_days
where year=’2010’ and day=‘weekend’

The columns contain: date, year, weekday, amount of rain on that day, temperature, geo_location (one row per location), wind measurements, amount of rain the day before, etc.

The exact python code (minus connection details) is:

import pandas
from sqlalchemy import create_engine

engine = create_engine(
   'postgresql://user:[email protected]/weatherhist?port=5439',
)

query = """
        select *
        from rainy_days
        where year=’2010’ and day=‘weekend’
        """
df = pandas.read_sql(query, con=engine)
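
One way to narrow down a mismatch like this is to ask the database itself for COUNT(*) with the same WHERE clause, through the same connection that fetches the rows. If the two numbers agree, the client is not dropping rows and the difference more likely comes from the session, the database being hit, or the query text. The following is a minimal sketch, assuming an in-memory SQLite table as a stand-in for the real database; the table contents and the variable names are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rainy_days (year TEXT, day TEXT, geo_location TEXT)")
conn.executemany(
    "INSERT INTO rainy_days VALUES (?, ?, ?)",
    [
        ("2010", "weekend", "A"),
        ("2010", "weekend", "B"),
        ("2010", "weekday", "A"),
        ("2011", "weekend", "A"),
    ],
)

# Same filter for both statements, written with plain ASCII quotes.
where = "WHERE year = '2010' AND day = 'weekend'"

# Row count as computed by the database engine.
server_count = conn.execute(
    f"SELECT COUNT(*) FROM rainy_days {where}"
).fetchone()[0]

# Row count as seen by the client after fetching everything.
fetched_rows = conn.execute(f"SELECT * FROM rainy_days {where}").fetchall()

print(server_count, len(fetched_rows))
```

With pandas, the same check would be to run the COUNT(*) statement through pandas.read_sql with the same engine and compare the result with len(df).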
3
  • 2
    You are using strange quotes (for the year=’2010’), I don't know if that could be a cause, but can you replace them with normal single quotes? (') Commented Mar 8, 2016 at 10:58
  • 1
    Is there a solution to this? I'm running into the same issue. Commented Jan 11, 2017 at 1:01
  • 1
    Same issue. I have a table with 7 rows in total; pandas.read_sql_table gets 7 rows, but pandas.read_sql gets only 5. Commented Mar 26, 2021 at 6:20
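
The suggestion in the first comment, replacing the typographic quotes with plain single quotes, can be tried mechanically. The following is a minimal sketch; normalize_quotes and CURLY_TO_ASCII are hypothetical names, and the sketch assumes the curly quotes are only used as delimiters (it would also rewrite curly quotes that appear inside string data).

```python
# Map typographic (curly) quotes to their plain ASCII equivalents.
CURLY_TO_ASCII = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
})


def normalize_quotes(sql: str) -> str:
    """Return the SQL text with curly quotes replaced by ASCII quotes."""
    return sql.translate(CURLY_TO_ASCII)


# The query from the question, with its original typographic quotes.
query = "select * from rainy_days where year=’2010’ and day=‘weekend’"
cleaned = normalize_quotes(query)
print(cleaned)
```

Running the cleaned query in both tools at least rules out the quoting as the source of the difference.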

3 Answers

-1

It's not a fix, but what worked for me was to rebuild the indices:

  1. drop the indices

  2. export the whole thing to a csv

  3. delete all the rows:

    DELETE FROM table

  4. import the csv back in

  5. rebuild the indices

pandas:

df = read_csv(..)
df.to_sql(..)

If that works, then at least you know the problem lies with the indices not being kept up to date.
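
The five steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses the standard-library sqlite3 and csv modules on an in-memory database instead of the pandas read_csv / to_sql calls mentioned above, and the table layout and the index name idx_year are hypothetical.

```python
import csv
import io
import sqlite3

# Assumption: an in-memory SQLite table stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rainy_days (year TEXT, day TEXT, rain_mm REAL)")
conn.execute("CREATE INDEX idx_year ON rainy_days (year)")
conn.executemany(
    "INSERT INTO rainy_days VALUES (?, ?, ?)",
    [("2010", "weekend", 1.5), ("2010", "weekday", 0.0), ("2011", "weekend", 3.2)],
)

# 1. Drop the index.
conn.execute("DROP INDEX idx_year")

# 2. Export every row to CSV (an in-memory buffer instead of a file).
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(
    conn.execute("SELECT year, day, rain_mm FROM rainy_days")
)

# 3. Delete all the rows.
conn.execute("DELETE FROM rainy_days")

# 4. Import the CSV back in.
buffer.seek(0)
conn.executemany("INSERT INTO rainy_days VALUES (?, ?, ?)", csv.reader(buffer))

# 5. Rebuild the index.
conn.execute("CREATE INDEX idx_year ON rainy_days (year)")

restored = conn.execute("SELECT COUNT(*) FROM rainy_days").fetchone()[0]
print(restored)
```

Comparing the row count before and after the round trip is a cheap sanity check that nothing was lost during the export and re-import.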


1 Comment

the strange quotes `` are used in SQL to distinguish field names from reserved words, e.g. SELECT `right` FROM ...
-2

https://github.com/xzkostyan/clickhouse-sqlalchemy/issues/14

If you use plain engine.execute, you have to take care of the format manually.

-3

The problem is that pandas returns a packed dataframe (DF). For some reason this is always on by default, and the results vary widely as to what is shown. The solution is to use the unpacking operator (*) when printing the df, like this:

print(*df)

(This is also known as the splat operator to Ruby enthusiasts.)

To read more about this, please check out these references & tutorials:

1 Comment

If you down-vote, at least have the courtesy of adding a comment. This worked for me and was the working solution for me, at the time!
