2

I want to use SQLAlchemy and pandas to read a table from a PostgreSQL database into a pandas DataFrame using read_sql_table(). The equivalent SQL query is similar to this:

SELECT col1, col2 FROM my_table WHERE col1 = 'value'

I tried this code to get the Pandas dataframe from the table:

from os import environ

import pandas as pd
from sqlalchemy import create_engine

db_uri = environ.get('SQLALCHEMY_DATABASE_URI')
engine = create_engine(db_uri, echo=True)

table_df = pd.read_sql_table(
    'my_table',
    con=engine,
    schema="public",
    columns=['col1', 'col2'])

This code works, but how can I apply a condition, like the WHERE clause in the SQL query, so that the rows are filtered by it? I don't want to load the whole table into memory first; I want the filtering to happen while querying the database.

3
  • Use read_sql_query() and pass it the SELECT statement that includes the WHERE clause. Commented Jun 14, 2021 at 16:20
  • @GordThompson Thanks! I ended up using read_sql_query(). However, I am still interested in a way to avoid writing any SQL code. Commented Jun 15, 2021 at 7:53
  • 1
    Do you mean creating a query using SQLAlchemy's SQL Expression language, e.g., qry = team.select().where(team.c.id == 1) and then passing it to pd.read_sql_query(qry, engine)? Yes, you can do that. Commented Jun 15, 2021 at 11:52
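
For reference, the first comment's suggestion (a SELECT with a WHERE clause passed to read_sql_query()) could be sketched roughly as follows. This is only an illustration under assumptions: an in-memory SQLite database stands in for PostgreSQL, the sample rows and the extra col3 column are made up, and the bind parameter name wanted is hypothetical. The filter value is passed as a bound parameter rather than formatted into the SQL string.

```python
import pandas as pd
import sqlalchemy as sa

# Assumption: in-memory SQLite stands in for the PostgreSQL database.
engine = sa.create_engine("sqlite://")

# Create and fill a small demo table (made-up rows, hypothetical col3).
with engine.begin() as conn:
    conn.execute(sa.text("CREATE TABLE my_table (col1 TEXT, col2 INTEGER, col3 TEXT)"))
    conn.execute(
        sa.text("INSERT INTO my_table (col1, col2, col3) VALUES (:col1, :col2, :col3)"),
        [
            {"col1": "value", "col2": 1, "col3": "a"},
            {"col1": "other", "col2": 2, "col3": "b"},
            {"col1": "value", "col2": 3, "col3": "c"},
        ],
    )

# The WHERE clause runs in the database; only matching rows reach pandas.
# ":wanted" is a bound parameter, filled in from the params dict.
query = sa.text(
    "SELECT col1, col2 FROM my_table WHERE col1 = :wanted ORDER BY col2"
)
table_df = pd.read_sql_query(query, engine, params={"wanted": "value"})
print(table_df)
```

With a real PostgreSQL engine, only the create_engine() URL would change; the table setup is there just to make the sketch runnable.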

2 Answers

1

As mentioned in a comment to the question, you can use read_sql_query() to filter your results. If you want to avoid passing a raw SQL statement to the function, you can build the query with SQLAlchemy Core and pass that instead:

import sqlalchemy as sa

# …

team = sa.Table("team", sa.MetaData(), autoload_with=engine)
qry = sa.select(team.c.city, team.c.name).where(team.c.id == 1)
df = pd.read_sql_query(qry, engine)
print(df)
"""
      city    name
0  Calgary  Flames
"""
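
Applied to the table from the question, the same pattern might look like the sketch below. It is not tested against PostgreSQL: an in-memory SQLite database stands in for it, and the sample rows and the col3 column are assumptions made only so the example runs on its own.

```python
import pandas as pd
import sqlalchemy as sa

# Assumption: in-memory SQLite stands in for the PostgreSQL database.
engine = sa.create_engine("sqlite://")

# Build a small demo table (made-up data; col3 is hypothetical).
setup_meta = sa.MetaData()
demo = sa.Table(
    "my_table",
    setup_meta,
    sa.Column("col1", sa.String),
    sa.Column("col2", sa.Integer),
    sa.Column("col3", sa.String),
)
setup_meta.create_all(engine)
with engine.begin() as conn:
    conn.execute(
        sa.insert(demo),
        [
            {"col1": "value", "col2": 10, "col3": "x"},
            {"col1": "other", "col2": 20, "col3": "y"},
            {"col1": "value", "col2": 30, "col3": "z"},
        ],
    )

# Reflect the existing table, as in the answer above, then select two
# columns and filter on col1 without writing any SQL text.
my_table = sa.Table("my_table", sa.MetaData(), autoload_with=engine)
qry = (
    sa.select(my_table.c.col1, my_table.c.col2)
    .where(my_table.c.col1 == "value")
    .order_by(my_table.c.col2)
)
filtered_df = pd.read_sql_query(qry, engine)
print(filtered_df)
```

For a table in a specific PostgreSQL schema, sa.Table() also accepts schema="public", matching the schema argument used with read_sql_table() in the question.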

0
import pandas as pd
import sqlalchemy as sa
engine = sa.create_engine('oracle+cx_oracle://user:password@db', echo=False)

team = sa.Table('oracle_table', sa.MetaData(), autoload_with=engine, schema='db')
qry = sa.select(team.c.column_a, team.c.column_b).where(
                team.c.column_b == 'OPTION')
df = pd.read_sql_query(qry, engine)
print(df)

engine.dispose()
