17

I'm trying to store a mySQL query result in a pandas DataFrame using pymysql and am running into errors building the dataframe. Found a similar question here and here, but it looks like there are pymysql-specific errors being thrown:

import pandas as pd
import datetime
import pymysql

# dummy values 
connection = pymysql.connect(user='username', password='password', databse='database_name', host='host')

start_date = datetime.datetime(2017,11,15)
end_date = datetime.datetime(2017,11,16)

try:
    with connection.cursor() as cursor:
    query = "SELECT * FROM orders WHERE date_time BETWEEN %s AND %s"

    cursor.execute(query, (start_date, end_date)) 

    df = pd.DataFrame(data=cursor.fetchall(), index = None, columns = cursor.keys())
finally:
    connection.close()

returns: AttributeError: 'Cursor' object has no attribute 'keys'

If I drop the index and columns arguments:

try:
    with connection.cursor() as cursor:
    query = "SELECT * FROM orders WHERE date_time BETWEEN %s AND %s"

    cursor.execute(query, (start_date, end_date)) 

    df = pd.DataFrame(cursor.fetchall())
finally:
    connection.close()

returns ValueError: DataFrame constructor not properly called!

Thanks in advance!

3 Answers 3

38

Use Pandas.read_sql() for this:

query = "SELECT * FROM orders WHERE date_time BETWEEN ? AND ?"
df = pd.read_sql(query, connection,  params=(start_date, end_date))
Sign up to request clarification or add additional context in comments.

3 Comments

pandas.read_sql() works well in general. However, what if the command is to execute a stored procedure which needs to update a table (and commit the updated rows of the table)? In that case, how to ensure the commit is done? Does pd.read_sql allow commits?
@Nodame, you can first call your stored procedure using sqlalchemy and then read results using pd.read_sql
How does a pymysql conmection work with pandas read_sql since pymysql Connection doesnt inherit from any schlalchemy class?
2

Try This:

import pandas as pd
import pymysql

mysql_connection = pymysql.connect(host='localhost', user='root', password='', db='test', charset='utf8')
                    
sql = "SELECT * FROM `brands`"
df = pd.read_sql(sql, mysql_connection, index_col='brand_id')
print(df)

Comments

1

Thank you for your suggestion to use pandas.read_sql(). It works with executing a stored procedure as well! I tested it in MSSQL 2017 environment.

Below is an example (I hope it helps others):

def database_query_to_df(connection, stored_proc, start_date, end_date):
    # Define a query
    query ="SET NOCOUNT ON; EXEC " + stored_proc + " ?, ? " + "; SET NOCOUNT OFF"

    # Pass the parameters to the query, execute it, and store the results in a data frame
    df = pd.read_sql(query, connection, params=(start_date, end_date))
    return df

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.