
I need to analyze a very large table from a SQL database, and I would like to use Python for that.

Unfortunately, I cannot access the SQL database directly from Python.

Are there any suggestions for a format I could export the table to so that I can work with it?

  1. I tried exporting to a SQL file and importing that into a pandas DataFrame, but of course I ran out of memory.

  2. I tried to access the database from Python directly with pymysql using

     db=pymysql.connect(host="localhost", db="all_data")
    

but I get "can't connect to MySQL server Win Error 10061".

The export file type is .sql.

Many thanks

  • Try exporting as CSV or similar, then load it into pandas. Commented Jul 30, 2020 at 14:03
  • What is the database type? Commented Jul 30, 2020 at 14:10
  • it is a .sql file Commented Jul 30, 2020 at 15:07
  • No, apologies. What type of database is it; Oracle, MySQL, SQLServer, etc.? Each has its own method for exporting data, some easier to work with than others. Commented Jul 30, 2020 at 15:21
  • ah sorry, mysql Commented Jul 30, 2020 at 15:26

2 Answers


I had this problem before... Try this:

# import libraries
from sqlalchemy import create_engine
import pandas as pd

# set login parameters (placeholders; replace with your own values)
db = "your_schema"      # database schema name
user = "your_username"  # username to log in
pw = "your_password"    # password


# connect to the database
engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
                        .format(user=user, pw=pw, db=db))


# load data from MySQL into a dataframe
# ("your_table" is a placeholder; a table literally named "table" would need backticks in MySQL)
df = pd.read_sql_query("SELECT * FROM your_table", engine)
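Since the question mentions running out of memory, it may help to avoid pulling the whole table in one call. pandas accepts a `chunksize` argument in `read_sql_query`, which returns an iterator of DataFrames instead of a single one. The same idea can be sketched without pandas through the standard DB-API `fetchmany` call. The snippet below is only an illustration: it uses an in-memory `sqlite3` database as a stand-in for MySQL, and the helper name `iter_batches`, the `sales` table, and its columns are hypothetical.

```python
import sqlite3


def iter_batches(conn, query, batch_size=10_000):
    """Yield lists of rows from `query`, at most `batch_size` rows per list."""
    cur = conn.cursor()
    try:
        cur.execute(query)
        while True:
            rows = cur.fetchmany(batch_size)
            if not rows:
                break
            yield rows
    finally:
        cur.close()


def demo():
    # Stand-in database: an in-memory SQLite table (assumption, not the OP's MySQL).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (units INTEGER, price REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)", [(i, 2.0) for i in range(1, 11)]
    )

    # Aggregate batch by batch so only one batch is held in memory at a time.
    total = 0.0
    n_batches = 0
    for batch in iter_batches(conn, "SELECT units, price FROM sales", batch_size=4):
        n_batches += 1
        total += sum(units * price for units, price in batch)

    conn.close()
    return n_batches, total


if __name__ == "__main__":
    print(demo())
```

With pymysql, note that the default cursor buffers the full result set on the client, so batching only saves memory there if an unbuffered cursor such as `pymysql.cursors.SSCursor` is used.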

10 Comments

Just a note: the sqlalchemy and pymysql imports are not needed. Neither is the explicit connect command. Just df = pd.read_sql(sql=query, con=engine) will do the job. Additionally, the user might need to add the database port to the connection string.
I have no username or password, as I normally just open it via MySQL.
How could I find the database port?
It’s 3306 by default.
Thank you @S3DEV! I ran in my computer with your feedback and you are right! I updated the answer to reflect your inputs.
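On the port question: Win Error 10061 is Windows reporting that the connection was actively refused, which usually means nothing is listening on the host and port being tried. A quick way to check this before involving pymysql is a plain TCP connection attempt. This is a minimal sketch; the function name `port_is_open` is made up for illustration, and 3306 is only the MySQL default mentioned above, not necessarily the port your server uses.

```python
import socket


def port_is_open(host="localhost", port=3306, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    print("MySQL default port reachable:", port_is_open())
```

If this returns False, the MySQL server is probably not running or is listening on a different port, so no connection string will work until that is resolved.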

I am pretty sure something like this would work

import pyodbc
import pandas as pd

conn = pyodbc.connect(r'Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:\Users\Ron\Desktop\testdb.accdb;')

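# run the query; read_sql_query already returns a DataFrame
SQL_Query = pd.read_sql_query(
'''select
product_name,
product_price_per_unit,
units_ordered,
((units_ordered) * (product_price_per_unit)) AS revenue
from tracking_sales''', conn)

# optionally keep only selected columns (column names are elided in the original)
df = pd.DataFrame(SQL_Query, columns=['field1','field2',...])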

You can see a good example on this link: https://datatofish.com/sql-to-pandas-dataframe/

1 Comment

This is for Access and OP is using MySQL; therefore this will not work.
