I am trying to join two tables in Python (on Windows, in a Jupyter notebook).
Table 1 is an Excel file that I read in using pandas.
import pandas as pd
TABLE_1 = pd.read_excel('my_file.xlsx')
Table 2 is a large table in an Oracle database that I can connect to using pyodbc. I can read in the entire table successfully like this, but it takes a very long time to run.
sql = "SELECT * FROM ORACLE.table_2"
cnxn = odbc.connect(##########)
TABLE_2 = pd.read_sql(sql, cnxn)
So I would like to do an inner join as part of the pyodbc import, so that it runs faster and I only pull in the needed records. Table 1 and Table 2 share the same unique identifier/primary key.
sql = "SELECT * FROM ORACLE.TABLE_1 INNER JOIN TABLE_2 ON ORACLE.TABLE1.ID=TABLE_2.ID"
cnxn = odbc.connect(##########)
TABLE_1_2_JOINED = pd.read_sql(sql, cnxn)
But this doesn't work. I get this error:
DatabaseError: Execution failed on sql 'SELECT * FROM ORACLE.TABLE_1
INNER JOIN TABLE_2 ON ORACLE.TABLE1.ID=TABLE_2.ID': ('42S02', '[42S02]
[Oracle][ODBC][Ora]ORA-00942: table or view does not exist\n (942)
(SQLExecDirectW)')
Is there another way I can do this? It seems very inefficient to import an entire table with millions of records when I only need to join a few hundred. Thank you.
Does TABLE_1 also exist in the database?
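If TABLE_1 exists only as a pandas DataFrame and not as a table in Oracle, the database has nothing to join against, which would be consistent with the ORA-00942 error. One possible workaround, offered as a rough sketch rather than a definitive fix: send the few hundred IDs from the Excel file to Oracle in an `IN (...)` filter, pull back only the matching rows, and do the join in pandas. The helper names (`chunked`, `build_in_query`, `fetch_matching_rows`) are hypothetical, and the sketch assumes the key column is called `ID` in both tables. The chunk size of 1000 is there because Oracle has traditionally rejected `IN` lists longer than 1000 expressions (ORA-01795).

```python
def chunked(values, size):
    """Yield successive lists of at most `size` items from `values`."""
    values = list(values)
    for start in range(0, len(values), size):
        yield values[start:start + size]


def build_in_query(table, id_column, n_params):
    """Build a SELECT with n_params '?' placeholders (pyodbc's parameter marker)."""
    placeholders = ", ".join("?" for _ in range(n_params))
    return f"SELECT * FROM {table} WHERE {id_column} IN ({placeholders})"


def fetch_matching_rows(cnxn, ids, table="ORACLE.TABLE_2", id_column="ID",
                        chunk_size=1000):
    """Fetch only the rows whose key is in `ids`, one query per chunk of IDs.

    `table` and `id_column` are assumptions taken from the question; adjust
    them to the real schema.
    """
    # Imported here so the two pure helpers above work without pandas installed.
    import pandas as pd

    frames = []
    for chunk in chunked(ids, chunk_size):
        sql = build_in_query(table, id_column, len(chunk))
        # The IDs are passed as bound parameters, not pasted into the SQL text.
        frames.append(pd.read_sql(sql, cnxn, params=chunk))
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# Hypothetical usage, assuming TABLE_1 and cnxn are defined as in the question:
# ids = TABLE_1["ID"].dropna().unique().tolist()   # plain Python values for pyodbc
# TABLE_2_SUBSET = fetch_matching_rows(cnxn, ids)
# TABLE_1_2_JOINED = TABLE_1.merge(TABLE_2_SUBSET, on="ID", how="inner")
```

With a few hundred IDs this issues a single query. The other common route would be to load the IDs into a temporary table in Oracle and join there, but that needs write permission on the database.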