
I am trying to build a desktop app that generates SQL queries from an Excel file via a pandas DataFrame. I can generate INSERT statements, but the date values come out in Timestamp format and I want to convert them to TO_DATE format. Please suggest a better way to do this. Also, please suggest how to generate a SELECT statement using the same code.

Here is my code:

import pandas as pd  # "from pandas import *" does not define the name "pandas"
table_name = "ADI"
file_name = pd.read_excel('supermarke.xlsx')
def SQL_Insert(SOURCE, TARGET):
    sql_texts = []
    for index, row in SOURCE.iterrows():
        sql_texts.append(
            'INSERT INTO ' + TARGET + ' (' + str(', '.join(SOURCE.columns)) + ')   VALUES ' + str(tuple(row.values))+";")

    return ('\n'.join(sql_texts))
print(SQL_Insert(file_name, table_name))

Here is my result:

INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (1, '3666 21st St', 'San Francisco', 'CA 94114', 'USA', 'Madeira', 8, Timestamp('2018-01-12 00:00:00'));
INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (2, '735 Dolores St', 'San Francisco', 'CA 94119', 'USA', 'Bready Shop', 15, Timestamp('2018-01-12 00:00:00'));
INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (3, '332 Hill St', 'San Francisco', 'California 94114', 'USA', 'Super River', 25, Timestamp('2018-01-12 00:00:00'));
INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (4, '3995 23rd St', 'San Francisco', 'CA 94114', 'USA', "Ben's Shop", 10, Timestamp('2018-01-12 00:00:00'));

I am also trying to add one more feature: if the file is not found, show an error message.


@Chirag, if a cell is empty I get nan in the output, but I cannot insert that row because SQL uses NULL instead of nan. Any help with this?

INSERT INTO ADI (PLAN_ID, DEVICE_ID, PLAN_CONTRACT_DURATION, DEVICE_CONTRACT_DURATION, SALES_CHANNEL, VENDOR_TYPE, EFFECTIVE_DATE, EXPIRATION_DATE, PLAN_NAME, DEVICE, RRP, DEVICE_REPAYMENT, TOTAL_REPAYMENT_CHARGES, TOTAL_CREDIT_CHARGES, URL, EX_VENDOR_TYPE)   VALUES (20637411, 20663271, 1, 1, 'ALL', 'ALL', Timestamp('2018-10-30 00:00:00'), Timestamp('2050-12-31 00:00:00'), 'Unlimited data Home Wireless ($79 Vividwireless)', 'Huawei B315 ', 0, 0, 199, 0, nan, nan);

How can I replace nan with NULL/null?

  • Please share your input data in code-block not via links or image. Commented Nov 17, 2018 at 14:37
  • you should move this question to Code Review Commented Nov 17, 2018 at 15:15
  • @TheCrazyProfessor Why? There is a clear objective stated in this question, it’s not a request for “what could I do better”. Commented Nov 17, 2018 at 18:06
  • @MTCoster "please suggest a better way to do this" This made it a code review Commented Nov 17, 2018 at 19:43

1 Answer

Something like this?

import os
import pandas as pd
import numpy as np
def SQL_Insert(SOURCE, TARGET):
    sql_texts = []
    for index, row in SOURCE.iterrows():
        sql_texts.append(
            'INSERT INTO ' + TARGET + ' (' + str(', '.join(SOURCE.columns)) + ')   VALUES ' + str(tuple(row.values))+";")

    return ('\n'.join(sql_texts))

# check if file exists
if os.path.isfile("demo.xlsx"):
    # reading file
    df = pd.read_excel('demo.xlsx')
    # casting to date as you mentioned
    df["DATE"] = df.DATE.dt.strftime('%Y-%m-%d')
    # replacing NaN with None
    df = df.astype('object').where(pd.notnull(df),None)
    # generating create table statement, in case if you want to use
    print(pd.io.sql.get_schema(df.reset_index(), 'table_name'))
    # calling your function
    q = SQL_Insert(df, "table_name")
    print(q)
else:
    print("File not found")

Output:

CREATE TABLE "table_name" (
"index" INTEGER,
  "ID" INTEGER,
  "Address" TEXT,
  "City" TEXT,
  "Country" TEXT,
  "Supermarket Name" TEXT,
  "Number of Employees" REAL,
  "DATE" TEXT
)
INSERT INTO table_name (ID, Address, City, Country, Supermarket Name, Number of Employees, DATE)   VALUES (1, 'Address 1', 'San Francisco', 'USA', 'Maderia', 8.0, '2018-01-12');
INSERT INTO table_name (ID, Address, City, Country, Supermarket Name, Number of Employees, DATE)   VALUES (2, 'Address 2', 'San Francisco', 'USA', 'Brady Shop', 15.0, '2018-01-12');
INSERT INTO table_name (ID, Address, City, Country, Supermarket Name, Number of Employees, DATE)   VALUES (3, 'Address 3', 'San Francisco', 'USA', 'Super River', 25.0, '2018-01-12');
INSERT INTO table_name (ID, Address, City, Country, Supermarket Name, Number of Employees, DATE)   VALUES (4, 'Address 4', 'San Francisco', 'USA', "Ben's shop", 10.0, '2018-01-12');
INSERT INTO table_name (ID, Address, City, Country, Supermarket Name, Number of Employees, DATE)   VALUES (5, None, 'San Francisco', None, "Ben's shop", None, 'NaT');
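
Note that the output above still contains Python reprs (None, 'NaT', and "Ben's shop" in double quotes) because str(tuple(...)) is not SQL-aware. As a minimal sketch of one way around that, each cell can be rendered on its own. The helper names sql_literal and sql_insert are hypothetical, not part of pandas, and the sketch assumes the date column has already been converted to strings as in the code above and that the target database accepts double-quoted column names:

```python
import pandas as pd


def sql_literal(value):
    """Render one cell as a SQL literal (hypothetical helper, not a pandas API)."""
    if pd.isna(value):  # True for None, NaN and NaT
        return "NULL"
    if isinstance(value, str):
        # SQL escapes a single quote inside a string by doubling it
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def sql_insert(source, target):
    """Build one INSERT statement per row of the DataFrame."""
    # Assumption: double-quoted identifiers, needed for names with spaces
    columns = ", ".join('"' + str(c) + '"' for c in source.columns)
    statements = []
    for row in source.itertuples(index=False, name=None):
        values = ", ".join(sql_literal(v) for v in row)
        statements.append(f"INSERT INTO {target} ({columns}) VALUES ({values});")
    return "\n".join(statements)


# Made-up two-row frame, only to show the shape of the result
df = pd.DataFrame({"ID": [1, 2], "Supermarket Name": ["Ben's shop", None]})
print(sql_insert(df, "table_name"))
# INSERT INTO table_name ("ID", "Supermarket Name") VALUES (1, 'Ben''s shop');
# INSERT INTO table_name ("ID", "Supermarket Name") VALUES (2, NULL);
```

If the statements are executed from Python rather than saved to a file, parameterized queries through the database driver are usually a safer choice than building strings by hand.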

3 Comments

Hi Chirag, regarding the date conversion in your answer, df["DATE"] = df.DATE.dt.date: I don't want to specify a particular column name in the DataFrame. The date column is detected automatically by pandas; I am asking how to convert it to TO_DATE format.
@adityasingh Can you check now for the None?
Thanks for your response. Can you please also suggest a way to replace the Timestamp format with TO_DATE format? I want the date to look something like this: TO_DATE('11/30/2018 05:53:13', 'MM/DD/YYYY HH24:MI:SS'). I am saving the result in a file; is it possible to replace the date format within the output file?
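
For the TO_DATE request in the comments, one possible approach is to check each value's type while building the statement, so no date column has to be named in advance. This is only a sketch: render_value and insert_with_to_date are hypothetical helper names, the format mask is the one quoted in the comment, and the sample frame is made up. It has not been run against a real Oracle database.

```python
import pandas as pd

ORACLE_MASK = "MM/DD/YYYY HH24:MI:SS"  # format model quoted in the comment
PYTHON_MASK = "%m/%d/%Y %H:%M:%S"      # strftime pattern producing that layout


def render_value(value):
    """Hypothetical helper: TO_DATE(...) for timestamps, plain literals otherwise."""
    if pd.isna(value):  # None, NaN and NaT all become NULL
        return "NULL"
    if isinstance(value, pd.Timestamp):
        return f"TO_DATE('{value.strftime(PYTHON_MASK)}', '{ORACLE_MASK}')"
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def insert_with_to_date(source, target):
    """Build INSERT statements, rendering any Timestamp cell as TO_DATE(...)."""
    columns = ", ".join(str(c) for c in source.columns)
    lines = []
    # itertuples yields pandas Timestamp objects for datetime columns
    for row in source.itertuples(index=False, name=None):
        values = ", ".join(render_value(v) for v in row)
        lines.append(f"INSERT INTO {target} ({columns}) VALUES ({values});")
    return "\n".join(lines)


df = pd.DataFrame({"ID": [1], "DATE": pd.to_datetime(["2018-11-30 05:53:13"])})
print(insert_with_to_date(df, "ADI"))
# INSERT INTO ADI (ID, DATE) VALUES (1, TO_DATE('11/30/2018 05:53:13', 'MM/DD/YYYY HH24:MI:SS'));
```

Because the conversion happens while the text is generated, the file written from this string already contains TO_DATE calls and needs no post-processing.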
