1

current dataframe

I have a data frame that looks like the image above. What I want to do is loop through the SQL statements under SQL_SCRIPT, execute them, and store the results in the next column over which would be called 'RESULTS'. When I just try and execute it (without storing it anywhere) it runs fine, but when I try and store the results in a new dataframe column it errors out with:

ValueError: cannot set a row with mismatched columns

Here is the code:

def run_tests(self):
    s = self.connection()
    df = self.retrieve_sql()
    df_type = df.loc[df['STEP_TYPE'] == 'T']
    df_to_list = df_type[['TABLE_NM', 'TEST_TABLE_NM', 'SQL_SCRIPT']]
    print(df_to_list)
    for sql_script in df_to_list['SQL_SCRIPT']:
        df_to_list.loc['RESULTS'] = pd.read_sql(sql_script,s)
    print(df_to_list)

Instead of read_sql I have also tried just using the session execute, which also works but I'm not sure how to store the results to the dataframe going that route:

def run_tests(self):
    s = self.connection()
    df = self.retrieve_sql()
    df_type = df.loc[df['STEP_TYPE'] == 'T']
    df_to_list = df_type[['TABLE_NM', 'TEST_TABLE_NM', 'SQL_SCRIPT']]
    print(df_to_list)
    for sql_script in df_to_list['SQL_SCRIPT']:
        s.execute(sql_script)

Here is the connection function, if needed:

def connection(self):
    con = self.load_json_file()
    cfg_dsn = con['config']['dsn']
    cfg_usr = con['config']['username']
    cfg_pwd = con['config']['password']

    udaExec = teradata.UdaExec(appName="DataAnalysis", version="1.0", logConsole=False)
    session = udaExec.connect(method="odbc", dsn=cfg_dsn, username=cfg_usr, password=cfg_pwd)

    return session

1 Answer 1

2

Consider running Series.apply on the column of SQL strings.

def run_tests(self):
    s = self.connection()
    c = s.cursor()              # OPEN CURSOR
    df = self.retrieve_sql()

    df_type = df.loc[df['STEP_TYPE'] == 'T']
    df_to_list = df_type[['TABLE_NM', 'TEST_TABLE_NM', 'SQL_SCRIPT']]
    print(df_to_list)

    # NEW METHOD TO RUN QUERY
    def sql_run(x):   
        c.execute(x)
        if c.rowcount > 0:
           res = c.fetchone()[0]
        else:
           res = np.nan
        return res

    df_to_list['RESULTS'] = df_to_list['SQL_SCRIPT'].apply(sql_run)
    print(df_to_list)
Sign up to request clarification or add additional context in comments.

3 Comments

Thank you @Parfait. This is returning 'NoneType' object is not subscriptable'. I think this may be because some SELECT queries aren't count(*) and they return nothing (not 0) and that is expected. Is there a way to handle this? By putting a default value in the dataframe cell instead? I am not sure. Sorry, I didnt include that in the original question but I think that may be the issue?
Add a handler in apply function. See update with if checking cursor row count.
Thank you Parfait! That worked perfectly. Very, very much appreciated.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.