I have two lists : one contains the column names of categorical variables and the other numeric as shown below.
cat_cols = ['stat','zip','turned_off','turned_on']
num_cols = ['acu_m1','acu_cnt_m1','acu_cnt_m2','acu_wifi_m2']
These are the columns names in a table in Redshift.
I want to pass these as a parameter to pull only numeric columns from a table in Redshift(PostgreSql),write that into a csv and close the csv.
Next I want to pull only cat_cols and open the csv and then append to it and close it.
my query so far:
#1.Pull num data:
seg = ['seg1','seg2']
sql_data = str(""" SELECT {num_cols} """ + """FROM public.""" + str(seg) + """ order by random() limit 50000 ;""")
df_data = pd.read_sql(sql_data, cnxn)
# Write to csv.
df_data.to_csv("df_sample.csv",index = False)
#2.Pull cat data:
sql_data = str(""" SELECT {cat_cols} """ + """FROM public.""" + str(seg) + """ order by random() limit 50000 ;""")
df_data = pd.read_sql(sql_data, cnxn)
# Append to df_seg.csv and close the connection to csv.
with open("df_sample.csv",'rw'):
## Append to the csv ##
This is the first time I am trying to do selective querying based on python lists and hence stuck on how to pass the list as column names to select from table.
Can someone please help me with this?