3

I am trying to combine the values of different columns in a single data frame's row into a string, separated by a comma, so that I can create a custom SQL insert string to be executed on a MySQL database. I have 67 different columns, and I am trying to prevent writing code that addresses each column's name individually, mainly to maximize reusability of the code for different size dataframes. I could potentially have anywhere from 1 to 2000 rows to iterate through, with each row having an INSERT query.

For example, if my DataFrame includes the following:

RecDate       WindDir       WindSpeed       OutdoorTemperature       OutdoorHumidity
20160321      121           3               67.5                     43.8
20160322      87            5               73.1                     53.2
20160323      90            2               71.1                     51.7
20160324      103           7               68.3                     47.0

I am wanting to create a string for each row in the dataframe: INSERT INTO tablename VALUES (20160321, 121, 3, 67.5, 43.8) INSERT INTO tablename VALUES (20160322, 87, 5, 73.1, 53.2) INSERT INTO tablename VALUES (20160323, 90, 2, 71.1, 51.7) INSERT INTO tablename VALUES (20160324, 103, 7, 68.3, 47.0)

I have considered using the dataframe's to_sql() function, but was not able to get the code to work with my database structure.

So, my goal was to iterrate through each row, and manually creating the string in the parentheses, separated by a comma:

for index, row in df.iterrows():
   print('INSERT INTO tablename VALUES (%s, %s, %s, %s, %s)' % (row['RecDate'], row['WindDir'], row['WindSpeed'], row['OutdoorTemperature'], row['OutdoorHumidity']))

To make my code "pythonic" and not as rigid, I tried to iterrate through each row, adding a comma between each column index:

for index, row in df.iterrows():
    string = ''

    for x in range(len(row)):
        string += '%s, ' % row[x]

    print('INSERT INTO tablename VALUES (%s)' % string)

I am routinely getting index errors and out of bound errors with the above code, and am not really sure what the correct route to go is. I'd appreciate an inspection of my code and thought process, and any recommendations on how I can improve the code. My goal is to be as efficient as possible, minimize the amount of code I have to write (especially when there's 67 columns!), but still make the code flexible for various uses, especially if the number of columns were to ever change.

Thank you!

2 Answers 2

2

please try below code

def cq_processor(x):
    return 'INSERT INTO tablename VALUES ({})'.format(', '.join(x.tolist()))

df.apply(cq_processor, axis=1)
Sign up to request clarification or add additional context in comments.

2 Comments

That code works. Thank you! I was wondering if it's possible to not have to include 67 sets of {} and a reference to each column name? Maybe somehow loop through the column numbers, adding a comma and the value in that column until the end of the dataframe row is reached?
@AllenH Updated my solution. Please check and try with it.
0

You are getting errors because rows does not support numeric indexing.

In other words, calling rows[1] is not correct. You must call rows['column-name'] instead.

iterrows() does not return a traditional list - it returns a generator of an integer and a Series object. From the source, the function is defined as follows:

columns = self.columns
for k, v in zip(self.index, self.values):
    s = Series(v, index=columns, name=k)
    yield k, s

If you know your pandas, you'll see that the index=columns bit tells the series to accept only the column names as valid indices. When this argument is not specified, then and only then does Series default to allowing integer-based indexing.

tl;dr Do your first approach. It is the correct way to index in this particular Series object. Consider using .format() instead to really make it more Pythonic.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.