How to insert a pandas dataframe having a single csv column into MySQL Database

Question

I have a pandas DataFrame that I read from a Google Sheet. I then added the tag column using:

df['tag'] = df.filter(like = 'Subject', axis = 1).apply(lambda x: np.where(x == 'Y', x.name,'')).values.tolist()
df['tag'] = df['tag'].apply(lambda x: [i for i in x if i!= ''])

Resultant sample DataFrame:

    Id  Name    Subject-A   Subject-B   Total   tag
0   1   A       Y                       100     [Subject-A]
1   2   B                   Y           98      [Subject-B]
2   3   C       Y           Y           191     [Subject-A, Subject-B]
3   4   D                   Y           100     [Subject-B]
4   5   E                   Y           95      [Subject-B]

I then convert the tag column into a comma-separated string and export the DataFrame to a MySQL database:

df['tag'] = df['tag'].map(lambda x : ', '.join(str(i) for i in x)).str.replace('Subject-','')
df

    Id  Name    Subject-A   Subject-B   Total   tag
0   1   A       Y                       100     A
1   2   B                   Y           98      B
2   3   C       Y           Y           191     A, B
3   4   D                   Y           100     B
4   5   E                   Y           95      B

df.to_sql(name = 'table_name', con = conn, if_exists = 'replace', index = False)

But in the MySQL database the tag column is:

A,
,B
A,B
,B
,B

My actual data has many such "Subject" columns, so the result looks like:

, , , D
A, ,C,
...
...

Could someone explain why the output looks as expected in pandas but different once I save the DataFrame to Cloud SQL? The expected output in the MySQL database is the same as the tag column shown in pandas.
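
A minimal sketch of one possible cause, assuming the blank cells read from the sheet arrive as empty or whitespace-only strings and so survive into the joined string. `join_tags` is a hypothetical helper, not part of pandas; it drops blank entries before joining, so no stray separators are left over.

```python
def join_tags(values, prefix="Subject-", sep=", "):
    """Join the non-blank entries of `values`, dropping `prefix` from each.

    Assumption: real tags are strings; anything else (None, NaN) is skipped.
    """
    # Strip surrounding whitespace so that ' ' is treated like ''.
    cleaned = [v.strip() for v in values if isinstance(v, str)]
    # Keep only entries that still have content after stripping.
    return sep.join(v.replace(prefix, "") for v in cleaned if v)


# Joining without filtering keeps a separator for every blank entry.
unfiltered = ", ".join(["Subject-A", ""]).replace("Subject-", "")
print(repr(unfiltered))

# Filtering blanks first leaves only the real tags.
print(repr(join_tags(["Subject-A", ""])))
print(repr(join_tags(["", " ", "Subject-D"])))
```

Printing with `repr` makes trailing separators and spaces visible, which the default DataFrame display can hide.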

It seems that, for some reason, the line df['tag'] = df['tag'].apply(lambda x: [i for i in x if i!= '']) was removed in your solution.
– jezrael, Commented Nov 2, 2021 at 8:12
But thanks for your answer; I can use it when there's a data problem.
– Karthik S, Commented Nov 2, 2021 at 8:26

jezrael · Accepted Answer · 2021-11-02 08:09:18Z

Here is an alternative solution; this looks like a data-related problem.

First, filter the Subject columns and remove the Subject- prefix from their names. Then use DataFrame.dot with the column names plus a separator, and finally strip the separator from the right side:

df1 = df.filter(like = 'Subject').rename(columns=lambda x: x.replace('Subject-',''))
print (df1)
     A    B
0    Y  NaN
1  NaN    Y
2    Y    Y
3  NaN    Y
4  NaN    Y

df['tag'] = df1.eq('Y').dot(df1.columns  + ', ').str.rstrip(', ')
print (df)
   Id Name Subject-A Subject-B  Total   tag
0   1    A         Y       NaN    100     A
1   2    B       NaN         Y     98     B
2   3    C         Y         Y    191  A, B
3   4    D       NaN         Y    100     B
4   5    E       NaN         Y     95     B
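
A self-contained sketch for checking what actually lands in the database. It rebuilds the sample data by hand, builds the tag column with a row-wise join (a plain alternative to the DataFrame.dot approach above, not the same code), and writes the result to an in-memory SQLite database. SQLite is only a stand-in here so the example runs without a server; the MySQL connection and the table name `table_name` are assumptions carried over from the question.

```python
import sqlite3

import pandas as pd

# Sample data rebuilt by hand; None marks a blank cell (an assumption).
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Name": ["A", "B", "C", "D", "E"],
    "Subject-A": ["Y", None, "Y", None, None],
    "Subject-B": [None, "Y", "Y", "Y", "Y"],
    "Total": [100, 98, 191, 100, 95],
})

# Boolean frame: True where a Subject column holds 'Y'.
flags = df.filter(like="Subject").eq("Y")
flags.columns = [c.replace("Subject-", "") for c in flags.columns]

# For each row, join the names of the columns that are flagged.
df["tag"] = flags.apply(
    lambda row: ", ".join(name for name, is_set in row.items() if is_set),
    axis=1,
)

# Write to an in-memory SQLite database and read the column back.
conn = sqlite3.connect(":memory:")
df.to_sql(name="table_name", con=conn, if_exists="replace", index=False)
stored = pd.read_sql_query("SELECT tag FROM table_name ORDER BY Id", conn)
conn.close()

print(stored["tag"].tolist())  # ['A', 'B', 'A, B', 'B', 'B']
```

If the values read back differ from `df["tag"]`, the difference comes from the write or the database; if they match, the stray separators were most likely already present in the DataFrame that was written.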
