Explode multiple uneven rows in Pandas

Question

I have two columns, phones and emails, that need to be exploded into rows. I have figured out how to do this for either one, but not both simultaneously. The biggest problem is that a customer may have zero to many phones and zero to many emails. So, if a customer has three emails and no phones, I need 3 rows. If they have four phones and three emails, I need 4 rows: one for each phone, with the three emails spread across those four rows. Example data:

| many columns | phones | emails |
|:-------------|:------:|:-------|
| row 1        | A,B,C  | A,B    |
| row 2        |        | D,E,F  |

Example Results:

| many columns | phones | emails |
|:-------------|:------:|:-------|
| row 1        | A      | A      |
| row 1        | B      | B      |
| row 1        | C      |        |
| row 2        |        | D      |
| row 2        |        | E      |
| row 2        |        | F      |

# Convert cell contents into lists rather than strings
df0['phones'] = df0['phones'].str.split(";", expand=False)
df0['emails'] = df0['emails'].str.split(",", expand=False)
df0 = df0.apply(pd.Series.explode) # DOES NOT WORK

When I try the above code, I get the error: ValueError: cannot reindex on an axis with duplicate labels
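
A likely cause, sketched under the assumption that the cells are already lists: each column explodes to a different length, so the resulting Series carry different, duplicated index labels and pandas cannot line them up again. The small frame below is made up for illustration and is not the asker's data.

```python
import pandas as pd

# Hypothetical data mirroring the question: uneven list lengths per row
df0 = pd.DataFrame({
    "phones": [["A", "B", "C"], []],
    "emails": [["A", "B"], ["D", "E", "F"]],
})

# Exploded separately, the columns end up with different duplicated indexes
phones = df0["phones"].explode()
emails = df0["emails"].explode()
print(list(phones.index))  # [0, 0, 0, 1]
print(list(emails.index))  # [0, 0, 1, 1, 1]

# Recombining them column by column is what fails
try:
    df0.apply(pd.Series.explode)
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)
```

When every row has the same number of items in both columns, the two indexes match and the same call goes through, which is why the problem only shows up with uneven data.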

The following link will be helpful for you: stackoverflow.com/questions/12680754/…
– Berlin Benilo, Commented Oct 12, 2022 at 15:41

Code Different · Accepted Answer · 2022-10-12 15:48:10Z

I assume the index on your original dataframe is unique. If not, run df = df.reset_index() before the following snippet:

columns = ["phones", "emails"]

# Explode each column individually, but instead of using `explode`, we will
# use `stack` to give us a second index level
exploded = [
    df[col].str.split(",", expand=True).stack().rename(col)
    for col in columns
]

# Align the exploded columns
exploded = pd.concat(exploded, axis=1).droplevel(-1)

# Merge it with the original data frame
result = pd.concat([df.drop(columns=columns), exploded], axis=1)


Rachel S Over a year ago

Worked amazingly, except for the last line. To get it to join on the index, I had to change it to df0 = df0.drop(columns=columns) followed by result = df0.join(exploded). Thank you soooo much!
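
Putting the answer and the fix from the comment together, a minimal end-to-end sketch might look like the following. The sample frame and the customer column are made up to mirror the question's table. The extra .dropna() and .sort_index() calls are defensive additions that are not in the original answer; they are meant to keep the result the same regardless of whether stack drops missing values or how concat orders the combined index.

```python
import pandas as pd

# Hypothetical data mirroring the question; "customer" stands in for
# the "many columns" that should simply be repeated
df = pd.DataFrame({
    "customer": ["row 1", "row 2"],
    "phones": ["A,B,C", None],
    "emails": ["A,B", "D,E,F"],
})
columns = ["phones", "emails"]

# Split each column and stack it, so every item gets a two-level index:
# (original row label, position within the cell)
exploded = [
    df[col].str.split(",", expand=True).stack().dropna().rename(col)
    for col in columns
]

# Align the columns on (row label, position), then drop the position level
exploded = pd.concat(exploded, axis=1).sort_index().droplevel(-1)

# Join on the index, as suggested in the comment, instead of pd.concat
result = df.drop(columns=columns).join(exploded).reset_index(drop=True)
print(result)
```

Because join is a left join by default, a customer with neither phones nor emails should still keep a single row, with both columns missing.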

Soudipta Dutta · Accepted Answer · 2024-04-18 23:10:27Z

import pandas as pd

df = pd.DataFrame({
    "x": [1, 3, 7],
    "y": ["A", "B", "C"],
    "z": ["p1,p2,p3", "p4", "p5,p6"],
    "package_code": ["111,222,333", "444", "555,666"],
})


print(df)
"""
   x  y         z package_code
0  1  A  p1,p2,p3  111,222,333
1  3  B        p4          444
2  7  C     p5,p6      555,666

"""
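# Split both columns into lists, then explode them together.
# Note: exploding several columns at once needs pandas >= 1.3, and the
# lists in each row must have matching lengths, as they do in this data.
aa = (
    df.set_index(["x", "y"])
    .apply(lambda col: col.str.split(","))
    .explode(["z", "package_code"])
    .reset_index()
    .reindex(df.columns, axis=1)
)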
print(aa)

"""
   x  y   z package_code
0  1  A  p1          111
1  1  A  p2          222
2  1  A  p3          333
3  3  B  p4          444
4  7  C  p5          555
5  7  C  p6          666
"""
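
The multi-column explode above relies on each row having the same number of items in both columns, which is not the case in the question. One possible workaround, sketched here with made-up data and two hypothetical helpers (to_list and pad_row), is to pad the shorter lists with None before exploding.

```python
import pandas as pd

# Hypothetical data mirroring the question's uneven rows
df = pd.DataFrame({
    "customer": ["row 1", "row 2"],
    "phones": ["A,B,C", None],
    "emails": ["A,B", "D,E,F"],
})
columns = ["phones", "emails"]

def to_list(value):
    """Split a comma-separated cell; missing or empty cells become []."""
    if isinstance(value, str) and value:
        return value.split(",")
    return []

def pad_row(lists):
    """Pad every list in one row with None up to the longest list's length."""
    width = max(len(items) for items in lists)
    return [items + [None] * (width - len(items)) for items in lists]

# One Series of lists per column, then pad row by row
split = [df[col].map(to_list) for col in columns]
padded_rows = [pad_row(row) for row in zip(*split)]

padded = df.copy()
for position, col in enumerate(columns):
    padded[col] = pd.Series(
        [row[position] for row in padded_rows], index=df.index
    )

# All lists in a row now have equal length, so they can explode together
result = padded.explode(columns, ignore_index=True)
print(result)
```

This trades the index alignment of the stack approach for plain Python padding, which may be easier to follow but is likely slower on large frames.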

answered Feb 21, 2023 at 12:10
