split rows in pandas dataframe

Question

I am stuck on how to split the rows of a pandas DataFrame.

I have a DataFrame like the one below, where one column holds several values in a single cell, separated by \r\n:

    Color                              Shape  Price
0  Green  Rectangle\r\nTriangle\r\nOctangle     10
1   Blue              Rectangle\r\nTriangle     15

I need to split each such cell into separate rows, repeating the values of the other columns in every new row, e.g.

   Color      Shape  Price
0  Green  Rectangle     10
1  Green   Triangle     10
2  Green   Octangle     10
3   Blue  Rectangle     15
4   Blue   Triangle     15

How do I do it well?

Try df.Shape.str.split(expand=True).stack(). Does that help?
– John Zwinck, Commented Oct 23, 2019 at 12:55
@anky_91 explode() was added in version 0.25, is there any other way to solve in older versions?
– vb_rises, Commented Oct 23, 2019 at 12:59
@vb_rises : check : stackoverflow.com/questions/53218931/…
– anky, Commented Oct 23, 2019 at 13:00

Sociopath · Accepted Answer · 2019-10-23 13:00:10Z

17

You can do:

df["Shape"]=df["Shape"].str.split("\r\n")
print(df.explode("Shape").reset_index(drop=True))

Output:

   Color      Shape  Price
0  Green  Rectangle     10
1  Green   Triangle     10
2  Green   Octangle     10
3   Blue  Rectangle     15
4   Blue   Triangle     15
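
For reference, here is a self-contained sketch of the same split-then-explode approach that first rebuilds the sample frame from the question. It assumes pandas 0.25 or later, and the name `result` is only illustrative.

```python
import pandas as pd

# Sample data from the question: several shapes share one cell, separated by \r\n.
df = pd.DataFrame({
    "Color": ["Green", "Blue"],
    "Shape": ["Rectangle\r\nTriangle\r\nOctangle", "Rectangle\r\nTriangle"],
    "Price": [10, 15],
})

# Turn each cell into a list of shapes, then give every list element its own row.
result = (
    df.assign(Shape=df["Shape"].str.split("\r\n"))
      .explode("Shape")
      .reset_index(drop=True)
)
print(result)
```

Using assign instead of overwriting df["Shape"] leaves the original frame unchanged, which helps if the unsplit text is still needed later.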

answered Oct 23, 2019 at 13:00

3 Comments

MBA Coder Over a year ago

I tried this with a sample df and I got AttributeError: 'DataFrame' object has no attribute 'explode'. Do you have some library you imported that allows you to do that?

Sociopath Over a year ago

I think you are using a different pandas version. explode was introduced in 0.25.

MBA Coder Over a year ago

Thanks Akshay, I have 0.24, I will have to update pandas
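
If upgrading is not an option, the same result can be approximated without explode by repeating rows with NumPy. This is a rough sketch, not part of the answer above: the helper name `split_rows_without_explode` is hypothetical, and it assumes the column contains no missing values.

```python
import numpy as np
import pandas as pd

def split_rows_without_explode(df, column, sep):
    """Hypothetical helper: one output row per separated value in `column`."""
    parts = df[column].str.split(sep)      # Series of lists
    counts = parts.str.len().values        # number of values in each original row
    # Repeat each original row once per value, selecting by position.
    out = df.iloc[np.repeat(np.arange(len(df)), counts)].copy()
    # Flatten the lists so they line up with the repeated rows.
    out[column] = np.concatenate(parts.values)
    return out.reset_index(drop=True)

df = pd.DataFrame({
    "Color": ["Green", "Blue"],
    "Shape": ["Rectangle\r\nTriangle\r\nOctangle", "Rectangle\r\nTriangle"],
    "Price": [10, 15],
})
print(split_rows_without_explode(df, "Shape", "\r\n"))
```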

MBA Coder · Accepted Answer · 2019-10-23 13:46:56Z

4

This might not be the most efficient way to do it but I can confirm that it works with the sample df:

import pandas as pd

data = [['Green', 'Rectangle\r\nTriangle\r\nOctangle', 10], ['Blue', 'Rectangle\r\nTriangle', 15]]
df = pd.DataFrame(data, columns=['Color', 'Shape', 'Price'])

# DataFrame.append was removed in pandas 2.0, so collect the new rows in a
# list and build the result frame once at the end.
rows = []
for index, row in df.iterrows():
    # One new row per shape, copying Color and Price from the original row.
    for shape in row['Shape'].split('\r\n'):
        rows.append({'Color': row['Color'], 'Shape': shape, 'Price': row['Price']})

new_df = pd.DataFrame(rows, columns=['Color', 'Shape', 'Price'])
print(new_df)

Output:

   Color      Shape  Price
0  Green  Rectangle     10
1  Green   Triangle     10
2  Green   Octangle     10
3   Blue  Rectangle     15
4   Blue   Triangle     15

edited Oct 23, 2019 at 13:46

answered Oct 23, 2019 at 13:08

Darren Christopher · Accepted Answer · 2019-10-23 13:00:12Z

3

First, split the Shape column on whitespace, which gives you a list of shapes for each row (str.split() with no argument splits on any whitespace, including \r\n). Then use df.explode to unpack each list and create a new row for every shape.

df["Shape"] = df.Shape.str.split()
df.explode("Shape")
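
One caveat is worth illustrating: splitting on any whitespace only matches splitting on "\r\n" when the values themselves contain no spaces. The small sketch below uses a made-up value, "Right Triangle", that is not in the question's data.

```python
import pandas as pd

# Hypothetical cell whose first shape name contains a space.
s = pd.Series(["Right Triangle\r\nOctangle"])

# No argument: split on any run of whitespace, so the name is broken apart.
by_whitespace = s.str.split().tolist()

# Explicit separator: split only where the \r\n sequence occurs.
by_crlf = s.str.split("\r\n").tolist()

print(by_whitespace)  # [['Right', 'Triangle', 'Octangle']]
print(by_crlf)        # [['Right Triangle', 'Octangle']]
```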

answered Oct 23, 2019 at 13:00

Quang Hoang · Accepted Answer · 2019-10-23 13:17:29Z

2

As commented, str.split() followed by explode does the job. If you are on a pandas version older than 0.25, you can split with expand=True and then use melt instead:

(pd.concat( (df.Shape.str.split('\r\n', expand=True), 
            df[['Color','Price']]),
          axis=1)
   .melt(id_vars=['Color', 'Price'], value_name='Shape')
   .dropna()
)

Output:

   Color  Price variable      Shape
0  Green     10        0  Rectangle
1   Blue     15        0  Rectangle
2  Green     10        1   Triangle
3   Blue     15        1   Triangle
4  Green     10        2   Octangle

edited Oct 23, 2019 at 13:17

answered Oct 23, 2019 at 13:04
