pandas add rows based on column value

Question

Assuming I have this dataframe:

    X     Y
0   a     1
1   b    10
2   c    11
3   d   100
4   e   101
5   f   110
6   g   111

I would like to decompose column Y into rows so that each number containing more than one digit 1 is split into several numbers that each contain exactly one digit 1. For example, the number 111 is split into 3 rows with the values 100, 10, and 1, and each new row keeps the values of the other columns. Here is the output I expect:

    X     Y
0   a     1
1   b    10
2   c    10
3   c     1
4   d   100
5   e   100
6   e     1
7   f   100
8   f    10
9   g   100
10  g    10
11  g     1

Here is what I have done so far (on a longer version of the dataframe), but I wonder if there is a more Pythonic way to do it. I appreciate your help in advance.

import pandas as pd

df = pd.DataFrame({ 'X':['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o'], 'Y':[1,10,11,100,101,110,111,1000,1001,1010,1011,1100,1101,1110,1111] })
print(df)

for i in range(len(df)):
    value  = int(df.at[i,'Y'])
    digits = len(str(value))
    opts   = sum(map(int, str(value)))
    
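    if opts > 1:
        # copy the row so that edits do not touch df itself
        temp = df.loc[[i]].copy()
        # assign first value: the highest power of ten
        temp.at[i,'Y'] = 10**(digits-1)
        
        # update row (DataFrame.append was removed in pandas 2.0, so use pd.concat)
        df   = df.drop([i])
        df   = pd.concat([df, temp])

        # append new rows, one per remaining digit 1
        while value != 1:
            value  = int(str(value)[1:])
            if value == 0: break
            digits = len(str(value))
            temp.at[i,'Y'] = 10**(digits-1)
            df     = pd.concat([df, temp])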
            
df = df.sort_index()
df = df.reset_index(drop=True)
print(df)

What about 117? Should it become 100, 10, 7? Or 100, 10, and 1 seven times? Or 17 as 10 and 1 seven times?
– Joe Ferndz, Commented Jan 22, 2021 at 23:23
Once you have a decompose function that takes a number and returns its powers of 10 as a list, you can simply explode to get what you need. Check out the solution I proposed below.
– Akshay Sehgal, Commented Jan 23, 2021 at 0:03

Akshay Sehgal · Accepted Answer · 2021-01-23 00:26:41Z

2

Try applying a decompose function like this and then simply explode.

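The decompose function takes 111 and returns [100, 10, 1], or takes 110 and returns [100, 10]. It enumerates the digits of the number as a string in reverse, so that the digit at position i corresponds to 10^i, keeps 10^i for every nonzero digit, and then reverses the list so that the largest power comes first.

# powers of ten for each nonzero digit, largest first
decompose = lambda x: [10**i for i,j in enumerate(str(x)[::-1]) if int(j)!=0][::-1]

df['Y'] = df['Y'].apply(decompose)
# one row per list element; ignore_index=True renumbers the rows (pandas >= 1.1)
out = df.explode('Y', ignore_index=True)
print(out)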
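
As a self-contained sketch of the same apply-then-explode approach, the snippet below runs on the seven-row dataframe from the question. The function name `decompose_powers` is chosen here for illustration and is not part of pandas. It assumes Y holds non-negative integers and returns the powers largest-first so the rows come out in the order the question expects.

```python
import pandas as pd

def decompose_powers(n):
    """Return 10**k for every nonzero digit of n, largest power first."""
    digits = str(n)
    top = len(digits) - 1  # exponent of the leading digit
    return [10 ** (top - pos) for pos, d in enumerate(digits) if d != "0"]

df = pd.DataFrame({"X": list("abcdefg"), "Y": [1, 10, 11, 100, 101, 110, 111]})

out = (
    df.assign(Y=df["Y"].apply(decompose_powers))  # each Y becomes a list
      .explode("Y", ignore_index=True)            # one row per list element
      .astype({"Y": int})                         # explode leaves object dtype
)
print(out)
```

The final `astype` is there because `explode` returns an object-dtype column; drop it if the dtype does not matter downstream.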

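EDIT: This is only for the specific condition mentioned by OP, where the number can only be made of zeros and ones. For the general case, where a digit can be larger than 1, use the lambda function below, which repeats each power of ten as many times as its digit and lists the largest power first (courtesy of @Joe Ferndz):

lambda x: [10**i for i,a in enumerate(str(x)[::-1]) for _ in range(int(a))][::-1]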
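
To make the general case concrete, here is a small sketch of the repeated-powers idea written as a named function. The name `decompose_repeated` is hypothetical, and the input is assumed to be a non-negative integer. Each power of ten appears as many times as its digit, so the pieces always sum back to the original number.

```python
def decompose_repeated(n):
    """Return each power of ten repeated digit-many times, largest first."""
    digits = str(n)
    top = len(digits) - 1  # exponent of the leading digit
    return [10 ** (top - pos) for pos, d in enumerate(digits) for _ in range(int(d))]

print(decompose_repeated(1234))  # [1000, 100, 100, 10, 10, 10, 1, 1, 1, 1]
print(decompose_repeated(110))   # [100, 10]

# Sanity check: the pieces add up to the original number.
assert sum(decompose_repeated(117)) == 117
```

For numbers made only of zeros and ones, this gives the same result as the simpler version, since every digit is repeated at most once.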

edited Jan 23, 2021 at 0:26

answered Jan 23, 2021 at 0:00

3 Comments

Joe Ferndz Over a year ago

Upvoted your answer and deleted mine. By the way, your answer will not work for 1234: it gives just 1000, 100, 10, 1 rather than 1000, 100, 100, 10, 10, 10, 1, 1, 1, 1 as my solution does. You need to iterate through the list once more.

Joe Ferndz Over a year ago

Here's what you need to do: df['Y'] = df['Y'].apply(lambda x: [10**i for i,a in enumerate(str(x)[::-1]) for _ in range(int(a))])

Akshay Sehgal Over a year ago

Yes, I had proposed a more general function earlier for that, but OP mentioned that the number can only be made of zeros and ones.
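
Since the simpler function relies on the zeros-and-ones assumption discussed in these comments, one possible way to check it before choosing between the two functions is sketched below. The sample values are made up for illustration, and `Series.str.fullmatch` requires pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical sample values; 117 and 1234 break the zeros-and-ones assumption.
s = pd.Series([1, 10, 101, 117, 1234])

# True where the decimal representation uses only the digits 0 and 1.
only_binary_digits = s.astype(str).str.fullmatch(r"[01]+")
print(only_binary_digits.tolist())  # [True, True, True, False, False]
```

If any entry is False, the general version that repeats powers by digit value is the safer choice.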
