Creating new columns using the for loop in pandas

Question

I am new to Python, and therefore to pandas DataFrames as well. Let's say I have the following data set:

import pandas as pd

d = {'a': [1, 1, 1, 2, 2, 2, 3, 3, 3], 'b': [4, 4, 4, 5, 5, 5, 6, 6, 6]}
df = pd.DataFrame(data=d)
print(df)
   a  b
0  1  4
1  1  4
2  1  4
3  2  5
4  2  5
5  2  5
6  3  6
7  3  6
8  3  6

What I want to do is create new columns, say b_1, b_2, and b_3, based on the information in columns a and b. Each b_i should hold the value of b where a equals i, and 0 otherwise. The final data should look like this:

   a  b  b_1  b_2  b_3
0  1  4    4    0    0
1  1  4    4    0    0
2  1  4    4    0    0
3  2  5    0    5    0
4  2  5    0    5    0
5  2  5    0    5    0
6  3  6    0    0    6
7  3  6    0    0    6
8  3  6    0    0    6

In Stata, this can be done with the following loop:

forvalues i=1(1)3{
gen b_`i'=b if a==`i'
replace b_`i'=0 if b_`i'==.
}

Is there a similar way to do this in Python? Thanks in advance.

df.join(pd.DataFrame({f'b_{i}': x['b'] for i, x in df.groupby('a')}).fillna(0)) ..?
– Chris Adams, Commented Mar 3, 2021 at 9:31
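
The one-liner in the comment above can be unpacked into separate steps. The following is only a sketch of that idea: the names `pieces`, `wide`, and `result` are made up for illustration, and the final `astype(int)` is an addition that assumes `b` holds integers with no missing values (without it, `fillna(0)` leaves the new columns as floats).

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'b': [4, 4, 4, 5, 5, 5, 6, 6, 6]})

# One Series of b per distinct value of a; each keeps its original row labels.
pieces = {f'b_{key}': group['b'] for key, group in df.groupby('a')}

# Building a DataFrame aligns the Series on the row labels, leaving NaN
# outside each group. Fill those with 0 and convert back to integers
# (assumption: b is integer-valued and has no missing entries).
wide = pd.DataFrame(pieces).fillna(0).astype(int)

# Attach the new columns to the original frame by index.
result = df.join(wide)
print(result)
```

Because the join is done on the index, this approach relies on the row labels of `df` being unique.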

jezrael · Accepted Answer · 2021-03-03 09:35:00Z


Use DataFrame.join with Series.unstack and DataFrame.add_prefix:

# Add a to the index, pivot it into columns (0 where absent), then prefix the new names with b_
df = df.join(df.set_index('a', append=True)['b'].unstack(fill_value=0).add_prefix('b_'))
print(df)
   a  b  b_1  b_2  b_3
0  1  4    4    0    0
1  1  4    4    0    0
2  1  4    4    0    0
3  2  5    0    5    0
4  2  5    0    5    0
5  2  5    0    5    0
6  3  6    0    0    6
7  3  6    0    0    6
8  3  6    0    0    6
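
For readers who want something closer to the Stata loop in the question, an explicit for loop also works. This is a minimal sketch and not part of the original answer; it assumes the new columns should be named after the distinct values of `a`, and it uses `Series.where` to keep `b` where the condition holds and put 0 elsewhere.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'b': [4, 4, 4, 5, 5, 5, 6, 6, 6]})

# Loop over the distinct values of a, like `forvalues i=1(1)3` in Stata.
for i in sorted(df['a'].unique()):
    # Keep b where a == i; use 0 everywhere else.
    df[f'b_{i}'] = df['b'].where(df['a'] == i, 0)

print(df)
```

One difference from the Stata version: if `b` were missing in a row where `a` matches, that value would stay missing here, so a trailing `.fillna(0)` may be needed for data with gaps. For a handful of groups the loop is easy to read; the `unstack` version avoids the Python-level loop when `a` has many distinct values.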

edited Mar 3, 2021 at 9:35

answered Mar 3, 2021 at 9:29
