How to create columns from specific values per row in Python

Question

I have a DataFrame similar to the one below:

df = pd.DataFrame({'DRINKS':['WHISKEY','VODKA','WATER'],
                    'STRONG':[5,5,0],
                    'SOUR':[5,4,0]})

I want to transform it into the one below. For each column, wherever the value is 5, a new column (the name is arbitrary; I called it Cat1) should hold the name of the source column (STRONG). Then the same operation moves on to the next column, and so on until no columns containing a 5 remain. The final outcome should look like this:

df = pd.DataFrame({'DRINKS':['WHISKEY','VODKA','WATER'],
                    'Cat1':["STRONG","STRONG",np.nan],
                    'Cat2':["SOUR",np.nan,np.nan]})

I tried to do it with

df['Cat1']=(df == 5).idxmax(axis=1)

but it gives me only one column name for WHISKEY.

Any help will be appreciated.

jezrael · Accepted Answer · 2022-09-27 05:47:50Z

1

Select all columns except the first with DataFrame.iloc, then use numpy.where to put the column name wherever the value is 5 and NaN elsewhere:

df = df.iloc[:, :1].join(pd.DataFrame(np.where(df.iloc[:, 1:].eq(5),df.columns[1:],np.nan),
                            index=df.index, 
                            columns=[f'Cat{i}' for i,_ in enumerate(df.columns[1:], 1)]))
print (df)
    DRINKS    Cat1  Cat2
0  WHISKEY  STRONG  SOUR
1    VODKA  STRONG   NaN
2    WATER     NaN   NaN

Or overwrite the value columns in place and rename them afterwards:

df.iloc[:, 1:] = np.where(df.iloc[:, 1:].eq(5), df.columns[1:], np.nan)
df = df.rename(columns=dict(zip(df.columns[1:],
                               [f'Cat{i}' for i,_ in enumerate(df.columns[1:], 1)])))
print (df)
    DRINKS    Cat1  Cat2
0  WHISKEY  STRONG  SOUR
1    VODKA  STRONG   NaN
2    WATER     NaN   NaN
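
Both variants keep each name in the position of its source column, so a row where only SOUR is 5 would get NaN in Cat1 and SOUR in Cat2. If the question's "move on to the next column" is instead read as packing the matched names to the left, a minimal sketch could look like the following. The GIN row is hypothetical data added only to show the difference, and this reading of the question is an assumption, not something the asker confirmed.

```python
import pandas as pd

# GIN is a hypothetical extra row: only its SOUR value is 5.
df = pd.DataFrame({'DRINKS': ['WHISKEY', 'VODKA', 'GIN'],
                   'STRONG': [5, 5, 0],
                   'SOUR': [5, 4, 5]})

mask = df.iloc[:, 1:].eq(5).to_numpy()   # True where a value equals 5
names = df.columns[1:].to_numpy()        # candidate column names

# For every row, collect the names of the columns that hold a 5.
rows = [list(names[m]) for m in mask]

# Rows of unequal length are padded with missing values on the right.
packed = pd.DataFrame(rows, index=df.index)
packed.columns = [f'Cat{i}' for i in range(1, packed.shape[1] + 1)]

out = df[['DRINKS']].join(packed)
print(out)
```

With this layout the GIN row gets SOUR in Cat1 and a missing value in Cat2, whereas the positional versions above would give the opposite.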

edited Sep 27, 2022 at 5:47

answered Sep 27, 2022 at 5:16


Khaled Koubaa · Accepted Answer · 2022-09-26 20:24:04Z

0

Try:

df['Cat1'] = np.where(df[df.columns[1]].eq(5), df.columns[1], np.nan)
# or: df['Cat1'] = np.where(df["STRONG"].eq(5), "STRONG", np.nan)
df['Cat2'] = np.where(df[df.columns[2]].eq(5), df.columns[2], np.nan)
# or: df['Cat2'] = np.where(df["SOUR"].eq(5), "SOUR", np.nan)

    DRINKS  STRONG  SOUR    Cat1    Cat2
0   WHISKEY 5       5       STRONG  SOUR
1   VODKA   5       4       STRONG  NaN
2   WATER   0       0       NaN     NaN

df = df.drop(columns=['STRONG', 'SOUR'])

    DRINKS  Cat1    Cat2
0   WHISKEY STRONG  SOUR
1   VODKA   STRONG  NaN
2   WATER   NaN     NaN
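
The two assignments above can be generated in a loop so that Cat1, Cat2, and so on are created for however many value columns exist. This is a sketch, assuming every column after the first should be checked for the value 5. It uses Series.where instead of np.where because, depending on the NumPy version, mixing a plain string with np.nan in np.where can yield a string array in which the missing entries are the text 'nan' rather than a real NaN.

```python
import pandas as pd

df = pd.DataFrame({'DRINKS': ['WHISKEY', 'VODKA', 'WATER'],
                   'STRONG': [5, 5, 0],
                   'SOUR': [5, 4, 0]})

value_cols = df.columns[1:]      # assumption: everything after DRINKS
out = df[['DRINKS']].copy()

for i, col in enumerate(value_cols, 1):
    # Start from a column filled with the source column's name,
    # then keep it only in rows where the value equals 5.
    names = pd.Series(col, index=df.index, dtype=object)
    out[f'Cat{i}'] = names.where(df[col].eq(5))

print(out)
```

Because the result is built in a separate frame, the original STRONG and SOUR columns do not need to be dropped afterwards.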

edited Sep 26, 2022 at 20:24

answered Sep 26, 2022 at 20:18


2 Comments

segababa Over a year ago

Hey, thanks for the answer but i have to create cat1 and cat2 automatically

Khaled Koubaa Over a year ago

can you explain what you mean by automatically

sitting_duck · Accepted Answer · 2022-09-27 04:54:18Z

0

You could map each column value of 5 to the column header. The core part of that would be:

df.iloc[:,1:].apply(lambda x: x.map({5:x.name}))

Which delivers:

   STRONG  SOUR
0  STRONG  SOUR
1  STRONG   NaN
2     NaN   NaN
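
The NaN entries come from how Series.map treats a dict: any value without a matching key is mapped to a missing value. A small sketch on a single column, using the SOUR values from the question:

```python
import pandas as pd

s = pd.Series([5, 4, 0], name='SOUR')

# Only the key 5 is present in the dict, so 4 and 0 have no match
# and come back as missing values.
mapped = s.map({5: s.name})
print(mapped)
```

Inside DataFrame.apply, each column is passed to the lambda as a Series, so x.name is that column's header.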

Then you could put it all together with a column rename:

dfo = (
    pd.concat([df['DRINKS'],df.iloc[:,1:].apply(lambda x: x.map({5:x.name}))
               .rename(columns=lambda x: f"Cat{df.columns.get_loc(x)}")], axis=1)
)

print(dfo)

Result

    DRINKS    Cat1  Cat2
0  WHISKEY  STRONG  SOUR
1    VODKA  STRONG   NaN
2    WATER     NaN   NaN

edited Sep 27, 2022 at 4:54

answered Sep 27, 2022 at 0:45


2 Comments

Rabinzel Over a year ago

you could change the rename part to .rename(columns=lambda x: f"Cat{df.columns.get_loc(x)}") to get the exact result of OP.

sitting_duck Over a year ago

@Rabinzel Good call. That's better. I updated my answer with that.
