I have a DataFrame with Boolean values like this:

black yellow orange
TRUE TRUE TRUE
FALSE TRUE FALSE
TRUE TRUE FALSE
FALSE FALSE TRUE

I want a separate column that summarizes each row by listing the names of the columns whose value is TRUE. The new column would look like this:

summary
black, yellow, orange
yellow
black, yellow
orange

Any idea how to do this please? Thanks!

2 Answers

You can use each row as a selection mask to filter the column names:

(
    df.astype("bool")
    .apply(lambda row: ", ".join(df.columns[row]), axis=1)
    .to_frame("summary")
)
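For reference, here is a minimal self-contained sketch of the same row-mask idea. It assumes the sample data from the question is already stored as real Booleans (not the strings "TRUE"/"FALSE"); the frame construction is only for illustration.

```python
import pandas as pd

# Sample data from the question, assumed to hold real Booleans.
df = pd.DataFrame({
    "black":  [True, False, True, False],
    "yellow": [True, True, True, False],
    "orange": [True, False, False, True],
})

# For each row, use its Boolean values as a mask over the column names,
# then join the selected names into one string.
summary = df.apply(lambda row: ", ".join(df.columns[row.to_numpy()]), axis=1)

print(summary.to_frame("summary"))
```

A row with no True values ends up as an empty string, since joining an empty selection gives "".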

Try this using pd.DataFrame.dot:

df_colors['summary'] = df_colors.dot(df_colors.columns+', ').str.strip(', ')
df_colors

Output:

   black  yellow  orange                summary
0   True    True    True  black, yellow, orange
1  False    True   False                 yellow
2   True    True   False          black, yellow
3  False   False    True                 orange
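Why the dot product works here: in Python, multiplying a string by True keeps it and multiplying by False gives an empty string, so summing the products across a row concatenates the names of the True columns. The sketch below reproduces that arithmetic by hand in plain Python, as an illustration of the mechanism rather than of how pandas implements it; the `summarize` helper is hypothetical.

```python
columns = ["black", "yellow", "orange"]
rows = [
    [True, True, True],
    [False, True, False],
    [True, True, False],
    [False, False, True],
]

def summarize(row):
    # bool * str: True keeps the string, False yields "".
    pieces = [flag * (name + ", ") for flag, name in zip(row, columns)]
    # "Summing" the products is string concatenation; drop the trailing ", ".
    return "".join(pieces).rstrip(", ")

summaries = [summarize(row) for row in rows]
print(summaries)
```

Note that stripping with ', ' removes any trailing commas and spaces, so it is only safe when column names do not themselves end in those characters.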
