Suppose I have this df:

col1  col2  col3  col4
A     B     B     A
B     C     C     D
D     null  D     null

And a list:

list1 = ["A","B","C","D"]

How do I create a new df whose first column holds the values of the list, and where each remaining column contains 1 if that value appears in the corresponding column of the old df and 0 otherwise?

Expected output:

list1 col1 col2 col3 col4
  A    1    0    0    1
  B    1    1    1    0
  C    0    1    1    0
  D    1    0    1    1

2 Answers

Try:

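import pandas as pd

# Start from an all-zero frame indexed by list1, with the same columns as df
res = pd.DataFrame(index=list1, columns=df.columns).fillna(0)

# Pivot the stacked (value, column) pairs and mark each pair that occurs with 1
res.loc[:, :] = df.stack().reset_index().pivot_table(index=0, columns="level_1", aggfunc="count").notna().astype(int).droplevel(0, axis=1)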

Outputs:

>>> res

   col1  col2  col3  col4
A     1     0     0     1
B     1     1     1     0
C     0     1     1     0
D     1     0     1     1
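
If the pivot feels heavy, a membership test per column may be easier to follow. This is a different technique from the one above (it uses `Series.isin`), shown only as a minimal sketch. It assumes the `null` cells in the question are NaN, and it rebuilds the sample frame so that it runs on its own:

```python
import numpy as np
import pandas as pd

# Sample data from the question; "null" is assumed to mean NaN.
df = pd.DataFrame({
    "col1": ["A", "B", "D"],
    "col2": ["B", "C", np.nan],
    "col3": ["B", "C", "D"],
    "col4": ["A", "D", np.nan],
})
list1 = ["A", "B", "C", "D"]

# For each column, check which list values occur in it, then cast True/False to 1/0.
res = pd.DataFrame(
    {col: pd.Series(list1).isin(df[col]).to_numpy() for col in df.columns},
    index=pd.Index(list1, name="list1"),
).astype(int)

print(res)
```

Because the rows are built from `list1` directly, a list value that never appears in `df` should simply get a row of zeros.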

This is essentially crosstab:

df.melt().groupby('value')['variable'].value_counts().unstack(fill_value=0)

Output:

variable  col1  col2  col3  col4
value                           
A            1     0     0     1
B            1     1     1     0
C            0     1     1     0
D            1     0     1     1
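
For comparison, here is a sketch of the same idea written with `pd.crosstab` itself. It assumes the `null` cells are NaN and rebuilds the sample frame so that it runs on its own. The cells are counts, not flags, so the sketch clips them to 1 in case a value repeats within a column, and reindexes on `list1` so the rows follow the list:

```python
import numpy as np
import pandas as pd

# Sample data from the question; "null" is assumed to mean NaN.
df = pd.DataFrame({
    "col1": ["A", "B", "D"],
    "col2": ["B", "C", np.nan],
    "col3": ["B", "C", "D"],
    "col4": ["A", "D", np.nan],
})
list1 = ["A", "B", "C", "D"]

# Long format: "variable" holds the old column name, "value" holds the cell.
long = df.melt().dropna(subset=["value"])

# Count how often each value occurs in each original column.
counts = pd.crosstab(long["value"], long["variable"])

# Turn counts into 0/1 flags and align rows and columns with list1 and df.
flags = counts.clip(upper=1).reindex(index=list1, columns=df.columns, fill_value=0)

print(flags)
```

The `reindex` step also covers list values that are missing from `df`: they show up as all-zero rows.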
