Python Pandas: converting several boolean columns into a (possibly repeated) column made up of the boolean column names

Question

Suppose I have the DataFrame below:

>>> import pandas
>>> dfrm = pandas.DataFrame({
...     "A": [1, 2, 3],
...     "id1": [True, True, False],
...     "id2": [False, True, False],
... })

>>> dfrm
   A    id1    id2
0  1   True  False
1  2   True   True
2  3  False  False

How can I flatten the two Boolean columns into a single new column, repeating rows of the DataFrame where necessary, so that the new column holds the name of every Boolean column that is True for that row?

Specifically, in the example above, I would want the output to look like this:

index A   id1    id2   all_ids
    0 1  True  False       id1
    1 2  True   True       id1
    1 2  True   True       id2
    2 3 False  False       NaN

(preferably not multi-indexed on all_ids but I would take that if it was the only way to do it).

I've commonly seen this operation referred to as "wide to long", and its inverse (going from one column to a set of Boolean columns) as "long to wide".

Is there any built-in support for this in Pandas?

ely · Accepted Answer · 2012-09-26 13:43:21Z

Off-hand I can't recall a function that does this in pandas as a one-liner, but you can do something like this:

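In [34]: from pandas import Series

In [35]: # stack the flag columns into a Series indexed by (row label, column name)
         st = dfrm[['id1', 'id2']].stack()

In [36]: # keep the column name wherever the flag is True, indexed by row label
         all_ids = Series(st.index.get_level_values(1),
                          st.index.get_level_values(0),
                          name='all_ids')[st.values]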

In [37]: dfrm.join(all_ids, how='left')
Out[37]: 
   A    id1    id2 all_ids
0  1   True  False     id1
1  2   True   True     id1
1  2   True   True     id2
2  3  False  False     NaN
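
The same steps can be wrapped in a small helper. This is only a sketch of the stack-and-join idea above: the function name `flags_to_long` and its `name` parameter are made up for illustration and are not part of pandas.

```python
import pandas as pd


def flags_to_long(df, flag_cols, name="all_ids"):
    """Repeat each row once per True flag; rows with no True flag get NaN.

    Hypothetical helper, not a pandas function.
    """
    # Stack the flag columns into a Series indexed by (row label, column name).
    stacked = df[flag_cols].stack()
    # Keep only the entries whose flag is True.
    true_only = stacked[stacked]
    # Turn the column-name level into values, indexed by the original row label.
    labels = pd.Series(
        true_only.index.get_level_values(1),
        index=true_only.index.get_level_values(0),
        name=name,
    )
    # A left join repeats rows that have several True flags and
    # leaves NaN for rows that have none.
    return df.join(labels, how="left")


dfrm = pd.DataFrame({
    "A": [1, 2, 3],
    "id1": [True, True, False],
    "id2": [False, True, False],
})
result = flags_to_long(dfrm, ["id1", "id2"])
print(result)
```

The result keeps the original index, with repeated labels where a row has more than one True flag, rather than a MultiIndex on `all_ids`.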

edited Sep 26, 2012 at 13:43

ely

answered Sep 26, 2012 at 2:48

Chang She

1 Comment

ely Over a year ago

I like this approach, let me test it a bit and get back to you.
