I have an array of flags for various types:

Data Type1 Type2 Type3
12   1     0     0
14   0     1     0
3    0     1     0
45   0     0     1

I want to create the following array:

Data TypeName
12   Type1   
14   Type2   
3    Type2   
45   Type3   

I tried creating an empty array of strings:

import numpy as np
z1 = np.empty(4, np.string_)
z1[np.where(Type1=1)] = 'Type1'

But this doesn't give me the desired result.

Edit: I can use a pandas DataFrame, and each row has exactly one type: Type1, Type2, or Type3.

Edit2: Data, Type1, Type2, and Type3 are column names, as in a pandas DataFrame, but I was using a NumPy array with implicit names, as shown in the example above.

  • Is the input a pandas dataframe? Would it always have exactly one 1 in each row, starting from the second column? Commented Mar 9, 2017 at 19:21
  • Can you show us how we could create such an input array? Commented Mar 9, 2017 at 19:26

3 Answers

Here's an approach that exploits the fact that each row has exactly one 1 among the type columns (Type1 onward): idxmax(1) returns the column label of that single 1 for each row -

pd.concat((df.Data, df.iloc[:,1:].idxmax(1)),axis=1)

Sample run -

In [42]: df
Out[42]: 
   Data  Type1  Type2  Type3
0    12      1      0      0
1    14      0      1      0
2     3      0      1      0
3    45      0      0      1

In [43]: pd.concat((df.Data, df.iloc[:,1:].idxmax(1)),axis=1)
Out[43]: 
   Data      0
0    12  Type1
1    14  Type2
2     3  Type2
3    45  Type3
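
For anyone who wants to reproduce this, here is a minimal, self-contained sketch of the same idxmax idea. The DataFrame construction, the type_cols list, and the one-flag-per-row check are my own additions for illustration; the check matters because idxmax returns the first column holding the row maximum, so a row of all zeros would silently come back as Type1.

```python
import pandas as pd

# Rebuild the sample frame from the question (construction is assumed).
df = pd.DataFrame({
    "Data": [12, 14, 3, 45],
    "Type1": [1, 0, 0, 0],
    "Type2": [0, 1, 1, 0],
    "Type3": [0, 0, 0, 1],
})
type_cols = ["Type1", "Type2", "Type3"]  # illustrative helper name

# idxmax is only meaningful here if every row has exactly one flag set.
assert (df[type_cols].sum(axis=1) == 1).all()

# Label of the column holding the 1 in each row, stored under a readable name.
result = df[["Data"]].assign(TypeName=df[type_cols].idxmax(axis=1))
print(result)
```

Using assign with a keyword also gives the new column a proper name instead of the default 0 that pd.concat produces above.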

1 Comment

I really like this solution. Actually you don't need pd.concat - df.set_index('Data').idxmax(1).reset_index(name='TypeName')

UPDATE: this combines @Divakar's brilliant idea of using the DataFrame.idxmax(1) method with set_index() and reset_index(), which removes the need for pd.concat():

In [142]: df.set_index('Data').idxmax(1).reset_index(name='TypeName')
Out[142]:
   Data TypeName
0    12    Type1
1    14    Type2
2     3    Type2
3    45    Type3

OLD answer:

You can do it this way (Pandas solution):

In [132]: df.set_index('Data') \
            .stack() \
            .reset_index(name='val') \
            .query("val == 1") \
            .drop(columns='val')
Out[132]:
    Data level_1
0     12   Type1
4     14   Type2
7      3   Type2
11    45   Type3
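
Since the question started from a plain NumPy array, the same idea can probably be sketched without pandas by using argmax along the rows and indexing into an array of names. The arrays data, flags, and names below are assumed stand-ins for the question's input, and this again relies on exactly one 1 per row.

```python
import numpy as np

# Assumed layout: one array of values, one 2-D array of 0/1 flags.
data = np.array([12, 14, 3, 45])
flags = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
])
names = np.array(["Type1", "Type2", "Type3"])  # one name per flag column

# argmax(axis=1) gives the position of the single 1 in each row;
# fancy indexing then maps each position to its name.
type_names = names[flags.argmax(axis=1)]

for value, name in zip(data, type_names):
    print(value, name)
```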

3 Comments

You should add that suggestion of yours into your post. I think it's worth a mention in your post!
@Divakar, but the main idea here is to use .idxmax(1) - and you were first... ;)
Link to my post! But do add! :)

One way to do this is with a row-wise apply:

df.apply(lambda row: 'Type1' if row.Type1 else 'Type2' if row.Type2 else 'Type3', axis=1)

For example:

In [6]: df
Out[6]: 
   Data  Type1  Type2  Type3
0    12      1      0      0
1    14      0      1      0
2     3      0      1      0
3    45      0      0      1

In [7]: df['TypeName'] = df.apply(lambda row: 'Type1' if row.Type1 else 'Type2' if row.Type2 else 'Type3', axis=1)

In [9]: df.drop(['Type1', 'Type2', 'Type3'], axis=1)
Out[9]: 
   Data TypeName
0    12    Type1
1    14    Type2
2     3    Type2
3    45    Type3
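
If there are more than three type columns, the chained conditional gets unwieldy. A sketch of the same row-wise apply that loops over a list of columns instead might look like this; type_cols and first_set_flag are hypothetical names of mine, not part of the answer above. One behavioural difference to be aware of: the lambda above falls through to 'Type3' when no flag is set, whereas this version returns None for such a row.

```python
import pandas as pd

df = pd.DataFrame({
    "Data": [12, 14, 3, 45],
    "Type1": [1, 0, 0, 0],
    "Type2": [0, 1, 1, 0],
    "Type3": [0, 0, 0, 1],
})
type_cols = ["Type1", "Type2", "Type3"]  # illustrative helper name


def first_set_flag(row):
    """Return the name of the first type column whose flag is set, else None."""
    for col in type_cols:
        if row[col]:
            return col
    return None


# Row-wise apply: simple to read, but slower than idxmax on large frames.
df["TypeName"] = df.apply(first_set_flag, axis=1)
result = df.drop(columns=type_cols)
print(result)
```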
