Creating array of strings

Question

I have an array of flags for various types:

Data Type1 Type2 Type3
12   1     0     0
14   0     1     0
3    0     1     0
45   0     0     1

I want to create the following array:

Data TypeName
12   Type1   
14   Type2   
3    Type2   
45   Type3

I tried creating an empty string array and filling it in:

import numpy as np
z1 = np.empty(4, np.string_)
z1[np.where(Type1=1)] = 'Type1'

But this doesn't give me the desired result.

Edit: I can use a pandas DataFrame, and each row has exactly one type set: either Type1, Type2, or Type3.

Edit2: Data, Type1, Type2, and Type3 are column names, as in a pandas DataFrame, but I was using a NumPy array with the names implied, as in the example above.
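
For reference, a minimal NumPy-only sketch of what the attempt above seems to be aiming for. Two things look likely to trip it up: np.where(Type1=1) passes a keyword argument instead of comparing with ==, and a string dtype with no length stores only one character per element. The names flags, names, z1 and z2 are illustrative assumptions, not taken from the original code.

```python
import numpy as np

# Flag columns from the question as a 2-D array (one row per data value).
flags = np.array([[1, 0, 0],
                  [0, 1, 0],
                  [0, 1, 0],
                  [0, 0, 1]])
names = np.array(["Type1", "Type2", "Type3"])

# Boolean-mask assignment, one flag column at a time.
# dtype "U5" leaves room for five characters per element.
z1 = np.full(len(flags), "", dtype="U5")
for col, name in enumerate(names):
    z1[flags[:, col] == 1] = name

# Equivalent lookup when every row holds exactly one 1:
# argmax gives the position of that 1, which indexes into names.
z2 = names[flags.argmax(axis=1)]

print(z1)
print(z2)
```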

Is the input a pandas dataframe? Would it always have exactly one 1 in each row, starting from the second column?
– Divakar, Commented Mar 9, 2017 at 19:21

Divakar · Accepted Answer · 2017-03-09 19:33:40Z

2

Here's an approach that exploits the fact that each row has exactly one 1 among the columns from Type1 onward. Since 1 is the row maximum, idxmax(1) returns the label of the column that holds it -

pd.concat((df.Data, df.iloc[:,1:].idxmax(1)),axis=1)

Sample run -

In [42]: df
Out[42]: 
   Data  Type1  Type2  Type3
0    12      1      0      0
1    14      0      1      0
2     3      0      1      0
3    45      0      0      1

In [43]: pd.concat((df.Data, df.iloc[:,1:].idxmax(1)),axis=1)
Out[43]: 
   Data      0
0    12  Type1
1    14  Type2
2     3  Type2
3    45  Type3
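
A self-contained sketch of the same idea, in case it helps to run it end to end. The DataFrame is rebuilt from the question's sample data; the variable names flags and result, the one-flag-per-row check, and the rename to TypeName are additions for illustration, not part of the original one-liner.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Data": [12, 14, 3, 45],
    "Type1": [1, 0, 0, 0],
    "Type2": [0, 1, 1, 0],
    "Type3": [0, 0, 0, 1],
})

flags = df.iloc[:, 1:]

# idxmax only gives a meaningful answer if each row has exactly one 1,
# so check that assumption up front.
assert (flags.sum(axis=1) == 1).all()

# Naming the Series avoids the default column label 0 in the result.
result = pd.concat((df.Data, flags.idxmax(axis=1).rename("TypeName")), axis=1)
print(result)
```

Without the check, a row of all zeros would silently come back as Type1, because idxmax returns the first column when there is a tie.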

edited Mar 9, 2017 at 19:33

answered Mar 9, 2017 at 19:31


1 Comment

MaxU - stand with Ukraine Over a year ago

I really like this solution. Actually, you don't need pd.concat: df.set_index('Data').idxmax(1).reset_index(name='TypeName') gives the same result.

Community · Accepted Answer · 2017-05-23 12:25:19Z

2

UPDATE: this combines @Divakar's brilliant idea of using the DataFrame.idxmax(1) method with set_index() and reset_index(), which removes the need for pd.concat():

In [142]: df.set_index('Data').idxmax(1).reset_index(name='TypeName')
Out[142]:
   Data TypeName
0    12    Type1
1    14    Type2
2     3    Type2
3    45    Type3

OLD answer:

You can do it this way (Pandas solution):

In [132]: df.set_index('Data') \
            .stack() \
            .reset_index(name='val') \
            .query("val == 1") \
            .drop(columns='val')
Out[132]:
    Data level_1
0     12   Type1
4     14   Type2
7      3   Type2
11    45   Type3
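
The stacked result keeps the row positions from the long format (0, 4, 7, 11) and the auto-generated column name level_1. A sketch of one way to tidy that up, assuming the question's sample data; the rename and the final reset_index(drop=True) are additions, not part of the original chain.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Data": [12, 14, 3, 45],
    "Type1": [1, 0, 0, 0],
    "Type2": [0, 1, 1, 0],
    "Type3": [0, 0, 0, 1],
})

out = (
    df.set_index("Data")
      .stack()                                  # long format: one row per (Data, type) pair
      .reset_index(name="val")                  # columns: Data, level_1, val
      .rename(columns={"level_1": "TypeName"})  # give the type column a readable name
      .query("val == 1")                        # keep only the flags that are set
      .drop(columns="val")
      .reset_index(drop=True)                   # renumber rows 0..n-1
)
print(out)
```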

edited May 23, 2017 at 12:25


answered Mar 9, 2017 at 19:29

MaxU - stand with Ukraine


3 Comments

Divakar Over a year ago

You should add that suggestion of yours into your post. I think it's worth a mention in your post!

MaxU - stand with Ukraine Over a year ago

@Divakar, but the main idea here is to use .idxmax(1) - and you were first... ;)

Divakar Over a year ago

Link to my post! But do add! :)

fuglede · Accepted Answer · 2017-03-09 19:29:53Z

1

One way to do this is a row-wise apply with a chained conditional, which falls back to 'Type3' when neither Type1 nor Type2 is set:

df.apply(lambda row: 'Type1' if row.Type1 else 'Type2' if row.Type2 else 'Type3', axis=1)

For example:

In [6]: df
Out[6]: 
   Data  Type1  Type2  Type3
0    12      1      0      0
1    14      0      1      0
2     3      0      1      0
3    45      0      0      1

In [7]: df['TypeName'] = df.apply(lambda row: 'Type1' if row.Type1 else 'Type2' if row.Type2 else 'Type3', axis=1)

In [9]: df.drop(['Type1', 'Type2', 'Type3'], axis=1)
Out[9]: 
   Data TypeName
0    12    Type1
1    14    Type2
2     3    Type2
3    45    Type3
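
If there are more than three type columns, the chained conditional gets unwieldy. A possible generalization is sketched below, assuming the question's sample data; type_cols and first_set_flag are hypothetical names introduced here. Unlike the lambda above, it returns None instead of defaulting to the last type when no flag is set.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Data": [12, 14, 3, 45],
    "Type1": [1, 0, 0, 0],
    "Type2": [0, 1, 1, 0],
    "Type3": [0, 0, 0, 1],
})

type_cols = ["Type1", "Type2", "Type3"]

def first_set_flag(row):
    """Return the name of the first type column whose flag is truthy, else None."""
    return next((col for col in type_cols if row[col]), None)

df["TypeName"] = df.apply(first_set_flag, axis=1)
result = df.drop(columns=type_cols)
print(result)
```

A row-wise apply is a Python-level loop, so on large frames the idxmax approach from the other answers is likely to be faster.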

answered Mar 9, 2017 at 19:29

