Python Pandas: selecting 1st element in array in all cells

Question

I am trying to select the 1st element of each cell, regardless of the number of columns or rows (these may change based on user-defined criteria), and build a new pandas DataFrame from the result. My actual data structure is similar to the one below.

       0       1       2
0   [1, 2]  [2, 3]  [3, 6]
1   [4, 2]  [1, 4]  [4, 6]
2   [1, 2]  [2, 3]  [3, 6]
3   [4, 2]  [1, 4]  [4, 6]

I want the new dataframe to look like:

    0   1   2
0   1   2   3
1   4   1   4
2   1   2   3
3   4   1   4

The code below generates a data set similar to mine. It mimics what I have seen work in a similar question (c, though only for one column) and then attempts, without success, to do the same for all columns (d). The similar but different question is: Python Pandas: selecting element in array column

import pandas as pd

zz = pd.DataFrame([[[1,2],[2,3],[3,6]],[[4,2],[1,4],[4,6]],
               [[1,2],[2,3],[3,6]],[[4,2],[1,4],[4,6]]])
print(zz)

x = zz.dtypes
print(x)

a = pd.DataFrame(zz.columns.values)
b = pd.DataFrame.transpose(a)
c = zz[0].str[0]  # this gives the 1st value for each cell in column 0
d = zz[[b[0]].values].str[0]  # attempt to get the 1st value for each cell in all columns (fails)

jezrael · Accepted Answer · 2017-01-17 22:22:29Z

12

You can use apply, and to select the first value of each list, use indexing with str:

print (zz.apply(lambda x: x.str[0]))
   0  1  2
0  1  2  3
1  4  1  4
2  1  2  3
3  4  1  4
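For anyone who wants to run this end to end, here is a minimal self-contained sketch of the apply approach. The names `first` and `col` are mine, not from the answer. apply hands each column to the lambda as a Series, and .str[0] then indexes into every list in that column.

```python
import pandas as pd

# Same sample frame as in the question: every cell holds a two-element list.
zz = pd.DataFrame([[[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]],
                   [[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]]])

# apply calls the lambda once per column; col is a Series of lists,
# and col.str[0] picks element 0 out of each list.
first = zz.apply(lambda col: col.str[0])
print(first)
```

Because the lambda works on whole columns, it should not matter how many rows or columns the frame has.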

Another solution with stack and unstack:

print (zz.stack().str[0].unstack())
   0  1  2
0  1  2  3
1  4  1  4
2  1  2  3
3  4  1  4
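To see what stack and unstack are doing here, a small sketch that keeps the intermediate result (`stacked` and `first` are names I picked for illustration): stack turns the frame into one long Series indexed by (row, column), .str[0] works on that Series, and unstack pivots it back to the original shape.

```python
import pandas as pd

zz = pd.DataFrame([[[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]],
                   [[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]]])

# One long Series: 4 rows x 3 columns = 12 list-valued entries,
# indexed by a two-level (row label, column label) MultiIndex.
stacked = zz.stack()
print(len(stacked), stacked.index.nlevels)

# Take element 0 of every list, then pivot the inner index level
# back into columns.
first = stacked.str[0].unstack()
print(first)
```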


1 Comment

Devon Oliver Over a year ago

Thanks!!! I'm going to try both your methods and see what runs fastest.

Ted Petrou · Accepted Answer · 2017-01-17 22:35:09Z

5

I would use applymap, which applies the same function to each individual cell in your DataFrame:

zz.applymap(lambda x: x[0])

   0  1  2
0  1  2  3
1  4  1  4
2  1  2  3
3  4  1  4
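A note for newer pandas: applymap was deprecated in pandas 2.1 in favour of DataFrame.map, which does the same element-wise job. Below is a hedged sketch that uses whichever one is available. The hasattr check and the name `first` are my own additions, not part of the answer.

```python
import pandas as pd

zz = pd.DataFrame([[[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]],
                   [[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]]])

# Element-wise: the lambda receives one cell (a list) at a time.
if hasattr(pd.DataFrame, "map"):
    # pandas >= 2.1
    first = zz.map(lambda cell: cell[0])
else:
    # older pandas
    first = zz.applymap(lambda cell: cell[0])
print(first)
```

Since the lambda does plain Python indexing, it will raise IndexError on an empty list, so this assumes every cell has at least one element.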


piRSquared · Accepted Answer · 2017-01-17 23:28:28Z

3

I'm a big fan of stack + unstack. However, @jezrael already posted that answer, so +1 from me.

That said, here is a quicker way: slicing a NumPy array.

import numpy as np

# Build a 3-D array (rows, columns, list position) and keep position 0
pd.DataFrame(
    np.array(zz.values.tolist())[:, :, 0],
    zz.index, zz.columns
)

   0  1  2
0  1  2  3
1  4  1  4
2  1  2  3
3  4  1  4
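One thing worth spelling out: the slice relies on every cell holding a list of the same length, so that the nested lists form a regular 3-D array. A small sketch that makes the intermediate array visible (`arr` and `first` are illustrative names of mine):

```python
import numpy as np
import pandas as pd

zz = pd.DataFrame([[[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]],
                   [[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]]])

# Assumption: all lists have equal length, so this becomes a regular
# array with axes (row, column, position inside the list).
arr = np.array(zz.values.tolist())
print(arr.shape)  # (4, 3, 2)

# Position 0 along the last axis is the 1st element of every cell;
# reuse the original index and columns so the labels carry over.
first = pd.DataFrame(arr[:, :, 0], index=zz.index, columns=zz.columns)
print(first)
```

If the lists have different lengths, the array will not come out 3-D and the `[:, :, 0]` slice will not apply, so one of the other approaches is probably safer in that case.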

Timing (benchmark figure not preserved)
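If you want to compare the approaches on your own data, a rough sketch with timeit follows. The enlarged frame `big`, the repeat count, and the `candidates` dict are all assumptions of mine, and the numbers will vary by machine and pandas version, so treat it as a starting point rather than a benchmark.

```python
import timeit

import numpy as np
import pandas as pd

zz = pd.DataFrame([[[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]],
                   [[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]]])

# Hypothetical larger input: the sample frame repeated to 4000 rows.
big = pd.concat([zz] * 1000, ignore_index=True)

# The approaches from this thread, wrapped as zero-argument callables.
candidates = {
    "apply + str[0]": lambda: big.apply(lambda col: col.str[0]),
    "stack + str[0] + unstack": lambda: big.stack().str[0].unstack(),
    "numpy slice": lambda: pd.DataFrame(
        np.array(big.values.tolist())[:, :, 0], big.index, big.columns
    ),
}

# Total wall-clock seconds for 5 runs of each approach.
timings = {name: timeit.timeit(fn, number=5) for name, fn in candidates.items()}

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds:.4f} s for 5 runs")
```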


1 Comment

Devon Oliver Over a year ago

Thanks for the timing info and new method..+1
