I have a DataFrame df with a single column in which every cell holds a NumPy array of length 3. I want to convert this column into a single 2D NumPy array of shape (n_rows, 3). However, calling np.reshape on the column does not work. How can I do this?

Here is a brief example:

import pandas as pd
import numpy as np

df = pd.DataFrame(columns=['col'])
for i in range(10):
    df.loc[i,'col'] = np.zeros(3)

arr = np.array(df['col'])
np.reshape(arr, (10,3)) # ValueError: cannot reshape array of size 10 into shape (10,3)

1 Answer

Here are two approaches, one using np.vstack and one using np.concatenate:

np.vstack(df.col)
np.concatenate(df.col).reshape(df.shape[0],-1) # for performance

For best performance, we could pass the underlying NumPy array, df.col.values, to either function instead of the Series itself.
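
As a minimal sketch of both approaches applied to the underlying array, the snippet below builds a small frame and checks that the two results agree. The 4-row frame and the names raw, stacked and flat are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical 4-row frame: each cell of 'col' holds a length-3 array.
df = pd.DataFrame({'col': [np.arange(3) + i for i in range(4)]})

# The underlying data is a 1-D object array whose elements are arrays.
raw = df['col'].values

# Approach 1: stack the per-row arrays vertically into a 2-D array.
stacked = np.vstack(raw)

# Approach 2: join everything into one flat array, then split it into rows.
flat = np.concatenate(raw).reshape(len(raw), -1)

print(stacked.shape)                 # (4, 3)
print(np.array_equal(stacked, flat)) # True
```

Both calls assume every row holds an array of the same length; with ragged rows, np.vstack raises an error and the reshape would either fail or silently misalign the rows.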

Sample run:

In [116]: df
Out[116]: 
         col
0  [7, 5, 2]
1  [1, 1, 3]
2  [6, 1, 4]
3  [7, 0, 0]
4  [8, 8, 0]
5  [7, 8, 0]
6  [0, 5, 8]
7  [8, 3, 1]
8  [6, 6, 8]
9  [8, 2, 3]

In [117]: np.vstack(df.col)
Out[117]: 
array([[7, 5, 2],
       [1, 1, 3],
       [6, 1, 4],
       [7, 0, 0],
       [8, 8, 0],
       [7, 8, 0],
       [0, 5, 8],
       [8, 3, 1],
       [6, 6, 8],
       [8, 2, 3]])
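
For context on why the np.reshape call in the question fails, here is a small sketch, assuming the column is built as a list of arrays rather than with the question's loop. The column converts to a 1-D object array with one element per row, so there are only 10 elements to reshape, not 30. The flag reshape_failed and the name fixed are illustrative.

```python
import numpy as np
import pandas as pd

# Same layout as in the question, built directly from a list of arrays.
df = pd.DataFrame({'col': [np.zeros(3) for _ in range(10)]})

arr = np.array(df['col'])
print(arr.shape, arr.dtype)  # (10,) object

# reshape only sees 10 elements (each one an array), so (10, 3) is impossible.
try:
    np.reshape(arr, (10, 3))
    reshape_failed = False
except ValueError:
    reshape_failed = True

# Stacking the inner arrays produces a genuine 2-D numeric array.
fixed = np.vstack(arr)
print(fixed.shape, fixed.dtype)  # (10, 3) float64
```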