I want to convert two NumPy arrays into one DataFrame with two columns. The first array, 'images', has shape (1020, 1024). The second array, 'label', has shape (1020,).

My core code is:

images=np.array(images)
label=np.array(label)
l=np.array([images,label])
dataset=pd.DataFrame(l)

But it raises this error:

ValueError: could not broadcast input array from shape (1020,1024) into shape (1020)

What should I do to convert these two NumPy arrays into two columns of one DataFrame?

3 Answers

You can't stack them directly if you want them as separate columns, because a single DataFrame column can't hold a 2D array. You need to convert the 2D array to something else first, for example a list of 1D arrays.

So something like this would work:

import pandas as pd
import numpy as np
images = np.array(images)
label = np.array(label)
dataset = pd.DataFrame({'label': label, 'images': list(images)}, columns=['label', 'images'])

This creates a DataFrame with 1020 rows and 2 columns, where each item in the 'images' column is a 1D array of length 1024.
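To check the result without the original data, here is a minimal sketch that uses random stand-in arrays with the shapes from the question. The random data and the `restored` variable are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Stand-in data with the shapes from the question (assumed, not the real images).
rng = np.random.default_rng(0)
images = rng.random((1020, 1024))
label = rng.integers(0, 10, size=1020)

# list(images) splits the 2D array into 1020 separate 1D rows,
# so each DataFrame cell holds one array of length 1024.
dataset = pd.DataFrame({'label': label, 'images': list(images)})

print(dataset.shape)
print(dataset['images'].iloc[0].shape)

# To get the 2D array back later, stack the per-row arrays again.
restored = np.stack(dataset['images'].tolist())
```

The 'images' column has dtype object, so vectorised pandas operations won't apply to it; stacking it back into a 2D array is usually the first step before any numeric work.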

Coming from engineering, I like the visual side of creating matrices.

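# images is transposed so both inputs have 1020 columns; the result has shape (1025, 1020)
matrix_aux = np.vstack([label, images.T])
# transpose back to (1020, 1025): the label column first, then the 1024 image columns
matrix     = np.transpose(matrix_aux)
df_lab_img = pd.DataFrame(matrix)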

It takes a little more code, but you keep the NumPy array as well. Note that this gives one label column followed by 1024 image columns, rather than two columns.
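As a possible shortcut for the stack-and-transpose idea, `np.column_stack` places a 1D array next to a 2D array as columns in one call. The sketch below uses small made-up arrays, and the `px0`, `px1`, ... column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Small made-up data: 3 "images" of 4 values each, plus 3 labels.
images = np.arange(12, dtype=float).reshape(3, 4)
label = np.array([7, 8, 9])

# column_stack treats the 1D label array as a single column
# and places it to the left of the image columns.
matrix = np.column_stack([label, images])

# Hypothetical column names, chosen only for readability.
names = ['label'] + [f'px{i}' for i in range(images.shape[1])]
df_lab_img = pd.DataFrame(matrix, columns=names)

print(matrix.shape)
print(df_lab_img.dtypes.unique())
```

Because a NumPy array has a single dtype, the integer labels are converted to floats here to match the image values.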

You can also use np.hstack, which appends the labels as a final column after the 1024 image columns:

import pandas as pd
import numpy as np

dataset = pd.DataFrame(np.hstack((images, label.reshape(-1, 1))))
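One thing to watch with `hstack` is that the combined array has a single dtype, so integer labels become floats. The sketch below shows this on small made-up arrays. It also shows a different technique that is not part of the answer above: build the DataFrame from `images` and assign the label column afterwards, which keeps the label dtype. The names `stacked` and `wide` are hypothetical.

```python
import numpy as np
import pandas as pd

# Small made-up data: 4 "images" of 3 values each, plus 4 integer labels.
images = np.zeros((4, 3), dtype=np.float32)
label = np.array([0, 1, 1, 0])

# hstack approach: the labels end up in the last column (index 3), as floats.
stacked = pd.DataFrame(np.hstack((images, label.reshape(-1, 1))))
print(stacked.shape, stacked[3].dtype)

# Alternative (not hstack): add the labels as a separate column so they stay integers.
wide = pd.DataFrame(images)
wide['label'] = label
print(wide.shape, wide['label'].dtype)
```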
