
I have two arrays from the MNIST dataset. The first array's shape is (60000, 28, 28) and the second's is (60000,).

Is it possible to combine these and make a new array that is (60000,28,28,1)? I've tried reshaping, resizing, inserting, concatenating and a bunch of other methods to no avail!

Would really appreciate some help! TIA!

2 Comments
  • The new size must have the same number of elements as the original. Step back and experiment with much smaller arrays, ones you can actually examine in full, for example np.arange(24).reshape(2, 3, 4). What would it even mean to add a size (2,) array to that? (See the sketch after these comments.)
  • This was helpful in seeing how they are structured. Things get more complicated when adding another dimension! Thanks!
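
Following the comment's suggestion, here is a minimal sketch (plain NumPy with small toy shapes, not the real MNIST arrays) showing why a 1-D array cannot simply be joined onto a 3-D one:

import numpy as np

a = np.arange(24).reshape(2, 3, 4)   # small stand-in for the (60000, 28, 28) images
b = np.array([7, 9])                 # small stand-in for the (60000,) labels

print(a.shape)  # (2, 3, 4)
print(b.shape)  # (2,)

# Concatenation fails: the arrays have different numbers of dimensions,
# so there is no axis along which they can be joined.
try:
    np.concatenate((a, b))
except ValueError as e:
    print(e)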

3 Answers


It seems like you might have misunderstood how numpy arrays work or how they should be used.

Each dimension (except for the innermost one) of an array is essentially an array of arrays. So for your example with shape (60000, 28, 28), you have an array of 60000 arrays, each of which is an array of 28 arrays. Each of those final arrays is then an array of 28 objects of some sort (integers in the MNIST dataset, I think).
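
To make the nesting concrete, here is a minimal sketch (a zero-filled placeholder array, not the real MNIST data) showing how each level of indexing peels off one dimension:

import numpy as np

images = np.zeros((60000, 28, 28), dtype=np.uint8)  # placeholder for the MNIST images

print(images.shape)        # (60000, 28, 28)
print(images[0].shape)     # (28, 28)  -> one image
print(images[0][0].shape)  # (28,)     -> one row of pixels
print(images[0][0][0])     # 0         -> a single pixel value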

You can convert this into a (60000, 28, 28, 1) array by using NumPy's expand_dims function like so:

import numpy
new_array = numpy.expand_dims(original_array, axis=-1)

However, this only turns each innermost element into an array of one object; it does not include the label array in any way.

From what I can read in your question, it seems you want to map each label of the MNIST dataset to its corresponding image. You could do this by making each element of the outermost dimension a tuple of (image <28x28 numpy array>, label <int>), but this would lose the numpy functionality of the array. The best course of action is probably to keep the two arrays as they are and use the index of an image to look up its label.
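
A minimal sketch of that index-based lookup (zero-filled placeholders stand in for the two arrays from the question, and the names train_images / train_labels are just illustrative):

import numpy as np

train_images = np.zeros((60000, 28, 28), dtype=np.uint8)  # placeholder images
train_labels = np.zeros((60000,), dtype=np.uint8)         # placeholder labels

# The i-th label belongs to the i-th image; no merging is needed.
i = 42
image = train_images[i]   # shape (28, 28): the i-th image
label = train_labels[i]   # a single integer: the label for that same image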


3 Comments

Yup, that's exactly what I'm trying to do (attach the image label to the corresponding 28x28 grayscale pixel grid)! The course I'm following, though, has all of this data in a TensorFlow tensor. I thought I could create an ndarray and then convert it to a tensor. Here are some course notes that might help explain: Each observation is 28x28x1 pixels, therefore it is a tensor of rank 3. We must flatten the images using the method 'Flatten', which simply takes our 28x28x1 tensor and orders it into a (None,) or (28x28x1,) = (784,) vector.
You can convert numpy arrays to tensors using TensorFlow's convert_to_tensor, but this won't include the labels in any way. For many small machine learning tasks you can usually just pass numpy arrays directly to the training functions. If you really want to combine the labels and the features together, you should read up on TensorFlow Datasets (tfds).
Found it. So you can take two numpy ndarrays of different dimensions and combine them into a TensorSliceDataset like so: mnist_train = tf.data.Dataset.from_tensor_slices((train_images, train_labels)).

I think this is not possible. To concatenate two arrays, they must have the same number of dimensions, and their sizes must match along every axis except the one you concatenate on.

You can imagine the (60,000, 28, 28) array as a cube. The face looking at you is 28 x 28, and there are 60,000 such same-size slices stacked behind it. If you want to concatenate a new array onto it, that array must also be 3-D, and its sizes must match the cube on the two axes you are not concatenating along. Otherwise, it won't get concatenated.

To combine a (60,000, 28, 28) array with another array, the second array must match two of the sizes 60,000, 28, 28 on the corresponding axes. Suppose the second one has shape (60,000, 28, 14). Then you can concatenate along the last axis and get the result:

z = np.concatenate((array1, array2), axis=2)
z.shape

Output:

(60000, 28, 42)

Alternatively, if the second array is (30,000, 28, 28):

z = np.concatenate((array1, array2), axis=0)
z.shape

Output:

(90000, 28, 28)
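
As a self-contained check, both cases can be reproduced with zero-filled dummy arrays standing in for real data (the shapes are the ones discussed above):

import numpy as np

array1 = np.zeros((60000, 28, 28), dtype=np.uint8)

# Case 1: second array matches on axes 0 and 1, so concatenate along axis 2
array2 = np.zeros((60000, 28, 14), dtype=np.uint8)
print(np.concatenate((array1, array2), axis=2).shape)  # (60000, 28, 42)

# Case 2: second array matches on axes 1 and 2, so concatenate along axis 0
array3 = np.zeros((30000, 28, 28), dtype=np.uint8)
print(np.concatenate((array1, array3), axis=0).shape)  # (90000, 28, 28)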

1 Comment

Thanks for the reply! Appreciate everyone assisting!

So you can take two numpy ndarrays of different dimensions and combine them into a TensorSliceDataset like so:

mnist_train = tf.data.Dataset.from_tensor_slices((train_images, train_labels))

This was the original intention, but I thought it required combining the two ndarrays before creating a tensor.
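
For completeness, a minimal usage sketch (loading MNIST via tf.keras.datasets and the batch size of 32 are my assumptions, not part of the original answer):

import tensorflow as tf

# Load MNIST as two numpy arrays: images (60000, 28, 28) and labels (60000,)
(train_images, train_labels), _ = tf.keras.datasets.mnist.load_data()

# Pair each image with its label; the shapes only need to match on axis 0
mnist_train = tf.data.Dataset.from_tensor_slices((train_images, train_labels))

# Each element of the dataset is an (image, label) pair
for image, label in mnist_train.batch(32).take(1):
    print(image.shape, label.shape)  # (32, 28, 28) (32,)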

