I have an ndarray defined in the following way:

dataset = np.ndarray(shape=(len(image_files), image_size, image_size),
                         dtype=np.float32)

This array represents a collection of images, each of size image_size * image_size, so dataset[0] gives the 2D array for the image at index 0.

Now I would like to store one additional value for each image in this array. For instance, for the image at index 0 I would like to store the number 123, and for the image at index 321 the number 50000.

What is the simplest way to add this additional data field to the existing ndarray? What is the appropriate way to access data in the new array after adding this additional dimension?

  • I don't think adding a 4th dimension to your array is the right approach. Perhaps a dictionary is a better approach? The images could be one value, and the number could be a second value, each stored with different keys. Commented Feb 27, 2021 at 0:40
  • Do you mean an additional Python dictionary with the image data as the key and the number I want to add as the value? The problem is that the existing code already uses the dataset array mentioned above, which is why changing the data structure would be problematic. Commented Feb 27, 2021 at 0:57
  • For instance, the existing code uses np.random.shuffle on that array. If some of the information lives in a map, the correspondence between an image and its property will be lost. Commented Feb 27, 2021 at 1:07
  • I see your dilemma. I suggested this approach because, as @Bobby Ocean alludes, adding an extra dimension to your array would make the data much bigger. A 100x100x1 (10,000 element) array would turn into a 100x100x2 (20,000 element) array just to store 1 extra number. Now expand this into 4 dimensions with larger images... Commented Feb 27, 2021 at 1:43

2 Answers

If you shuffle an index array instead of the dataset itself, you can keep track of each image's original index (its 'identifier'):

idx = np.arange(len(image_files))
np.random.shuffle(idx)
shuffle_set = dataset[idx]
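
As a minimal sketch of how this keeps an extra per-image number attached to its image: the same `idx` can index a second, parallel array. The names `extra` and `n_images` and the values in `extra` are hypothetical stand-ins for illustration, and the tiny image size is only there to keep the example small.

```python
import numpy as np

image_size = 4
n_images = 5  # hypothetical stand-in for len(image_files)

# Fill each image with its own index so it stays recognizable after shuffling.
dataset = np.zeros((n_images, image_size, image_size), dtype=np.float32)
for i in range(n_images):
    dataset[i] = i

# Hypothetical per-image numbers, one entry per image, stored separately.
extra = np.array([123, 7, 50000, 42, 9])

idx = np.arange(n_images)
np.random.shuffle(idx)

# Apply the same permutation to both arrays so they stay aligned.
shuffle_set = dataset[idx]
shuffle_extra = extra[idx]

# idx[k] is the original index of the image now at position k.
for k in range(n_images):
    original = int(shuffle_set[k, 0, 0])
    assert original == idx[k]
    assert shuffle_extra[k] == extra[original]
```

Here `dataset` keeps its original shape and dtype, so existing code that expects a 3D float32 array should not need to change.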

Illustration:

In [20]: x = np.arange(12).reshape(6,2)
    ...: idx = np.arange(6)
    ...: np.random.shuffle(idx) 
In [21]: x
Out[21]: 
array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11]])
In [22]: x[idx]             # shuffled
Out[22]: 
array([[ 4,  5],
       [ 0,  1],
       [ 2,  3],
       [ 6,  7],
       [10, 11],
       [ 8,  9]])
In [23]: idx1=np.argsort(idx)
In [24]: idx
Out[24]: array([2, 0, 1, 3, 5, 4])
In [25]: idx1
Out[25]: array([1, 2, 0, 3, 5, 4])
In [26]: Out[22][idx1]       # recover original order
Out[26]: 
array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11]])
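
The recovery step works because `np.argsort(idx)` returns the inverse permutation of `idx`. A small check of that claim, reusing the `idx` values from the session above (a sketch, not part of the original session):

```python
import numpy as np

x = np.arange(12).reshape(6, 2)

# The permutation shown in Out[24] above.
idx = np.array([2, 0, 1, 3, 5, 4])

# argsort of a permutation is its inverse permutation.
idx1 = np.argsort(idx)

shuffled = x[idx]
restored = shuffled[idx1]

# Composing the permutation with its inverse gives the identity ordering.
assert np.array_equal(idx[idx1], np.arange(6))
assert np.array_equal(restored, x)
```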

1 Comment

Thanks, @hpaulj. That is exactly what I needed. That solved my problem.

NumPy arrays are fundamentally tensors: every axis has a fixed length, so the shape is fixed rather than variable. Take, for example,

import numpy as np

x = np.array([[[1,2],[3,4]],
              [[5,6],[7,8]]
             ])
print(x.shape)  # two 2x2 blocks, so the shape is (2, 2, 2)

If I want to associate x[0] with the number 5 and x[1] with the number 7, that would look something like this (if it were possible):

x = np.array([[[1,2],[3,4]],5,
              [[5,6],[7,8]],7
             ])

But such a thing is impossible, since it would "in some sense" have a shape like (2,((2,2),1)), or something else that is ambiguous. Such an object is not a regular numeric NumPy array or a tensor, because it doesn't have fixed axis sizes, and all NumPy arrays must have fixed axis sizes. Hence, if you wish to store the new information, the straightforward way to do it is to create another array:

x = np.array([[[1,2],[3,4]],
              [[5,6],[7,8]],
             ])
y = np.array([5,7])

Now x[0] corresponds to y[0] and x[1] corresponds to y[1]. x has shape (2,2,2) and y has shape (2,).
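
To address the question of how to access the data afterwards, here is a short sketch of working with the two parallel arrays, assuming they are always indexed along the first axis together. The names `pairs` and `selected` are introduced only for this illustration.

```python
import numpy as np

x = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]]])
y = np.array([5, 7])

# The same index on both arrays gives an image and its number.
first_image, first_number = x[0], y[0]

# Iterate over images and their numbers together.
pairs = [(image.shape, int(number)) for image, number in zip(x, y)]

# Select images by a condition on their numbers (boolean mask on axis 0).
selected = x[y > 5]
```

Any operation that reorders or filters `x` along its first axis needs the same index or mask applied to `y` so the two stay aligned.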
