How do you sort a numpy array by a nested dtype?

I want to sort a numpy array by the first element of each inner array.

import numpy as np
from random import randint

# create dummy data
test = np.array([[[randint(1, 10) for _ in range(3)]] for _ in range(10)])

dtype = [('response', [('x', 'f'), ('y', 'f'), ('z', 'f')])]

# convert over to dtype (astype returns a new array, so keep the result)
test = test.astype(dtype)

How do I sort on a nested key? The call below doesn't work:

np.sort(test, order='response.x')

What I would like to achieve is to use np.sort in the same way I would use sorted for a list:

a_list = [[[randint(1, 10) for _ in range(3)]] for _ in range(10)]
sorted(a_list, key=lambda x: x[0][0])

But I would like to use np.sort as this is a sample of a much more complicated problem where I only have access to numpy arrays and would like to work with the numpy methods.

  • Look at my answer to stackoverflow.com/q/61906820/901925. Commented May 22, 2020 at 14:32
  • Is the shape of test supposed to be (10, 1, 3)? Commented May 22, 2020 at 14:36
  • Yes, it is supposed to be (10,1,3) Commented May 22, 2020 at 14:40
  • Your example is not reproducible. You probably meant to name the last field z. Commented May 22, 2020 at 15:26

1 Answer

I understand you have a tensor (three nested arrays), and each array element is a structured data type with nested fields response.x, response.y and response.z.

Since you have a tensor, you can order it along any of its three axes (rows, columns and Z-dimension). By default, np.sort sorts along the last axis, i.e. within the innermost arrays.

To sort by a nested field, we can use numpy.argsort(), which returns the indices that would sort the array. For example, the following computes the ordering of the response.x values along the outermost axis (i.e. compares rows with each other):

order = np.argsort(test['response']['x'], axis=0)

You can then use these indices to get the original array ordered. To get the same behavior as np.sort, you would use numpy.take_along_axis() with the same axis argument:

np.take_along_axis(test, order, axis=0)

Note that this orders individual elements along the rows. This is the same behavior that np.sort(..., axis=0) would give you.
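As a minimal sketch of the two steps above, here is a self-contained version. It assumes the last field is meant to be named z, and it fills the fields explicitly on a small (4, 1, 3) array instead of relying on astype; the names data, rng and sorted_elementwise are only illustrative.

```python
import numpy as np

# Assumed dtype: same nesting as in the question, with the last field named 'z'.
dtype = [('response', [('x', 'f4'), ('y', 'f4'), ('z', 'f4')])]

# Hypothetical dummy data with shape (4, 1, 3), a smaller version of (10, 1, 3).
rng = np.random.default_rng(0)
data = np.zeros((4, 1, 3), dtype=dtype)
for field in ('x', 'y', 'z'):
    data['response'][field] = rng.integers(1, 10, size=data.shape)

# Indices that sort response.x along axis 0 (one ordering per (column, depth) slot).
order = np.argsort(data['response']['x'], axis=0)

# Apply those indices to the whole structured array; y and z travel with x.
sorted_elementwise = np.take_along_axis(data, order, axis=0)

print(sorted_elementwise['response']['x'])
```

The x field of the result should match what np.sort(data['response']['x'], axis=0) returns, while each y and z value stays attached to its x.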

However, it seems to me you want to order the entire inner arrays by their first element, i.e. no individual reordering in the inner arrays. To do so, row-by-row, you would do:

test[order[:,0,0]]

This orders the outermost arrays as a whole by the first innermost item (0, 0).
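A sketch of this whole-row reordering, under the same assumptions as before (last field named z, small illustrative array). Here the sort key is taken directly from element (0, 0) of each row, which should select the same column as order[:, 0, 0]; kind='stable' is my addition so that ties keep their original order, as Python's sorted does.

```python
import numpy as np

# Assumed dtype: same nesting as in the question, with the last field named 'z'.
dtype = [('response', [('x', 'f4'), ('y', 'f4'), ('z', 'f4')])]

# Hypothetical dummy data with shape (5, 1, 3).
rng = np.random.default_rng(1)
data = np.zeros((5, 1, 3), dtype=dtype)
for field in ('x', 'y', 'z'):
    data['response'][field] = rng.integers(1, 10, size=data.shape)

# Sort key: response.x of the first innermost element of every row.
keys = data['response']['x'][:, 0, 0]
row_order = np.argsort(keys, kind='stable')

# Fancy-index the outermost axis so each (1, 3) block moves as a unit.
sorted_rows = data[row_order]

print(sorted_rows['response']['x'][:, 0, 0])
```

This is the counterpart of sorted(a_list, key=lambda x: x[0][0]): the inner arrays are left untouched and only their position along the first axis changes.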

The default behavior of np.sort is to sort along the last axis. To get that, change the axis argument to -1 in both the np.argsort and the np.take_along_axis calls above (np.argsort also defaults to axis=-1, so the argument can be omitted there).
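For illustration, a small sketch of the last-axis variant with hand-picked values (the numbers and the names order_last and sorted_last are made up for this example):

```python
import numpy as np

# Assumed dtype: same nesting as in the question, with the last field named 'z'.
dtype = [('response', [('x', 'f4'), ('y', 'f4'), ('z', 'f4')])]

# Hypothetical (2, 1, 3) array; y is 10 * x so it is easy to see that it follows x.
data = np.zeros((2, 1, 3), dtype=dtype)
data['response']['x'] = [[[3, 1, 2]], [[9, 7, 8]]]
data['response']['y'] = [[[30, 10, 20]], [[90, 70, 80]]]

# Sort each innermost array by response.x (axis=-1 is the last axis).
order_last = np.argsort(data['response']['x'], axis=-1)
sorted_last = np.take_along_axis(data, order_last, axis=-1)

print(sorted_last['response']['x'])
print(sorted_last['response']['y'])
```

Each innermost array is reordered independently here, so rows never exchange elements with each other.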
