Some answers on Stack Overflow suggest using an ndarray of ndarrays when working with data in which the number of elements per row is not constant (see "How to make a multidimension numpy array with a varying row size?").

Is NumPy optimized to work on a structure like that (an array of arrays, also called a nested array)?

Here's a simplified example of such a structure:

import numpy as np
x = np.array([1,2,3])
y = np.array([4,5])
data = np.array([x,y],dtype=object)

It's possible to do operations like:

print(data+1)
print(data+data)

But some operations fail, such as:

print(np.sum(data))

What's happening behind the scenes with this type of structure?

  • No. Such an array is basically the same as a list, containing references to the component arrays. Commented Jan 30, 2022 at 18:01
  • Check this ;) numpy.org/devdocs/dev/internals.html if you want to know more about how the NumPy array is organized in memory. Commented Jan 30, 2022 at 18:06
  • My comment is basically a repeat of the accepted answer in your link. There's a difference between explaining what can be done, and suggesting such a use. Commented Jan 30, 2022 at 18:40
  • Thanks for your answers. I updated the question to make it more precise. Commented Jan 30, 2022 at 20:05
  • What was the sum error message? Commented Jan 30, 2022 at 20:28

1 Answer

Like a list, an object dtype array can contain objects of any kind. For example:

In [6]: arr = np.array([1,"two",[1,2,3],np.array([4,5,6])], object)
In [7]: arr
Out[7]: array([1, 'two', list([1, 2, 3]), array([4, 5, 6])], dtype=object)

Look what happens when we do addition:

In [8]: arr+arr
Out[8]: 
array([2, 'twotwo', list([1, 2, 3, 1, 2, 3]), array([ 8, 10, 12])],
      dtype=object)
In [10]: arr*2
Out[10]: 
array([2, 'twotwo', list([1, 2, 3, 1, 2, 3]), array([ 8, 10, 12])],
      dtype=object)

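For lists and strings, these operations are defined as concatenation (for +) and replication (for *). The addition is in effect doing [x.__add__(x) for x in arr], and the multiplication [x.__mul__(2) for x in arr], where __add__ and __mul__ are the class-specific operations.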
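As a minimal sketch of that point, applied to the ragged structure from the question: arithmetic on the object array gives the same results as a plain Python loop over its elements, and the outer array holds only references to the inner arrays. The names via_numpy, via_loop and same_object are illustrative, and filling an np.empty array element by element is just one way to build the structure.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5])

# Build the ragged object array explicitly, one reference per slot.
data = np.empty(2, dtype=object)
data[0] = x
data[1] = y

# Arithmetic on the object array delegates to each element's own + operator...
via_numpy = data + data

# ...so it matches a plain Python loop over the elements.
via_loop = [a + a for a in data]

for got, expected in zip(via_numpy, via_loop):
    assert np.array_equal(got, expected)

# The outer array stores references, not copies of the inner data.
same_object = data[0] is x
```

Each inner addition is still a fast NumPy operation, but the iteration over the outer array happens one Python object at a time, much as it would for a list.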

np.exp doesn't work because it tries to do [x.exp() for x in arr], and almost no one defines an exp method.

In [11]: np.exp(arr)
AttributeError: 'int' object has no attribute 'exp'

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "<ipython-input-11-16c1c90aa297>", line 1, in <module>
    np.exp(arr)
TypeError: loop of ufunc does not support argument 0 of type int which has no callable exp method
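The np.sum failure from the question can be read the same way. The sketch below assumes the ragged data array from the question: the reduction ends up adding the two inner arrays to each other, and their shapes do not broadcast. It also shows two possible workarounds, looping over the inner arrays or concatenating them into one flat numeric array. The names sum_failed, exps, total and flat are illustrative only.

```python
import numpy as np

data = np.empty(2, dtype=object)
data[0] = np.array([1, 2, 3])
data[1] = np.array([4, 5])

# np.sum reduces with +, so it ends up evaluating data[0] + data[1];
# shapes (3,) and (2,) cannot be broadcast together.
try:
    np.sum(data)
    sum_failed = False
except ValueError:
    sum_failed = True

# Workaround 1: apply the NumPy function to each inner array in a Python loop.
exps = [np.exp(a) for a in data]
total = sum(a.sum() for a in data)

# Workaround 2: flatten into one regular numeric array, keeping the row
# lengths separately if the row boundaries are needed later.
lengths = [len(a) for a in data]
flat = np.concatenate(list(data))
flat_total = flat.sum()
```

Which workaround fits better depends on the use case: the loop keeps the rows separate, while the flat array lets NumPy run its compiled loops over all values at once.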

1 Comment

The explanation is super clear. Thanks a lot!