3

I have a large list of objects that include numpy arrays as attributes. They each have methods that manipulate the array. I would like to create a single 2D numpy array that stores the other arrays and updates when the individual numpy arrays are manipulated.

This is easy to accomplish with lists as you simply need to create a list of references to other lists.

>>> x = [1,2,3]
>>> y = [4,5,6]
>>> z = [x,y] # stores reference to x and y
>>> x[0] = 10
>>> z
[[10, 2, 3], [4, 5, 6]]

However, doing the same in numpy creates copies of the object.

>>> x = np.array([1,2,3])
>>> y = np.array([4,5,6])
>>> z = np.array([x,y])  # setting the optional argument copy = False didn't help either
>>> id(x)
140673084678272
>>> id(z[0])
140673084678512

I guessed that setting copy=False wouldn't work because I'm passing a newly built list object that didn't exist before. Is there a way to create z so that its elements are references to the numpy arrays x and y?

I recognize that references in numpy are typically accomplished with views, but this seems to be a different use case. Creating a view of a single numpy array is fairly straightforward, but I'm unsure how to store N views of N individual numpy arrays in one numpy array.

4
  • What's wrong with a list of those arrays? Commented Jan 22, 2020 at 17:12
  • The method which manipulates the arrays requires a numpy array, and the final method I'll be using on the concatenated array requires a numpy array. Commented Jan 22, 2020 at 17:22
  • You could make the new array, and then 'views' of the individual rows (discarding or ignoring the originals). Commented Jan 22, 2020 at 18:16
  • A view only works when the arrays can share the underlying data buffer, that is, when the view accesses the same data as the original but with a different shape and strides. Often the view is a subset of the original, but it is never any sort of superset of several data buffers. Commented Jan 22, 2020 at 18:17
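
A minimal sketch of the approach suggested in the comments above: allocate the 2D array first and treat the per-object arrays as views of its rows. The shape, the dtype and the names x, y, z are assumptions for illustration only.

```python
import numpy as np

# Allocate the combined 2D array first; it owns the data buffer.
z = np.zeros((2, 3), dtype=int)

# Basic integer indexing returns views, so x and y share memory with z.
x = z[0]
y = z[1]

# Fill the rows through the views (slice assignment, not rebinding).
x[:] = [1, 2, 3]
y[:] = [4, 5, 6]

x[0] = 10   # element assignment writes through to z
y *= 2      # in-place arithmetic also writes through to z

print(z)
print(np.shares_memory(x, z))
```

This only holds as long as the rows are modified in place. Rebinding a name, as in x = x + 1, creates a new array that is no longer connected to z.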

2 Answers

1

Create an empty object array of the required length (each slot initially holds None), then assign the list to a slice of the whole array with z[:] = [x, y]. Done. Each element of z is now the same object as the original array, so z sees changes when the individual numpy arrays are manipulated in place. Note that z is a 1D array of references, not a 2D numeric array, and that operations on the whole array always seem to create copies, even when you'd normally expect in-place operations.

import numpy as np
x = np.array([1,2,3])
y = np.array([4,5,6])
z = np.empty(2, dtype=object)
z[:] = [x,y]
print(id(x))
print(id(z[0]))

Output:

140293181519152
140293181519152
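
A rough sketch of what that behaviour looks like in practice, using the same x, y and z as above. The choice of z *= 2 as the whole-array operation is an assumption for illustration.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# 1D object array whose two slots hold references to x and y.
z = np.empty(2, dtype=object)
z[:] = [x, y]

# In-place changes to x are visible through z, because z[0] is x itself.
x[0] = 10
print(z[0] is x)   # same Python object
print(z[0][0])     # value seen through z
print(z.shape)     # (2,), not (2, 3)

# An operation on the whole object array builds new element arrays
# and stores them in z; x itself is left untouched.
z *= 2
print(z[0] is x)
print(x)
```

So the object array behaves much like a list of references: handy for bookkeeping, but it is not a 2D numeric array that functions expecting shape (2, 3) can use directly.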
0

One solution would be to stack() or concatenate() the arrays if certain dimensions match. If the dimensions don't match, pad the arrays, store the original dimensions in a separate list, and then stack.

2 Comments

Those methods do combine the arrays in the way I'm looking for, but after some testing, it seems they do it by making copies, i.e. after combining them, changing the individual arrays doesn't affect the combined array. I'm wondering if there exists a stack or concatenate which combines the references or view objects rather than a new copy of the array.
There is the numpy object dtype, but I view that as a bastardized list. It is most often created by mistake, as in np.array([[1,2,3], [4,5]]). Math on it is hit-or-miss and slower than on a regular numeric array, and iteration is slower than on a list.
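
For reference, a small check of the copying behaviour described in the first comment, assuming the same x and y as in the question:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# np.stack allocates a new buffer and copies x and y into it.
stacked = np.stack([x, y])

# Changing x afterwards does not affect the stacked array.
x[0] = 10

print(np.shares_memory(stacked, x))
print(stacked[0, 0])
```

np.concatenate and np.vstack allocate a new result array in the same way, so none of them can keep the combined array linked to the originals.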
