First off, I apologize for the arbitrariness of this question. I am rewriting some of my scripts to use NumPy arrays instead of nested Python lists (for performance and memory), but I'm still struggling with their declaration.

I am trying to create a structure using NumPy arrays. I am starting with 1000 (an arbitrary value) elements in the array, where each element should contain a float (as [x][0]) and a nested array of coordinates (so 10,000 x 2 floats PER top-level element) (as [x][1], with each element of the nested array accessible as [x][1][y][z], where y is the index into the nested array and z specifies which of the 2 coordinates). The question "Nested Structured Numpy Array" creates a nearly identical structure (as a reference for my question and my desired structure).

Schematic raw data example:

time 0
  m/z 10 int 10
  m/z 20 int 20
  m/z 30 int 1000
  ...
time 1
  <repeat>

I have read that I have to use the dtype argument to define the nested array, but I am not sure how to declare the dimensions for an empty array. Could anyone give me a hand? Here is what I came up with so far.

data=np.zeroes((1000,2 /* Now add nested array */), dtype=[('time', 'f'), [('m/z','f'), ('intensity','f')]])

PS: A matrix might be a better option for this?

  • What does y mean in [x][1][y][z] ? Commented Apr 5, 2013 at 9:07
  • the element of the nested array. I made a typo in the OP (x should read y, let me fix that). Commented Apr 5, 2013 at 9:10
  • You may try a pandas dataframe instead. Commented Apr 5, 2013 at 9:27

1 Answer

>>> a = np.zeros(1000, dtype='float32, (10000,2)float32')
>>> a[200][0]
0.0
>>> a[200][1][2000]
array([ 0.,  0.], dtype=float32)
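The same layout can also be declared with named fields, which is closer to the dtype attempted in the question. The following is a minimal sketch, not the answer's own code: the field names `time`, `peaks`, `mz` and `intensity` are assumptions chosen for illustration, and the sizes are shrunk from 1000 and 10000 so the example runs instantly.

```python
import numpy as np

# Illustrative sizes only; the question uses 1000 time points
# and 10000 peaks per time point.
n_times = 5
n_peaks = 4

# Hypothetical field names, chosen to mirror the raw data in the question.
peak_dtype = np.dtype([('mz', 'f4'), ('intensity', 'f4')])
scan_dtype = np.dtype([('time', 'f4'), ('peaks', peak_dtype, (n_peaks,))])

# One record per time point, each holding n_peaks (mz, intensity) pairs.
data = np.zeros(n_times, dtype=scan_dtype)

# Fields can be filled for all records at once.
data['time'] = np.arange(n_times)
data['peaks']['intensity'][:, 0] = 10.0  # first peak of every time point

# Record-wise access still works, much like the nested lists did.
print(data[2]['time'])
print(data[2]['peaks'][0]['intensity'])
```

Indexing by field name first (`data['peaks']['intensity']`) returns a view of shape (n_times, n_peaks), so whole-array operations stay available even with the nested layout.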

Note that this creates 1000 records, each holding its own array of shape (10000,2). That's fine if you only ever do operations that look at one of those arrays at a time. With a separate (1000,10000,2) array you could take better advantage of vectorized operations in NumPy. For example, you could increment all the second coordinates in one operation like this:

>>> b = np.zeros((1000,10000,2))
>>> b[:,:,1] += 1

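Trying to do the same with a[:][1][:,1] is an error: a[:] is still the whole array, so a[:][1] selects record 1 rather than the coordinate field of every record.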
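For completeness, the structured array can still be updated in one operation if the field is selected by name rather than by position. This is a small sketch with reduced sizes (5 records of shape (4,2) instead of 1000 of shape (10000,2)); it relies on NumPy's default field names `f0` and `f1` for unnamed fields.

```python
import numpy as np

# Same dtype pattern as above, with sizes shrunk for illustration.
a = np.zeros(5, dtype='float32, (4,2)float32')

# Unnamed fields get the default names 'f0' and 'f1'.
print(a.dtype.names)

# Selecting a field by name gives a view over all records,
# here with shape (5, 4, 2).
coords = a['f1']

# Increment the second coordinate of every point in every record.
coords[:, :, 1] += 1

# The change is visible through record-style indexing too.
print(a[3][1][2])
```

Because `a['f1']` is a view, the in-place update writes through to `a` itself.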

1 Comment

@BasJansen You could also consider two individual arrays, one for 1000 floats and the other with shape (1000,10000,2).
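A rough sketch of that two-array layout follows. The variable names `times` and `peaks` are assumptions, and the sizes are reduced for illustration; at the sizes in the question, a float32 array of shape (1000,10000,2) takes about 80 MB.

```python
import numpy as np

# Illustrative sizes; the question uses 1000 and 10000.
n_times, n_peaks = 5, 4

# One float per time point.
times = np.zeros(n_times, dtype=np.float32)

# Last axis holds the two values per peak: index 0 = m/z, index 1 = intensity.
peaks = np.zeros((n_times, n_peaks, 2), dtype=np.float32)

# Vectorized update of every intensity at once.
peaks[:, :, 1] += 1

# Per-time-point reduction, e.g. the summed intensity of each time point.
total_intensity = peaks[:, :, 1].sum(axis=1)
print(total_intensity)
```

The two arrays share their first index, so `times[i]` and `peaks[i]` describe the same time point.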
