
How do I properly combine two numpy ndarrays that have named fields and are the same length into one ndarray? In my example below I would like to combine xnd and ynd into a single numpy ndarray.

I know how to create a new ndarray from the concatenated dtype of xnd and ynd and then iteratively copy the contents from xnd and ynd into that new ndarray. But is there a numpy command that will do this for me?

The fastest and simplest way of combining xnd and ynd would be ideal. Perhaps appending ynd to xnd inplace rather than making a copy? This solution needs to work fast with large ndarrays.

I have seen several examples on how to combine simple n dimensional numpy arrays, but they haven't helped me with this problem. The line with znd = np.join((xnd, ynd)) at the bottom of my example is where I get stuck.

Thanks!

import numpy as np

n = 10

t = np.arange(n)
abc = np.array((t,t+n,t+2*n)).T
y = (t*10).astype(np.uint32) 


# Create x ndarray
xdt = np.dtype([
    ('t', np.float64),
    ('abc', (np.float32, 3) )
    ])
xnd = np.ndarray( shape=n, dtype=xdt)
xnd['t'] = t
xnd['abc'] = abc

# Create y ndarray
ydt = np.dtype([
    ('y', np.uint32),
    ])
ynd = np.ndarray( shape=n, dtype=ydt)
ynd['y'] = y


print(xnd.dtype)
# [('t', '<f8'), ('abc', '<f4', (3,))]
print(ynd.dtype)
# [('y', '<u4')]


# Combine x and y
# This line not correct.  What is the proper way to do this?
znd = np.join((xnd, ynd)) 

print(znd.dtype)
# [('t', '<f8'), ('abc', '<f4', (3,)), ('y', '<u4')]
1 Answer

Here's what the recarray functions (in numpy.lib.recfunctions) do: copy fields by name.

In [10]: zdt=[('t', '<f8'), ('abc', '<f4', (3,)), ('y', '<u4')]
In [11]: znd=np.zeros(xnd.shape, dtype=zdt)

In [12]: for name in xnd.dtype.names:
   ....:     znd[name]=xnd[name]
   ....:     

In [13]: for name in ynd.dtype.names:
   ....:     znd[name]=ynd[name]
   ....:     

In [14]: znd
Out[14]: 
array([(0.0, [0.0, 10.0, 20.0], 0L), (1.0, [1.0, 11.0, 21.0], 10L),
       (2.0, [2.0, 12.0, 22.0], 20L), (3.0, [3.0, 13.0, 23.0], 30L),
       (4.0, [4.0, 14.0, 24.0], 40L), (5.0, [5.0, 15.0, 25.0], 50L),
       (6.0, [6.0, 16.0, 26.0], 60L), (7.0, [7.0, 17.0, 27.0], 70L),
       (8.0, [8.0, 18.0, 28.0], 80L), (9.0, [9.0, 19.0, 29.0], 90L)], 
      dtype=[('t', '<f8'), ('abc', '<f4', (3,)), ('y', '<u4')])

Since the number of records is normally much larger than the number of fields, this iteration is not expensive: the loop runs once per field, and each assignment copies a whole column at once.
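For reference, the copy loop above can be run as a single script outside the IPython session. This is a minimal sketch that rebuilds the question's xnd and ynd; np.zeros is used in place of np.ndarray for the inputs only so that no memory is left uninitialized.

```python
import numpy as np

n = 10
t = np.arange(n)

# Rebuild the two structured arrays from the question.
xnd = np.zeros(n, dtype=[('t', np.float64), ('abc', np.float32, (3,))])
xnd['t'] = t
xnd['abc'] = np.array((t, t + n, t + 2 * n)).T

ynd = np.zeros(n, dtype=[('y', np.uint32)])
ynd['y'] = t * 10

# Target array that holds every field of both inputs.
zdt = [('t', '<f8'), ('abc', '<f4', (3,)), ('y', '<u4')]
znd = np.zeros(xnd.shape, dtype=zdt)

# One vectorized assignment per field, not one per record.
for src in (xnd, ynd):
    for name in src.dtype.names:
        znd[name] = src[name]

print(znd.dtype)
print(znd[:3])
```

The two loops from the session are folded into one loop over (xnd, ynd); the work done is the same.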

There may be a function that creates the union zdt from the individual ones, but I'm not going to dig around for that now.

There is a function that does the field copy recursively. That's needed if the dtype is nested, that is, when a field's own dtype is compound.
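The function is not named here. One candidate that I believe fits the description is numpy.lib.recfunctions.recursive_fill_fields(input, output); treat that identification as an assumption. The nested dtype and the names inner, src, and dst below are made up for illustration and do not come from the question.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical nested dtype: the 'rng' field is itself a structured dtype.
inner = [('lo', np.float32), ('hi', np.float32)]

src = np.zeros(3, dtype=[('t', np.float64), ('rng', inner)])
src['t'] = [0, 1, 2]
src['rng']['lo'] = [1, 2, 3]
src['rng']['hi'] = [4, 5, 6]

# The destination has all of src's fields plus one extra field, 'y'.
dst = np.zeros(3, dtype=[('t', np.float64), ('rng', inner), ('y', np.uint32)])

# Copies matching fields by name, descending into nested fields.
# Fields of dst that src does not have ('y' here) are left untouched.
rfn.recursive_fill_fields(src, dst)

print(dst)
```

For a flat dtype like the one in the question, the plain loop over dtype.names does the same job without the extra import.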

You can also create a new array from a list of tuples - one tuple per record. Here I am using zip() to iterate on the 2 arrays, and joining their records with tuple concatenation.

np.array([tuple(x)+tuple(y) for x,y in zip(xnd,ynd)],dtype=zdt)

I expect this to be slower, at least when there are more rows than fields.
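A rough way to check that expectation is to time both approaches on the same data. This is only a sketch: the array size, the repeat count, and the helper names by_field and by_tuple are arbitrary choices of mine, and the numbers will vary by machine and NumPy version.

```python
import timeit
import numpy as np

n = 10000  # arbitrary size, chosen only for illustration
t = np.arange(n)

xnd = np.zeros(n, dtype=[('t', np.float64), ('abc', np.float32, (3,))])
xnd['t'] = t
xnd['abc'] = np.array((t, t + n, t + 2 * n)).T
ynd = np.zeros(n, dtype=[('y', np.uint32)])
ynd['y'] = t % 1000

zdt = [('t', '<f8'), ('abc', '<f4', (3,)), ('y', '<u4')]

def by_field():
    # One vectorized copy per field.
    out = np.zeros(xnd.shape, dtype=zdt)
    for src in (xnd, ynd):
        for name in src.dtype.names:
            out[name] = src[name]
    return out

def by_tuple():
    # One Python-level tuple per record.
    return np.array([tuple(x) + tuple(y) for x, y in zip(xnd, ynd)], dtype=zdt)

# Both approaches should produce the same data, field for field.
a, b = by_field(), by_tuple()
same = all(np.array_equal(a[name], b[name]) for name in a.dtype.names)

t_field = timeit.timeit(by_field, number=3)
t_tuple = timeit.timeit(by_tuple, number=3)
print("same result:", same)
print("by field: %.4f s, by tuple: %.4f s" % (t_field, t_tuple))
```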


Since there's no overlap in fields in this case, the new dtype can be made by just concatenating the dtype.descr of the two dtypes. descr is a list, so one can be joined to another with +.

In [26]: np.dtype(xnd.dtype.descr+ynd.dtype.descr)
Out[26]: dtype([('t', '<f8'), ('abc', '<f4', (3,)), ('y', '<u4')])
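Putting the descr concatenation and the field copy together gives a small general helper. This is a sketch, not a NumPy function: the name concat_fields and its checks are my own, and it assumes simple packed dtypes like the ones in the question (for dtypes with padding or explicit offsets, descr can contain unnamed filler entries).

```python
import numpy as np

def concat_fields(a, b):
    """Return a new structured array with a's fields followed by b's fields."""
    if a.shape != b.shape:
        raise ValueError("arrays must have the same shape")
    overlap = set(a.dtype.names) & set(b.dtype.names)
    if overlap:
        raise ValueError("duplicate field names: %s" % sorted(overlap))
    # descr is a list of field descriptions, so + concatenates them.
    out = np.empty(a.shape, dtype=np.dtype(a.dtype.descr + b.dtype.descr))
    # Every field is overwritten below, so np.empty is safe here.
    for src in (a, b):
        for name in src.dtype.names:
            out[name] = src[name]
    return out

n = 10
t = np.arange(n)
xnd = np.zeros(n, dtype=[('t', np.float64), ('abc', np.float32, (3,))])
xnd['t'] = t
xnd['abc'] = np.array((t, t + n, t + 2 * n)).T
ynd = np.zeros(n, dtype=[('y', np.uint32)])
ynd['y'] = t * 10

znd = concat_fields(xnd, ynd)
print(znd.dtype)
```

Note that this always allocates a new array. The records of xnd are fixed-size, so a field cannot be appended to it in place.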