0

I have multiple arrays without column or row names, and I would like to combine them using something like numpy.vstack() or numpy.hstack().

Column and row labels can be assigned when creating a structured array, but hstack and vstack don't seem to have this functionality.

import numpy as np
a1 = np.array([1,2,3,4])
a2 = np.array([5,6,7,8])
a3 = np.vstack([a1,a2],dtype=[('RowName1','double'),('RowName2','double')])

yielding:

TypeError: vstack() got an unexpected keyword argument 'dtype'


Any suggestions?

3 Answers

2

One possible option is the following (worth mentioning since recfunctions are pretty well hidden):

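import numpy as np
from numpy.lib import recfunctions
# give each array a one-field structured dtype; the field name becomes the label
a1 = np.array([1,2,3,4]).astype([('RowName1',float)])
a2 = np.array([5,6,7,8]).astype([('RowName2',float)])
recfunctions.merge_arrays((a1, a2))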

I originally had the version below, but it has a few pitfalls to be careful with because of how view reinterprets the underlying memory, so it's better to just create a new recarray from the concatenated array.

You could just turn the logic around:

import numpy as np
a1 = np.array([1,2,3,4])
a2 = np.array([5,6,7,8])
# ok, not that beautiful. But if your arrays are the correct type to begin with
# you can skip that astype call. Using `np.c_[]` since it happens to concatenate right.
a3 = np.c_[a1,a2].astype(float).copy('C').view(dtype=[('RowName1',float),('RowName2',float)])
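
As a hedged illustration of why the memory layout matters for view (this sketch is not part of the original answer, and the names `stacked`, `bad` and `good` are made up for the example): view groups values that sit next to each other in memory, so stacking by rows pairs up neighbouring elements of one array, while stacking by columns pairs one element from each array.

```python
import numpy as np

a1 = np.array([1, 2, 3, 4])
a2 = np.array([5, 6, 7, 8])
names = [('RowName1', float), ('RowName2', float)]

# Row-wise stack: memory order is 1,2,3,4,5,6,7,8, so view() pairs (1,2), (3,4), ...
stacked = np.vstack([a1, a2]).astype(float)
bad = stacked.view(dtype=names)
print(bad.shape)
print(bad['RowName1'])   # neighbouring elements of each row, not a1

# Column-wise stack: memory order is 1,5,2,6,..., so each record is (a1[i], a2[i])
good = np.ascontiguousarray(np.column_stack([a1, a2]), dtype=float).view(dtype=names)
print(good.shape)
print(good['RowName1'].ravel())
```

Note that even the column-wise version keeps a trailing axis of length 1, which is one more reason to prefer building a fresh structured array over using view.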

4 Comments

I don't think that works as the OP wanted. First, it will view integer arrays as double (i.e., garbled), and it will have the wrong shape because the wanted index is by row and not by column.
@tiango, right :/... I didn't check that vstack wasn't really right to begin with, and without an explicit cast to the right type, the view is bad of course.
It still gives a wrong result. If you look at your a3['RowName1'], it gives a 2D array: array([[ 1., 5.], [ 3., 7.]]). I'm not trying to criticise your answers, just genuinely curious. Even if a1, a2 are of the same type and concatenated in the right direction, it doesn't seem possible to get a view per row -- the shape is wrong.
@tiango yeah sorry, it was not a good idea with view. The reason is that the concatenated array is not C-contiguous and view only reinterprets the underlying memory, but because it is not C-contiguous that is not the same as the wanted result... It's actually a good example of why one needs to be careful with view...
2

You might also consider looking at pandas. Pandas has a nice data frame data structure that might be good.

Of course, this requires you to add another dependency to your project. Luckily, if you are already using numpy then Pandas is pretty easy to get going.
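
A minimal sketch of what that could look like, assuming pandas is installed (the variable `df` and the choice to store each array as a labelled row are illustrative, not something the question prescribes):

```python
import numpy as np
import pandas as pd

a1 = np.array([1, 2, 3, 4])
a2 = np.array([5, 6, 7, 8])

# Stack the arrays as rows and attach a label to each row via the index
df = pd.DataFrame(np.vstack([a1, a2]), index=['RowName1', 'RowName2'])
print(df.loc['RowName1'].to_numpy())

# Transposing turns the same labels into column names
print(df.T['RowName2'].to_numpy())
```

Row lookup goes through `df.loc[...]`; after transposing, plain `df.T[...]` indexing selects by column label instead.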


1

vstack won't create a structured array for you; it simply concatenates 'standard' numpy arrays. The easiest way is to create an empty structured array and then fill it with the rows that you want:

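import numpy as np
a1 = np.array([1,2,3,4])
a2 = np.array([5,6,7,8])
# one record per element, one named field per input array
a3 = np.empty(4, dtype=[('RowName1','double'),('RowName2','double')])
a3['RowName1'] = a1
a3['RowName2'] = a2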
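
If you need this for more than two arrays, the same idea can be wrapped in a small helper. This is only a sketch: `stack_named` is a hypothetical name, not a NumPy function, and it assumes all inputs are 1-D and of equal length.

```python
import numpy as np

def stack_named(arrays, names, dtype=float):
    """Copy equal-length 1-D arrays into one structured array, one field per name."""
    arrays = [np.asarray(a) for a in arrays]
    if len(arrays) != len(names):
        raise ValueError("need exactly one name per array")
    length = arrays[0].shape[0]
    if any(a.shape != (length,) for a in arrays):
        raise ValueError("all arrays must be 1-D and of equal length")
    out = np.empty(length, dtype=[(name, dtype) for name in names])
    for name, a in zip(names, arrays):
        out[name] = a  # values are cast to the field's dtype on assignment
    return out

a1 = np.array([1, 2, 3, 4])
a2 = np.array([5, 6, 7, 8])
a3 = stack_named([a1, a2], ['RowName1', 'RowName2'])
print(a3['RowName1'])
print(a3[0])
```

Because the data is copied field by field, this avoids the memory-layout questions that come with view.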
