
If I want to create a numpy array with dtype = [('index','<u4'),('valid','b1')], and I have separate numpy arrays for the 32-bit index and boolean valid values, how can I do it?

I don't see a way in the numpy.ndarray constructor; I know I can do this:

arr = np.zeros(n, dtype = [('index','<u4'),('valid','b1')])
arr['index'] = indices
arr['valid'] = validity

but somehow calling np.zeros() first seems wrong.

Any suggestions?

  • There's nothing wrong with filling in the 'columns' like this. The only alternative is to give it a list of tuples, [(i[0], v[0]), (i[1], v[1]), ...]. Commented Dec 31, 2014 at 22:19
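For illustration, the list-of-tuples route mentioned in this comment might look like the sketch below. The sample values for indices and validity are made up for the example; only the dtype comes from the question.

```python
import numpy as np

dt = np.dtype([('index', '<u4'), ('valid', 'b1')])

# Hypothetical sample data standing in for the asker's two arrays.
indices = np.array([10, 20, 30], dtype='<u4')
validity = np.array([True, False, True])

# Pair the elements up row by row; np.array then unpacks each
# tuple into the fields of the structured dtype.
arr = np.array(list(zip(indices, validity)), dtype=dt)
```

This builds an intermediate Python list of tuples, so it is likely slower than filling the fields directly when the arrays are large.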

2 Answers

3

An alternative is

arr = np.fromiter(zip(indices, validity), dtype=[('index','<u4'),('valid','b1')])

but I suspect your initial idea is more efficient. (In your approach, you could use np.empty() instead of np.zeros() for a tiny performance benefit.)
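A minimal sketch of both variants side by side, assuming some made-up input arrays (the values of indices and validity here are only for demonstration):

```python
import numpy as np

dt = np.dtype([('index', '<u4'), ('valid', 'b1')])

# Hypothetical inputs; in the question these already exist.
indices = np.arange(5, dtype='<u4')
validity = indices % 2 == 0

# Variant 1: allocate without initializing, then fill each field.
by_field = np.empty(len(indices), dtype=dt)
by_field['index'] = indices
by_field['valid'] = validity

# Variant 2: build the records one at a time from an iterator of tuples.
by_iter = np.fromiter(zip(indices, validity), dtype=dt)

# Compare field by field to confirm both variants hold the same data.
same = all(np.array_equal(by_field[name], by_iter[name]) for name in dt.names)
```

Since np.empty() leaves the memory uninitialized, every field has to be assigned before the array is read.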


1

Just use empty instead of zeros, and it should feel less 'wrong', since you are just allocating the data without unnecessarily zeroing it.

Or use fromiter, and also pass the optional count argument if you're keen on performance.

In any case, this is a matter of taste in more than 99% of use cases, and it won't lead to any noticeable performance improvement IMHO.
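As a rough sketch of the fromiter option with count, again with made-up sample inputs:

```python
import numpy as np

dt = np.dtype([('index', '<u4'), ('valid', 'b1')])

# Hypothetical inputs for illustration only.
indices = np.array([7, 8, 9, 10], dtype='<u4')
validity = np.array([True, True, False, True])
n = len(indices)

# Passing count lets fromiter allocate the output once up front
# instead of growing it while it consumes the iterator.
arr = np.fromiter(zip(indices, validity), dtype=dt, count=n)
```

If the iterator yields fewer than count items, fromiter raises a ValueError, so count should match the length of the input arrays.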
