creating numpy structured arrays from columns

Question

If I want to create a numpy array with dtype = [('index','<u4'),('valid','b1')], and I have separate numpy arrays for the 32-bit index and boolean valid values, how can I do it?

I don't see a way in the numpy.ndarray constructor; I know I can do this:

arr = np.zeros(n, dtype = [('index','<u4'),('valid','b1')])
arr['index'] = indices
arr['valid'] = validity

but somehow calling np.zeros() first seems wrong.

Any suggestions?

There's nothing wrong with filling in the 'columns' like this. The only alternative is to give it a list of tuples, [(i[0], v[0]), (i[1], v[1]), ...].
– hpaulj, Commented Dec 31, 2014 at 22:19
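
For reference, the list-of-tuples form from the comment above might look like the sketch below. The sample values and the `.tolist()` conversion are assumptions made for illustration; only the dtype comes from the question.

```python
import numpy as np

# dtype taken from the question; the sample data is made up for illustration
dt = np.dtype([('index', '<u4'), ('valid', 'b1')])
indices = np.array([10, 20, 30], dtype='<u4')
validity = np.array([True, False, True])

# Build one (index, valid) tuple per row, then let np.array apply the dtype.
rows = list(zip(indices.tolist(), validity.tolist()))
arr = np.array(rows, dtype=dt)

print(arr['index'])  # the 'index' column
print(arr['valid'])  # the 'valid' column
```

This goes through an intermediate Python list, so it is likely slower than assigning whole columns for large inputs.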

Warren Weckesser · Accepted Answer · 2014-12-31 20:38:14Z

3

An alternative is

arr = np.fromiter(zip(indices, validity), dtype=[('index','<u4'),('valid','b1')])

but I suspect your initial idea is more efficient. (In your approach, you could use np.empty() instead of np.zeros() for a tiny performance benefit.)
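
A minimal sketch of the column-filling approach with np.empty() in place of np.zeros(), wrapped in a helper. The name `from_columns`, the length check, and the sample data are hypothetical and not part of the original question or answer.

```python
import numpy as np

def from_columns(indices, validity):
    """Hypothetical helper: pack two equal-length 1-D arrays into a structured array."""
    dt = np.dtype([('index', '<u4'), ('valid', 'b1')])
    if len(indices) != len(validity):
        raise ValueError("columns must have the same length")
    # np.empty leaves the memory uninitialized; that is safe here only
    # because every field is overwritten immediately below.
    arr = np.empty(len(indices), dtype=dt)
    arr['index'] = indices
    arr['valid'] = validity
    return arr

arr = from_columns(np.array([7, 8, 9], dtype='<u4'),
                   np.array([True, True, False]))
print(arr.dtype.names)
```

The field assignments copy each column in a single vectorized operation, which is why this is expected to beat a row-by-row iterator.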

gg349 · Accepted Answer · 2014-12-31 23:41:48Z

1

Just use empty instead of zeros, and it should feel less 'wrong', since you are just allocating the data without unnecessarily zeroing it.

Or use fromiter, and also pass the optional count argument if you're keen on performance.

In any case, this is a matter of taste in more than 99% of use cases and, IMHO, won't lead to any noticeable performance improvement.
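
A short sketch of the fromiter variant with count, assuming the columns are NumPy arrays of known length. The sample data is made up for illustration.

```python
import numpy as np

dt = np.dtype([('index', '<u4'), ('valid', 'b1')])
indices = np.arange(5, dtype='<u4')
validity = indices % 2 == 0

# With count given, fromiter can allocate the output once instead of
# growing it while it consumes the iterator.
arr = np.fromiter(zip(indices, validity), dtype=dt, count=len(indices))

print(arr['index'])
print(arr['valid'])
```

Even with count, the rows are still produced one at a time in Python, so this is unlikely to outperform direct column assignment.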
