How to have a numpy array with mixed types?

Question

I would like to create a numpy array with mixed types. The other SO questions that I found create either an object-based array or a nested array.

I do not want either of those.

What would the syntax look like for a numpy array with one str column and two int columns?

This is my present code:

import numpy as np

b = np.empty((0, 3), )
b = np.insert(b, b.shape[0], [[1, 2, 3]], axis=0)
b = np.insert(b, b.shape[0], [[1, 2, 3]], axis=0)

print(b)
print("---")

a = np.empty((0, 3), dtype='S4, int, int')
a = np.insert(a, a.shape[0], ("a", 2, 3), axis=0)
a = np.insert(a, a.shape[0], ("a", 2, 3), axis=0)

print(a)

The output:

[[1. 2. 3.]
 [1. 2. 3.]]
---
[[(b'a', 2, 3) (b'a', 2, 3) (b'a', 2, 3)]
 [(b'a', 2, 3) (b'a', 2, 3) (b'a', 2, 3)]]

EDIT:

And what I need for the array a is:

[["a" 2 3]
 ["a" 2 3]]

np.array([('a', 1, 2), ('b', 2, 3)], dtype=np.dtype('S4, int, int'))
– alkasm, Commented Oct 11, 2018 at 4:16
Sorry, my question was completely misleading. I hope that it is clearer now.
– user9098935, Commented Oct 11, 2018 at 7:25

hpaulj · Accepted Answer · 2018-10-11 16:04:57Z

Your second array is close, though I'd do it with indexing rather than insert (which is slower):

In [431]: a = np.zeros(3, dtype='S4, int, int')
In [432]: a[0] = ('a', 2, 3)
In [433]: a[1] = 1
In [434]: a
Out[434]: 
array([(b'a', 2, 3), (b'1', 1, 1), (b'', 0, 0)],
      dtype=[('f0', 'S4'), ('f1', '<i8'), ('f2', '<i8')])
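
As a minimal sketch of the indexing approach above, the following preallocates one record per row and fills the records in a loop. The `rows` list and its values are hypothetical, chosen only for illustration, and 'U4' is used in place of 'S4' so the strings come back as str rather than bytes.

```python
import numpy as np

# Hypothetical input rows; names and values are illustrative only.
rows = [("a", 2, 3), ("b", 4, 5), ("c", 6, 7)]

# Preallocate one record per row: a 1-d array with three fields.
a = np.zeros(len(rows), dtype="U4, int, int")

# Assign each record by index. Each value is a tuple with one entry per field.
for i, row in enumerate(rows):
    a[i] = row

print(a)
```

This only works when the number of rows is known up front. If it is not, one option is to collect the tuples in a Python list and build the array once at the end.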

A list of tuples is also a good way of constructing such an array:

In [436]: a = np.array([('a',2,3),('b',4,5)], dtype='S4, int, int')
In [437]: a
Out[437]: 
array([(b'a', 2, 3), (b'b', 4, 5)],
      dtype=[('f0', 'S4'), ('f1', '<i8'), ('f2', '<i8')])

Note that the shape is 1d (n,), with 3 fields. The fields don't count as a dimension.

Fields are accessed by name, not 'column' number:

In [438]: a['f1']
Out[438]: array([2, 4])
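
The points about shape and field access can be checked in a standalone script. This is a small sketch that rebuilds the same two-record array and relies on the default field names 'f0', 'f1', 'f2' that numpy assigns when the dtype string gives no names.

```python
import numpy as np

a = np.array([("a", 2, 3), ("b", 4, 5)], dtype="U4, int, int")

# The array is 1-d: two records, each with three fields.
print(a.shape)
print(a.dtype.names)

# A field name selects one "column" across all records.
print(a["f1"])

# An integer index selects one whole record, and a field can be read from it.
first = a[0]
print(first["f0"], first["f1"], first["f2"])
```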

You made a (2,3) array, and filled each 'row' with the same thing. That's why you have repeats, while I don't.
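
To make the difference concrete, here is a sketch comparing the (0, 3) shape from the question with a 1-d shape of (0,). It grows the 1-d array with np.concatenate, which is an illustrative choice and not the method used in the question or the answer.

```python
import numpy as np

dt = np.dtype("S4, int, int")

# Shape (0, 3) asks for 3 records per row, on top of the 3 fields per record.
two_d = np.empty((0, 3), dtype=dt)

# Shape (0,) asks for a 1-d array of records.
one_d = np.empty((0,), dtype=dt)

# A single record wrapped in a length-1 array of the same dtype.
row = np.array([("a", 2, 3)], dtype=dt)

# Growing the 1-d array adds exactly one record per step, with no repeats.
one_d = np.concatenate([one_d, row])
one_d = np.concatenate([one_d, row])

print(two_d.shape, one_d.shape)
print(one_d)
```

Each concatenation copies the whole array, so for many rows it is usually cheaper to build the array in one step from a list of tuples.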

With a unicode string dtype (default for Py3):

In [439]: a = np.array([('a',2,3),('b',4,5)], dtype='U4, int, int')
In [440]: a
Out[440]: 
array([('a', 2, 3), ('b', 4, 5)],
      dtype=[('f0', '<U4'), ('f1', '<i8'), ('f2', '<i8')])
In [441]: print(a)
[('a', 2, 3) ('b', 4, 5)]
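
The default names 'f0', 'f1', 'f2' can be replaced by spelling out the dtype as a list of (name, type) pairs. The names "name", "x" and "y" below are hypothetical and chosen only for this sketch.

```python
import numpy as np

# Hypothetical field names; any valid strings would do.
dt = np.dtype([("name", "U4"), ("x", np.int64), ("y", np.int64)])

a = np.array([("a", 2, 3), ("b", 4, 5)], dtype=dt)

# The string field comes back as str, without the b'' prefix.
print(a["name"])

# Each integer field is an ordinary integer array and supports arithmetic.
total = a["x"] + a["y"]
print(total)
```

Note that 'U4' is a fixed-width type, so strings longer than four characters are truncated on assignment.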

edited Oct 11, 2018 at 16:04

answered Oct 11, 2018 at 4:21

4 Comments

user9098935 Over a year ago

Sorry, my question was completely misleading. I hope that it is clearer now.

hpaulj Over a year ago

What's changed? My answer gives you what you want, just replacing 'columns' with 'fields'.

hpaulj Over a year ago

If you used 'U4' instead of 'S4' you wouldn't get the b'a' notation.

user9098935 Over a year ago

Thanks for your input. My intention is to be able to add entries to the array on demand, e.g. in a loop. The parts of the array that are of type int should be processable with numba (numba.pydata.org). Maybe it is better to leave this Stack Overflow question as it is for the moment and create a new question where I state my requirements more precisely at the beginning?
