Access attribute of elements within numpy array

Question

I have a numpy array full of objects (dtype=object) of the cftime class.

In [1]: a
Out[1]: 
array([cftime.DatetimeNoLeap(2000, 1, 1, 11, 29, 59, 999996, 5, 1),
       cftime.DatetimeNoLeap(2000, 1, 2, 11, 29, 59, 999996, 6, 2),
       cftime.DatetimeNoLeap(2000, 1, 3, 11, 29, 59, 999996, 0, 3)],
      dtype=object)

In [2]: type(a[0])
Out[2]: cftime._cftime.DatetimeNoLeap

Each of these objects has an attribute month.

a[0].month
Out[66]: 1

I'd like to get a new numpy array with the same shape, but filled with this attribute for each of the elements of the original array. Something like b=a.month. But obviously this fails, as a is a numpy array without month attribute. How can I achieve this result?

PS: of course I could do this with a plain Python loop, but I'd like to follow a fully numpy approach:

b=np.zeros_like(a, dtype=int)
for i in range(a.size):
    b[i] = a[i].month

Not a numpy answer but short of that you should use a loop/list comprehension. You can create a list by saying list = [ele] * n , but the elements all reference the same memory space - modifying any of them will affect the others. Loop/list comprehension avoids this.
– KuboMD, Commented Jan 15, 2019 at 13:36
Why the object array instead of a list? It's not any faster or easier.
– hpaulj, Commented Jan 15, 2019 at 16:30
Not my choice. This is how I get the data from a preliminary call to the num2date function of the cftime package.
– Pythonist, Commented Jan 16, 2019 at 9:46
cftime is written in cython (Python compiled to C as much as possible). So make sure you use its own functionality as much as possible.
– hpaulj, Commented Jan 16, 2019 at 17:06

yatu · Accepted Answer · 2019-01-15 13:38:56Z

You can use np.vectorize to map a function over every element of the array. Here, a small lambda, lambda x: x.month, extracts the month of each entry:

np.vectorize(lambda x: x.month)(a)
array([1, 1, 1])
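
As a variation on the call above, here is a minimal sketch that passes otypes so the result dtype is fixed up front, and uses a 2-D array to show that the shape is preserved. It assumes datetime.date objects as stand-ins for the cftime objects (any object with a month attribute should behave the same way); the names dates, get_month and months are illustrative only.

```python
import datetime
from operator import attrgetter

import numpy as np

# Stand-in data (assumption): datetime.date objects instead of cftime objects,
# arranged as a 2x2 object array to show that the shape is preserved.
dates = np.array(
    [[datetime.date(2000, 1, 1), datetime.date(2000, 2, 2)],
     [datetime.date(2000, 3, 3), datetime.date(2000, 12, 4)]],
    dtype=object,
)

# otypes fixes the output dtype, so vectorize does not have to infer it
# from a trial call on the first element.
get_month = np.vectorize(attrgetter("month"), otypes=[int])

months = get_month(dates)
print(months.shape, months.dtype)
print(months)
```

attrgetter("month") does the same job as lambda x: x.month; either can be passed to np.vectorize.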


6 Comments

hpaulj Over a year ago

Using np.frompyfunc might be faster. vectorize uses it, but tends to be slower.

yatu Over a year ago

Thanks for your comment, will give it a look :-)

yatu Over a year ago

Did it help @Onturenio ? Don't forget to upvote/accept the answer if it did, thanks!

Pythonist Over a year ago

I tried it and it worked, but I also read about the fact that vectorize is pretty much a wrapper for a loop, so not really a numpy-performance approach. But I acknowledge that your solution works, so I'll accept it as solved. Still, I'll try to research other options that might be faster, perhaps the frompyfunc is the way to proceed.

yatu Over a year ago

Yes, that's right. I do not think this can be vectorized; after all, numpy is not really a tool for working with datetime objects or similar. So you'll have to use something similar to a map in standard Python, and in numpy you can use either vectorize or frompyfunc, as @hpaulj suggests.

hpaulj · Answer · 2019-01-16 17:29:53Z

I don't have cftime installed, so I'll demonstrate with regular datetime objects.

First make an array of datetime objects - the lazy way using numpy's own datetime dtype:

In [599]: arr = np.arange('2000-01-11','2000-12-31',dtype='datetime64[D]')
In [600]: arr.shape
Out[600]: (355,)

Make an object dtype array from that:

In [601]: arrO = arr.astype(object)

and a list of datetimes as well:

In [602]: alist = arr.tolist()

Timing for regular list comprehension:

In [603]: timeit [d.month for d in alist]
20.1 µs ± 62.7 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

A list comprehension on an object dtype array is usually a bit slower than on a list (but faster than a list comprehension on a regular, non-object array):

In [604]: timeit [d.month for d in arrO]
30.7 µs ± 266 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

frompyfunc - here it's slower; other times I've seen it 2x faster than a list comprehension:

In [605]: timeit np.frompyfunc(lambda x: x.month, 1,1)(arrO)
51 µs ± 32.4 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

vectorize is (nearly) always slower than frompyfunc (even though it uses frompyfunc for the actual iteration):

In [606]: timeit np.vectorize(lambda x: x.month, otypes=[int])(arrO)
76.7 µs ± 123 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Here are samples of the arrays and list:

In [607]: arr[:5]
Out[607]: 
array(['2000-01-11', '2000-01-12', '2000-01-13', '2000-01-14',
       '2000-01-15'], dtype='datetime64[D]')
In [608]: arrO[:5]
Out[608]: 
array([datetime.date(2000, 1, 11), datetime.date(2000, 1, 12),
       datetime.date(2000, 1, 13), datetime.date(2000, 1, 14),
       datetime.date(2000, 1, 15)], dtype=object)
In [609]: alist[:5]
Out[609]: 
[datetime.date(2000, 1, 11),
 datetime.date(2000, 1, 12),
 datetime.date(2000, 1, 13),
 datetime.date(2000, 1, 14),
 datetime.date(2000, 1, 15)]

frompyfunc and vectorize are best used when you want the generality of broadcasting and multidimensional arrays. For 1d arrays, a list comprehension is nearly always better.
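
To illustrate the multidimensional case, here is a minimal sketch that applies frompyfunc to a 2-D object array and compares it with a list comprehension over the flattened array. It assumes datetime.date stand-ins, and the names dates, month_of, raw, months and flat are illustrative only. Note that frompyfunc returns an object dtype array, so a cast is needed to get integers.

```python
import datetime

import numpy as np

# Stand-in data (assumption): six datetime.date objects in a 2x3 object array.
dates = np.array(
    [datetime.date(2000, m, 15) for m in range(1, 7)], dtype=object
).reshape(2, 3)

# frompyfunc(func, nin, nout): one input, one output.
month_of = np.frompyfunc(lambda d: d.month, 1, 1)

raw = month_of(dates)      # same shape as dates, but dtype=object
months = raw.astype(int)   # cast to an integer dtype

# List-comprehension equivalent: flatten, extract, then restore the shape.
flat = np.array([d.month for d in dates.ravel()]).reshape(dates.shape)

print(raw.dtype, months.dtype)
print(months)
```

With frompyfunc the shape handling comes for free; the list comprehension needs the explicit ravel and reshape.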

To be fairer to frompyfunc, I should return an array from the list comprehension:

In [610]: timeit np.array([d.month for d in arrO])
50.1 µs ± 36.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

To get the best speed with dates in numpy, use the datetime64 dtype instead of object dtype. This makes more use of compiled numpy code.

In [611]: timeit arr = np.arange('2000-01-11','2000-12-31',dtype='datetime64[D]'
     ...: )
3.16 µs ± 51 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [616]: arr.astype('datetime64[M]')[::60]
Out[616]: 
array(['2000-01', '2000-03', '2000-05', '2000-07', '2000-09', '2000-11'],
      dtype='datetime64[M]')
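
If integer month numbers (1 to 12) are wanted rather than datetime64[M] values, one possible sketch is shown below. It relies on datetime64[M] being stored as a count of months since 1970-01, and it applies only to datetime64 arrays, not to object arrays of cftime instances like the one in the question. The name months is illustrative.

```python
import numpy as np

arr = np.arange('2000-01-11', '2000-12-31', dtype='datetime64[D]')

# Truncate each date to month resolution, view it as a count of months
# since 1970-01, then reduce modulo 12 to get the calendar month (1..12).
months = arr.astype('datetime64[M]').astype(int) % 12 + 1

print(months[:3], months[-3:])
```

Everything here runs in compiled numpy code, with no Python-level call per element.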

Thanks for your comprehensive answer. Just what I needed as I am dealing with datetime64.
