I have several time series stored in arrays and want to extract the values between given dates in the simplest way possible, without loops. Here's an example:

from numpy import *
from datetime import *

# datetime array
date_a=array([
datetime(2000,1,1),
datetime(2000,1,2),
datetime(2000,1,3),
datetime(2000,1,4),
datetime(2000,1,5),
])

# item array, indices corresponding to datetime array
item_a=array([1,2,3,4,5])

# extract items in a certain date range
# after a certain date, works fine
item_b=item_a[date_a >= (datetime(2000,1,3))] #Out: array([3, 4, 5])

# between dates ?
item_c=item_a[date_a >= (datetime(2000,1,3)) and date_a <= (datetime(2000,1,4))]
# returns: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Is there a one-line solution to this? I have looked at numpy's any() and all(), and also where(), without finding a solution. I'd appreciate any help or pointers in the right direction!

3 Answers

4

If you want one-liner, then you can use

item_c=item_a[(date_a >= (datetime(2000,1,3))) * (date_a <= (datetime(2000,1,4)))]
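A minimal sketch of the same boolean mask written with & instead of *, assuming numpy is imported as np rather than with from numpy import * as in the question. The names start, end and mask are introduced here for illustration only. Each comparison needs its own parentheses because & binds more tightly than >= and <=.

```python
import numpy as np
from datetime import datetime

# Same sample data as in the question
date_a = np.array([datetime(2000, 1, d) for d in range(1, 6)])
item_a = np.array([1, 2, 3, 4, 5])

start, end = datetime(2000, 1, 3), datetime(2000, 1, 4)

# Elementwise AND of two boolean arrays; Python's `and` would instead
# try to reduce each array to a single truth value and raise ValueError
mask = (date_a >= start) & (date_a <= end)
item_c = item_a[mask]
print(item_c)  # [3 4]
```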

2 Comments

Brilliant, just what I was searching for! Thanks :)
Just FYI: & is much more readable than *, here, and it does exactly the same thing.
3

It's not clear to me why you are using the item_a variable. But to isolate the entries you want you can simply do:

>>> np.where(np.logical_and(date_a >= datetime(2000,1,3), date_a <= datetime(2000,1,4)))
(array([2, 3]),)

The resulting indexes are zero-based, so they correspond to the third and fourth element of your array.
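As a self-contained sketch of this approach (the names start, end, in_range and idx are illustrative, not from the question), the indices returned by np.where can also be used to pull the matching values out of item_a if they are needed:

```python
import numpy as np
from datetime import datetime

# Same sample data as in the question
date_a = np.array([datetime(2000, 1, d) for d in range(1, 6)])
item_a = np.array([1, 2, 3, 4, 5])

start, end = datetime(2000, 1, 3), datetime(2000, 1, 4)

# Boolean mask of the dates that fall inside the closed interval
in_range = np.logical_and(date_a >= start, date_a <= end)

# With a single argument, np.where returns a tuple with one index
# array per dimension, so take element 0 for a 1-D array
idx = np.where(in_range)[0]
print(idx)          # [2 3]
print(item_a[idx])  # [3 4]
```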

EDIT: np is due to import numpy as np. Doing from numpy import * is in fact a very bad idea. You will overwrite built in functions such as sum and abs for example...
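A small sketch of what that shadowing looks like in practice, using sum as the example. The star import is run inside a separate namespace dictionary (ns, a name chosen here for illustration) so that the built-in stays available for comparison.

```python
import builtins
import numpy as np

# Simulate `from numpy import *` inside an isolated namespace
ns = {}
exec("from numpy import *", ns)

# The built-in treats the second argument as a start value: 0+1+2+3+4-1
builtin_result = builtins.sum(range(5), -1)

# After the star import, `sum` is numpy's version, which treats the
# second argument as an axis: 0+1+2+3+4 along the last axis
shadowed_result = ns["sum"](range(5), -1)

print(builtin_result)   # 9
print(shadowed_result)  # 10
```

The same call silently gives a different answer, which is why import numpy as np is the safer habit.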

HTH!

1 Comment

Thanks! Coming from MATLAB, there is still plenty to learn about Python's behaviour :) This answer and the one from @Andrey Sobolev were exactly what I was looking for. item_a was just there to get the values of the array, but it is indeed not needed, as it is the indices I'm interested in.
1

I think the following should work for you, using a list comprehension:

[item_a[i] for i in range(len(date_a)) if datetime(2000,1,3) <= date_a[i] <= datetime(2000,1,4)]

This selects every item_a[i] with 0 <= i < len(date_a) for which datetime(2000,1,3) <= date_a[i] <= datetime(2000,1,4).

1 Comment

Works like a charm! I'm trying to avoid loops though due to very large datasets, but I will test this implementation as well.
