In [90]: list_dates = [datetime.date(2014,2,2),datetime.date(2015,2,2), datetime.date(2013,4,5)]

In [91]: df = DataFrame(list_dates, columns=['Date'])

In [92]: df
Out[92]: 
         Date
0  2014-02-02
1  2015-02-02
2  2013-04-05

Now I want to get a new DataFrame with only the dates that are from years 2014 and 2013:

In [93]: result = DataFrame([date for date in df.Date if date.year in (2014,2013)])

In [94]: result
Out[94]: 
            0
0  2014-02-02
1  2013-04-05

That works and gives me the desired DataFrame. Why doesn't the following work:

In [95]: result1 = df[df.Date.map(lambda x: x.year) == 2014 or df.Date.map(lambda x: x.year) == 2013]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-95-86f01906c89b> in <module>()
----> 1 result1 = df[df.Date.map(lambda x: x.year) == 2014 or df.Date.map(lambda x: x.year) == 2013]

/home/marcos/anaconda/lib/python2.7/site-packages/pandas/core/generic.pyc in __nonzero__(self)
    690         raise ValueError("The truth value of a {0} is ambiguous. "
    691                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 692                          .format(self.__class__.__name__))
    693 
    694     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Or the following:

In [96]: df['year'] = df.Date.map(lambda x: x.year)

In [97]: result2 = df[df.year in (2014, 2013)]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-97-814358a4edff> in <module>()
----> 1 result2 = df[df.year in (2014, 2013)]

/home/marcos/anaconda/lib/python2.7/site-packages/pandas/core/generic.pyc in __nonzero__(self)
    690         raise ValueError("The truth value of a {0} is ambiguous. "
    691                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 692                          .format(self.__class__.__name__))
    693 
    694     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I think the problem is that when I use the 'in' operator, I am checking whether the whole Series is in a tuple. How do I make the evaluation elementwise so that I get the result I want?

1 Answer

I'd convert the dates to datetime objects using to_datetime. This lets you use the dt accessor to get the year attribute. You can then call isin with a list of the years of interest to filter the df:

In [68]: df['Date'] = pd.to_datetime(df['Date'])

In [69]: df[df['Date'].dt.year.isin([2013,2014])]
Out[69]: 
        Date
0 2014-02-02
2 2013-04-05
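
As for why the original attempts fail: `or` and `in` both ask Python for a single truth value of the whole Series, which is what raises the ValueError. The elementwise counterpart of `or` is the `|` operator. A minimal sketch, assuming pandas 0.15.0 or later (for the dt accessor); the names `years` and `result1` are just for illustration:

```python
import datetime
import pandas as pd

list_dates = [datetime.date(2014, 2, 2),
              datetime.date(2015, 2, 2),
              datetime.date(2013, 4, 5)]
df = pd.DataFrame(list_dates, columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'])

# Year of each row, computed elementwise via the dt accessor.
years = df['Date'].dt.year

# `|` combines the two boolean Series row by row. The parentheses are
# required because `|` binds more tightly than `==`.
result1 = df[(years == 2014) | (years == 2013)]
print(result1)
```

For more than a couple of values, isin is usually easier to read than a chain of `|` comparisons.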

4 Comments

When I run the commands again together with your suggestion, I get: AttributeError: 'Series' object has no attribute 'dt'. I ran:
    In [118]: list_dates = [datetime.date(2014,2,2),datetime.date(2015,2,2), datetime.date(2013,4,5)]
    In [119]: df = DataFrame(list_dates, columns=['Date'])
    In [120]: df['Date'] = pd.to_datetime(df['Date'])
    In [121]: df[df['Date'].dt.year.isin([2013,2014])]
You must be using an older version of pandas. What version are you using? Can you upgrade?
In [123]: pd.__version__ Out[123]: '0.14.1'
The dt accessor was added in 0.15.0. Can you upgrade?
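
If upgrading is not an option, one possible workaround is to skip the dt accessor and extract the year with map, as the question already does, then filter with isin. This is only a sketch: as far as I know both Series.map and Series.isin predate 0.15.0, but I have not run it on 0.14.1 itself. The names `years` and `result` are illustrative.

```python
import datetime
import pandas as pd

list_dates = [datetime.date(2014, 2, 2),
              datetime.date(2015, 2, 2),
              datetime.date(2013, 4, 5)]
df = pd.DataFrame(list_dates, columns=['Date'])

# map applies the function to each element, so no dt accessor
# (and no to_datetime conversion) is needed.
years = df['Date'].map(lambda d: d.year)

# isin is the elementwise counterpart of `in`.
result = df[years.isin([2013, 2014])]
print(result)
```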
