I am trying to select some rows between two dates inside a Dataframe. The problem is when I try, I get:

Empty DataFrame

I import some historical financial data and then set the date column as the index (a DatetimeIndex).

When I select a single row by its date, it works. It is only when I select a date interval that it fails, even though I checked that each row in the interval can be selected individually.

I tried to fill possible empty cells with fillna(), without success.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from datetime import datetime

# Open the Euro Stoxx 50 csv file, rename columns and set dates as index

euro_stoxx_50 = pd.read_csv('STOXX50E.csv', parse_dates = True, index_col = 0)
euro_stoxx_50.columns = ['open', 'high', 'low', 'close', 'volume', 'adj close']
euro_stoxx_50.index.names = ['date']

My problem with examples:

print euro_stoxx_50.head() 
print euro_stoxx_50.index
print euro_stoxx_50.empty
print euro_stoxx_50['2012':'2015'].empty

Will give:

date           open     high      low    close  volume  adj close
2015-09-25  3113.16  3113.16  3113.16  3113.16       0    3113.16
2015-09-24  3019.34  3019.34  3019.34  3019.34       0    3019.34
2015-09-23  3079.99  3079.99  3079.99  3079.99       0    3079.99
2015-09-22  3076.05  3076.05  3076.05  3076.05       0    3076.05
2015-09-21  3184.72  3184.72  3184.72  3184.72       0    3184.72

<class 'pandas.tseries.index.DatetimeIndex'>
[2015-09-25, ..., 1986-12-31]
Length: 7396, Freq: None, Timezone: None

False

True

And

print euro_stoxx_50['2012-9-12']
print euro_stoxx_50['2012-9-13']
print euro_stoxx_50['2012-9-12':'2012-9-13']

will give:

date          open    high     low   close  volume  adj close
2012-09-12  2564.8  2564.8  2564.8  2564.8       0     2564.8

date           open     high      low    close  volume  adj close
2012-09-13  2543.22  2543.22  2543.22  2543.22       0    2543.22

Empty DataFrame
Columns: [open, high, low, close, volume, adj close]
Index: []

Thanks for any help!

  • I forgot to mention that the data goes from 1986 to 2015 (inclusive). Commented Sep 29, 2015 at 0:11
  • You didn't show the result of accessing an individual row by date, and without STOXX50E.csv or a small portion of it I can't reproduce the issue. Anyway, have you tried selecting with ix? Commented Sep 29, 2015 at 0:18
  • Yes, sorry about that. I will add it. Commented Sep 29, 2015 at 0:21
  • What is the value of euro_stoxx_50.index and euro_stoxx_50.index.names? Commented Sep 29, 2015 at 1:21
  • euro_stoxx_50.index.names gives "[u'date']" and euro_stoxx_50.index gives <class 'pandas.tseries.index.DatetimeIndex'> [2015-09-25, ..., 1986-12-31] Length: 7396, Freq: None, Timezone: None Commented Sep 29, 2015 at 1:46

2 Answers

If I understand correctly, you want to keep only the rows whose date falls between two points. If so, you can do it like this:

first = pd.to_datetime('2012-1-1')
last = pd.to_datetime('2015-1-1')

df[(df['date'] > first) & (df['date'] < last)]

edit: Since 'date' is the index you can use loc:

df.loc[first:last]
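
Because the dates live in the index rather than in a 'date' column, the comparison can also be made against df.index directly. The following is only a minimal sketch on a small made-up frame (the values and the bounds are illustrative, not taken from the real CSV). A boolean mask like this does not depend on whether the index is sorted in ascending or descending order:

```python
import pandas as pd

# Illustrative data only: a DatetimeIndex in descending order, as in the question.
dates = pd.to_datetime(['2015-09-25', '2015-09-24', '2015-09-23',
                        '2015-09-22', '2015-09-21'])
df = pd.DataFrame({'close': [3113.16, 3019.34, 3079.99, 3076.05, 3184.72]},
                  index=dates)
df.index.name = 'date'

first = pd.to_datetime('2015-09-22')
last = pd.to_datetime('2015-09-24')

# Compare against the index itself; the row order does not matter for a mask.
selected = df[(df.index >= first) & (df.index <= last)]
print(selected)
```

The mask keeps the rows in their original (here descending) order, and the inclusive comparisons can be swapped for strict ones if the end points should be excluded.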
5 Comments

When I tried this, I get "KeyError: 'date'". So I used "print df[(df > first) & (df < last)]" instead and it gave all of the dataframe, not the rows between the two specified dates.
Ah I missed that 'date' was the index. I'll edit it.
Even with this new solution I still get the same problem.
In order to perform operations on the dates you will need to convert them to datetime objects. @Tris Nefzger shows you how to do this in his answer.
Actually the dates in the index were already DatetimeIndex. Thanks for your help, my problem was just that I was not logical... (see my answer to @Tris Nefzger)
I find that ix indexing with datetime strings works when the DataFrame is indexed by the date column. For example, given the following data in test.txt:

date        open     high     low      close    volume    adj
2015-09-25  3113.16  3113.16  3113.16  3113.16       0    3113.16
2015-09-24  3019.34  3019.34  3019.34  3019.34       0    3019.34
2015-09-23  3079.99  3079.99  3079.99  3079.99       0    3079.99
2015-09-22  3076.05  3076.05  3076.05  3076.05       0    3076.05
2015-09-21  3184.72  3184.72  3184.72  3184.72       0    3184.72

import pandas as pd

# Columns in test.txt are separated by runs of whitespace
df = pd.read_csv('test.txt', sep=r"\s+")
df['date'] = pd.to_datetime(df['date'])
# Use the parsed dates as the index
df.set_index('date', inplace=True)
df.ix['2015-09-25':'2015-09-22']
Out[15]: 
               open     high      low    close  volume      adj
date                                                           
2015-09-25  3113.16  3113.16  3113.16  3113.16       0  3113.16
2015-09-24  3019.34  3019.34  3019.34  3019.34       0  3019.34
2015-09-23  3079.99  3079.99  3079.99  3079.99       0  3079.99

5 Comments

Thanks for your help, this worked for me. I just had to filter using "['2015-09-25':'2015-09-22']" rather than "['2015-09-22':'2015-09-25']"... So simple…
@Baptiste: Pandas indexing is a nightmare. I just remember ix is for rows because Wes McKinney emphasizes it. ".ix is the most general indexer and will support any of the inputs in .loc and .iloc. .ix also supports floating point label schemes. .ix is exceptionally useful when dealing with mixed positional and label based hierachical indexes." - from pandas.pydata.org/pandas-docs/stable/generated/….
I get the same result using "euro_stoxx_50.ix['2008-12-31':'2008-1-1']['close']" or "euro_stoxx_50['2008-12-31':'2008-1-1']['close']". I just find it 'unnatural' to write "['2008-12-31':'2008-1-1']" instead of "['2008-1-1':'2008-12-31']".
@Baptiste: I agree and found it odd that the data was in descending order by date and would probably sort the DataFrame to put it in ascending order.
Yes, I just did that, it works fine. I just believed pandas would sort automatically when selecting between two dates.
