How to create pandas dataframe with date as index

Question

This is my code,

import plotly.plotly as py
import datetime
import pandas
import matplotlib.pyplot as plt
import pandas.io.data as pd


start = datetime.datetime(2016, 2, 1)
end   = datetime.datetime(2016, 2, 11)
#raw = pd.DataReader("tjx", "yahoo", start, end)
rawy = pd.DataReader("tjx", "yahoo", start, end)['Low']

print rawy
print "========================"

columns = ['Low']
newDf = pd.DataFrame(columns=columns)
newDf = newDf.fillna(0)

#newDf[0] = rawy[0]
#newDf[0:1] = rawy[0:1]
#newDf.loc[0] = rawy.loc[0]
newDf.loc[0] = rawy[0]
print newDf

The result is like this,

Date
2016-02-01    70.470001
2016-02-02    72.309998
2016-02-03    71.000000
2016-02-04    69.720001
2016-02-05    67.900002
2016-02-08    66.820000
2016-02-09    67.519997
2016-02-10    69.279999
2016-02-11    67.410004
Name: Low, dtype: float64
========================
         Low
0  70.470001

If you look at the last line of the result, it uses 0 as the index, not the date from the original data. How can I correct this?
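For anyone who wants to reproduce this without downloading anything, here is a minimal sketch. It assumes a current pandas imported as pd (not pandas.io.data), and it hardcodes the first three values printed above in place of the DataReader call. It uses rawy.iloc[0] rather than rawy[0] because positional [] lookups on a labeled Series are deprecated in recent pandas.

```python
import pandas as pd

# First three "Low" values from the output above, hardcoded so no download is needed.
rawy = pd.Series(
    [70.470001, 72.309998, 71.000000],
    index=pd.to_datetime(["2016-02-01", "2016-02-02", "2016-02-03"]),
    name="Low",
)
rawy.index.name = "Date"

newDf = pd.DataFrame(columns=["Low"])
newDf.loc[0] = rawy.iloc[0]  # the label passed to .loc (here 0) becomes the row's index

print(newDf.index.tolist())
```

The row label comes entirely from what is written inside .loc[...], so the date from rawy is never involved.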

Robert Rodkey · Accepted Answer · 2016-02-13 23:55:31Z

2

If you want the index to carry over, you have to assign it. Here are two methods that seem to work:

>>> newDf = pd.DataFrame(data=[rawy[0]], index=[rawy.index[0]], columns=columns)
>>> newDf
                  Low
2016-02-01  70.470001

or

>>> newDf = pd.DataFrame(rawy.head(1))
>>> newDf
                  Low
Date
2016-02-01  70.470001
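Both methods can be tried offline with a sketch like the following. It assumes a current pandas imported as pd and a hardcoded rawy built from the values in the question. to_frame() is swapped in for the pd.DataFrame(...) wrapper of the second method; it should give the same result.

```python
import pandas as pd

# Stand-in for the downloaded Series, using values from the question.
rawy = pd.Series(
    [70.470001, 72.309998, 71.000000],
    index=pd.to_datetime(["2016-02-01", "2016-02-02", "2016-02-03"]),
    name="Low",
)
rawy.index.name = "Date"

# Method 1: pass the value and its date label explicitly.
df1 = pd.DataFrame(data=[rawy.iloc[0]], index=[rawy.index[0]], columns=["Low"])

# Method 2: take the first row of the Series and convert it to a DataFrame;
# the date index and its name come along automatically.
df2 = rawy.head(1).to_frame()

print(df1)
print(df2)
```

The second form also carries the index name (Date) over, which is why the two printed frames look slightly different.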

answered Feb 13, 2016 at 23:55


1 Comment

Alexander Over a year ago

This creates the new dataframe. Now how do you extend it for a new value?

Alexander · Accepted Answer · 2016-02-14 00:55:26Z

1

The index is 0 because 0 is the label you assigned the row to. Use the date as the label instead:

>>> newDf = pd.DataFrame(columns=columns)
>>> newDf
Empty DataFrame
Columns: [Low]
Index: []

>>> newDf.loc[rawy.index[0]] = rawy[0]  # .ix also worked at the time, but it was removed in pandas 1.0
>>> newDf.loc[rawy.index[1]] = rawy[1]

>>> newDf
                  Low
2016-02-01  70.470001
2016-02-02  72.309998
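The same idea extends to any number of rows. Below is a rough sketch, assuming a current pandas imported as pd and a hardcoded rawy in place of the download. The loop and the to_frame() alternative are illustrations, not part of the original answer.

```python
import pandas as pd

# Stand-in for the downloaded Series, using values from the question.
rawy = pd.Series(
    [70.470001, 72.309998, 71.000000],
    index=pd.to_datetime(["2016-02-01", "2016-02-02", "2016-02-03"]),
    name="Low",
)

newDf = pd.DataFrame(columns=["Low"])
for date, low in rawy.iloc[:2].items():
    newDf.loc[date] = low  # a label not yet in the index is added as a new row

# Make sure the column is float regardless of how the rows were added.
newDf["Low"] = newDf["Low"].astype(float)

# If all rows are known up front, slicing and converting avoids the loop.
sameDf = rawy.iloc[:2].to_frame()

print(newDf)
```

Adding rows one at a time gets slow for large frames, so the slice-and-convert form is usually preferable when the data already sits in a Series.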

edited Feb 14, 2016 at 0:55

answered Feb 13, 2016 at 23:45


6 Comments

user3552178 Over a year ago

Thanks so much for the quick answer!

Robert Rodkey Over a year ago

Huh, I'm actually getting a "*** KeyError: Timestamp('2016-02-01 00:00:00', tz=None)" if I tack this at the bottom of the code in the question. Didn't think you could use .ix if you haven't assigned an index to newDf yet?

Alexander Over a year ago

In fairness, I used import pandas.io.data as web and rawy = web.DataReader("tjx", "yahoo", start, end)['Low']. The date index was assigned automatically. I'm not sure if the functionality is different with pd.DataReader. Also, I am using Pandas 0.17.1. Which version are you using?

Robert Rodkey Over a year ago

I'm on Pandas 0.12.0, so yep, could be a versioning thing. rawy in my example had the index automatically assigned, but newDf, via "pd.DataFrame(columns=columns)", definitely did not. In your example, aren't you saying "assign this value to the item with index of rawy.index[0] (a Timestamp) in newDf"? If no index is explicitly created for newDf, then how does that work? Noting here that I'm probably intermediate level on pandas; mostly asking to see if I missed something.

Alexander Over a year ago

I'm saying that the row of newDf labeled [timestamp] is assigned the value of rawy[0]. If that label does not yet exist in the index, it is created.
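The difference between reading and assigning a missing label can be seen in a small sketch, assuming a current pandas imported as pd. As far as the pandas release notes indicate, assignment that enlarges the index arrived in 0.13, which would be consistent with the KeyError reported on 0.12.0 above.

```python
import pandas as pd

newDf = pd.DataFrame(columns=["Low"])
date = pd.Timestamp("2016-02-01")

# Reading a label that is not in the index raises KeyError...
try:
    newDf.loc[date]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# ...but assigning to the same label adds it to the index as a new row.
newDf.loc[date] = 70.470001

print(lookup_failed, date in newDf.index)
```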
