This is my code:

import plotly.plotly as py
import datetime
import pandas
import matplotlib.pyplot as plt
import pandas.io.data as pd


start = datetime.datetime(2016, 2, 1)
end   = datetime.datetime(2016, 2, 11)
#raw = pd.DataReader("tjx", "yahoo", start, end)
rawy = pd.DataReader("tjx", "yahoo", start, end)['Low']

print rawy
print "========================"

columns = ['Low']
newDf = pd.DataFrame(columns=columns)
newDf = newDf.fillna(0)

#newDf[0] = rawy[0]
#newDf[0:1] = rawy[0:1]
#newDf.loc[0] = rawy.loc[0]
newDf.loc[0] = rawy[0]
print newDf

The result is like this,

Date
2016-02-01    70.470001
2016-02-02    72.309998
2016-02-03    71.000000
2016-02-04    69.720001
2016-02-05    67.900002
2016-02-08    66.820000
2016-02-09    67.519997
2016-02-10    69.279999
2016-02-11    67.410004
Name: Low, dtype: float64
========================
         Low
0  70.470001

If you look at the last line of the result, it uses 0 as the index, not the date from the original data. How can I make the new DataFrame keep the date index?

2 Answers

If you want the index to come over, you have to assign it. Here are two methods that seem to work:

>>> newDf = pd.DataFrame(data=[rawy[0]], index=[rawy.index[0]], columns=columns)
>>> newDf
                  Low
2016-02-01  70.470001

or

>>> newDf = pd.DataFrame(rawy.head(1))
>>> newDf
                  Low
Date
2016-02-01  70.470001
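Both methods can be tried without the Yahoo download. This is a minimal sketch under a few assumptions: the Series is built by hand from the first three values printed in the question, `by_hand` and `from_head` are names made up for the illustration, and `rawy.iloc[0]` and `to_frame()` are used in place of `rawy[0]` and the `pd.DataFrame(...)` wrapper, which should give the same result.

```python
import pandas as pd

# Stand-in for the Yahoo download: the first three 'Low' values from the question.
dates = pd.to_datetime(["2016-02-01", "2016-02-02", "2016-02-03"])
rawy = pd.Series([70.470001, 72.309998, 71.000000], index=dates, name="Low")
rawy.index.name = "Date"

# Method 1: pass the first value and its index label explicitly.
by_hand = pd.DataFrame(data=[rawy.iloc[0]], index=[rawy.index[0]], columns=["Low"])

# Method 2: keep the first row of the Series and convert it to a one-column frame.
from_head = rawy.head(1).to_frame()

print(by_hand)
print(from_head)
```

The second form also carries the index name (`Date`) over, because it never leaves the original Series.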

1 Comment

This creates the new dataframe. Now how do you extend it for a new value?

It is using zero as the index because that is the value you assigned to it. Try this instead.

>>> newDf = pd.DataFrame(columns=columns)
>>> newDf
Empty DataFrame
Columns: [Low]
Index: []

>>> newDf.ix[rawy.index[0]] = rawy[0]  # Or newDf.loc[rawy.index[0]] = rawy[0]
>>> newDf.ix[rawy.index[1]] = rawy[1]

>>> newDf
                  Low
2016-02-01  70.470001
2016-02-02  72.309998
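On current pandas the same idea has to be written with `.loc`, since `.ix` was removed in pandas 1.0. The following is only a sketch under assumptions: the Series is built by hand from the first two values in the question instead of being downloaded, and the loop over `rawy.items()` is one possible way to add rows, not something taken from the answer above.

```python
import pandas as pd

# Stand-in for the downloaded data: the first two 'Low' values from the question.
dates = pd.to_datetime(["2016-02-01", "2016-02-02"])
rawy = pd.Series([70.470001, 72.309998], index=dates, name="Low")

# Start from an empty frame that only knows its column name.
newDf = pd.DataFrame(columns=["Low"])

# Assigning to a label that is not yet in the index adds a row with that label.
for label, value in rawy.items():
    newDf.loc[label] = value

print(newDf)
```

Adding rows one at a time like this is convenient for a handful of values. For a whole Series, `rawy.to_frame()` is likely the simpler route.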

6 Comments

Thanks so much for the quick answer !
Huh, I'm actually getting a "*** KeyError: Timestamp('2016-02-01 00:00:00', tz=None)" if I tack this at the bottom of the code in the question. Didn't think you could use .ix if you haven't assigned an index to newDf yet?
In fairness, I used import pandas.io.data as web and rawy = web.DataReader("tjx", "yahoo", start, end)['Low']. The date index was assigned automatically. I'm not sure if the functionality is different with pd.DataReader. Also, I am using Pandas 0.17.1. Which version are you using?
I'm on Pandas 0.12.0, so yep, could be a versioning thing. rawy in my example had the index automatically assigned, but newDf, created via "pd.DataFrame(columns=columns)", definitely did not. In your example, aren't you saying "assign this value to the item with index rawy.index[0] (a Timestamp) in newDf"? If no index is explicitly created for newDf, then how does that work? Noting here that I'm probably intermediate level on pandas, so I'm mostly asking to see if I missed something.
I'm saying that newDf with an index value of [timestamp] is assigned the value of rawy[0]. If this value does not yet exist in the index, then it will be created.