4

I am having a hard time getting around this, as I couldn't find anyone who has had the same issue before on Google. I am a total noob, so bear with me! :)

import pandas as pd
#import quandl



#df=quandl.get('WIKI/GOOGL')

#df.to_csv('google.csv')
#df=pd.read_csv('google.csv')
df = pd.read_csv(r'C:\Users\c900452\Downloads\20160623 Python\google.csv')



df=df[['Adj. Open','Adj. High','Adj. Low','Adj. Close','Adj. Volume']]

# crude volatility
df['HL_PCT'] = (df['Adj. High'] -df['Adj. Low'])/df['Adj. Close']*100.0

#close and open volatility

df['PCT_change'] = (df['Adj. Close'] - df['Adj. Open']) / df['Adj. Open'] * 100.0

#creating a new dataframe
df = df[['Adj. Close', 'HL_PCT', 'PCT_change', 'Adj. Volume']]

import math
import numpy as np
import pandas as pd
from sklearn import preprocessing, cross_validation, svm
from sklearn.linear_model import LinearRegression



forecast_col = 'Adj. Close'
df.fillna(value = -99999, inplace=True)
forecast_out = int(math.ceil(0.01 * len(df)))
print(forecast_out)

df['label'] = df[forecast_col].shift(-forecast_out)


X = np.array(df.drop(['label'],1))
X = preprocessing.scale(X)
X_lately = X[-forecast_out:]
X = X[:-forecast_out]




df.dropna(inplace=True)

y = np.array(df['label'])


X_train, X_test, y_train, y_test = cross_validation.train_test_split(X,y,test_size=0.2)

clf=LinearRegression(n_jobs=-1)
clf.fit(X_train,y_train)
accuracy = clf.score(X_test,y_test)

forecast_set = clf.predict(X_lately)

print(forecast_set, accuracy, forecast_out)



import datetime
import matplotlib.pyplot as plt
from matplotlib import style

style.use('ggplot')

df['Forecast'] = np.nan

last_date = df.iloc[-1].name
last_unix  = last_date.timestamp()
one_day = 86400
next_unix = last_unix+one_day

for i in forecast_set:
    next_date = datetime.datetime.fromtimestamp(next_unix)
    next_unix += one_day
    df.loc[next_date] = [np.nan for _ in range(len(df.columns)-1)]+[i]

print(df.head())

df['Adj. Close'].plot()
df['Forecast'].plot()
plt.legend(loc=4)
plt.xlabel('Date')
plt.ylabel('Price')
plt.show()

And I am getting the error stated in the title. Why?

5
  • I presume the error comes from last_unix = last_date.timestamp()? Can you give more information about the error, such as the line number? Commented Jul 23, 2016 at 19:46
  • Hi, you are absolutely right: Anaconda/Regression2.py", line 87, in <module> last_unix = last_date.timestamp() Commented Jul 23, 2016 at 19:56
  • What is the value of last_date? It looks like type(last_date) is numpy.int64, which is a class that does not have a timestamp method. Commented Jul 23, 2016 at 20:22
  • Your code last_unix = last_date.timestamp() expects last_date to be a datetime object, whereas it is a numpy.int64 object [with value 2955]. Does that help? Commented Jul 25, 2016 at 9:33
  • When you save your Quandl DataFrame as CSV, the dates lose their datetime type: after reading the file back they are no longer datetime objects in the df, but str. You should save the DataFrame as a pickle, not as CSV: df.to_pickle(file_name) Commented Jan 8, 2018 at 11:29
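
To illustrate the last comment, here is a minimal sketch comparing a CSV round trip with a pickle round trip. The two-row DataFrame, its values, and the file names demo.csv and demo.pkl are made up for the demonstration; they are not the asker's data.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the Quandl data: two rows with a datetime index
df_demo = pd.DataFrame(
    {'Adj. Close': [1.0, 2.0]},
    index=pd.to_datetime(['2016-07-21', '2016-07-22']),
)
df_demo.index.name = 'Date'

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, 'demo.csv')   # hypothetical file name
    pkl_path = os.path.join(tmp, 'demo.pkl')   # hypothetical file name
    df_demo.to_csv(csv_path)
    df_demo.to_pickle(pkl_path)

    # A plain read_csv gives a default integer index; 'Date' is a text column
    from_csv = pd.read_csv(csv_path)
    # The pickle restores the DataFrame with its DatetimeIndex intact
    from_pickle = pd.read_pickle(pkl_path)

print(type(from_csv.index))
print(type(from_pickle.index))
```

With the pickle, df.iloc[-1].name is a pandas Timestamp again, so calling .timestamp() on it works.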

5 Answers

3

Try parsing the Date column when you read the file:

df = pd.read_csv(r'C:\Users\c900452\Downloads\20160623 Python\google.csv',
                  header=0, 
                  index_col='Date',
                  parse_dates=True)

It worked for me. For more details, see the pandas.read_csv documentation.
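
A small sketch of why this helps, using an in-memory CSV instead of the real file. The two sample rows and prices are invented for illustration; only the 'Date' and 'Adj. Close' column names are taken from the question.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for google.csv
csv_text = "Date,Adj. Close\n2016-07-21,754.41\n2016-07-22,759.28\n"

# Default read: 'Date' is an ordinary column and the index is 0, 1, ...
plain = pd.read_csv(io.StringIO(csv_text))
print(type(plain.iloc[-1].name))   # an integer label, which has no .timestamp()

# Parsed read: 'Date' becomes the index and is converted to datetimes
parsed = pd.read_csv(io.StringIO(csv_text), index_col='Date', parse_dates=True)
last_date = parsed.iloc[-1].name
print(type(last_date))             # a pandas Timestamp
print(last_date.timestamp())       # the call from the question now works
```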


1

Since you're not getting the data directly from Quandl but from a CSV file in your local directory, you have to set index_col='Date' and parse_dates=True when reading the CSV file.

It should be as follows:

data = quandl.get('WIKI/GOOGL')
data.to_csv('googl.csv')
df = pd.read_csv('googl.csv', index_col='Date', parse_dates=True)

This will solve your problem.


0

Use last_unix = time.mktime(last_date.timetuple()) instead of last_date.timestamp() (this needs import time at the top of the script).
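
A minimal sketch of that call, assuming last_date is already a datetime-like object (the date below is a made-up example). Note that an integer index label has no timetuple() method either, so this alone does not help if the index was never parsed as dates.

```python
import datetime
import time

# Hypothetical last date; in the question this would come from df.iloc[-1].name
last_date = datetime.datetime(2016, 7, 22)

# mktime interprets the time tuple as local time and returns seconds since the epoch
last_unix = time.mktime(last_date.timetuple())

# fromtimestamp also works in local time, so the round trip gives the same date back
round_trip = datetime.datetime.fromtimestamp(last_unix)
print(last_unix, round_trip)
```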


0

It seems the rows are not indexed by date. So when you try to get last_date, you actually get an int instead of a date.

As far as I understand, you can make the dates the index by adding the following line after the code that reads the CSV: df.set_index('Date', inplace=True)

After making that change, you might still need to change the last_unix = last_date.timestamp() line.

Or you can read the data with quandl instead of from the CSV, like this: df = quandl.get_table('WIKI/PRICES', ticker='GOOGL')

I hope this helps, but I am not 100% sure, as I did not test the code.
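
An untested-against-the-real-file sketch of the set_index idea, assuming the CSV column is named 'Date' as in the other answers. The sample rows are invented. The extra pd.to_datetime step is my addition: without it the index would hold strings, and .timestamp() would still fail.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for google.csv
csv_text = "Date,Adj. Close\n2016-07-21,754.41\n2016-07-22,759.28\n"
df_demo = pd.read_csv(io.StringIO(csv_text))

# Convert the text column to datetimes, then use it as the index
df_demo['Date'] = pd.to_datetime(df_demo['Date'])
df_demo.set_index('Date', inplace=True)

last_date = df_demo.iloc[-1].name
print(type(last_date))        # a pandas Timestamp
print(last_date.timestamp())  # works now that the index holds datetimes
```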


0

Try commenting out this line:

last_unix  = last_date.timestamp()

Instead, use the last_date variable directly, without calling timestamp() on it:

next_unix = last_date + one_day

It seems like a hack, but I just wanted to see the graph, and it worked.
