
I've gone around in circles on this one. A bit frustrating as the solution is probably close at hand.

Anyway, I found a URL that returns some data in CSV format. However, the URL itself does not contain the CSV file name. In a web browser, I can go to the link and then I'm asked whether I want to open or save the file, so I know I'm ultimately getting a named CSV file. I'm just not sure how to do the same thing in Python, as there seems to be an intermediate data type being passed (bytes).

I've tried the following to no avail:

import urllib.request  # a bare "import urllib" does not reliably expose urllib.request
import io
import pandas as pd
link = r'http://www.cboe.com/products/vix-index-volatility/vix-options-and-futures/vix-index/vix-historical-data/'
f = urllib.request.urlopen(link)
myfile = f.read()
buf = io.BytesIO(myfile)  # originally tried io.StringIO(myfile) but then realized myfile is in bytes
df = pd.read_csv(buf)

Any suggestions?

The df should contain data that looks similar to:

1/5/2004,18.45,18.49,17.44,17.49
1/6/2004,17.66,17.67,16.19,16.73
1/7/2004,16.72,16.75,15.5,15.5
1/8/2004,15.42,15.68,15.32,15.61
1/9/2004,16.15,16.88,15.57,16.75
1/12/2004,17.32,17.46,16.79,16.82

Here is the last line of the error message:

ParserError: Error tokenizing data. C error: Expected 2 fields in line 24, saw 4

  • Isn't the URL for the CSV file http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vixcurrent.csv? I opened the link from your code in a browser and in the terminal, and it is a standard web page. Commented Apr 7, 2020 at 0:40
  • @EricTruett You're correct. The actual link can be found when inspecting the web page. Commented Apr 7, 2020 at 17:21

2 Answers

@Fred - I think that you are simply using the wrong URL. When I replace the link with http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vixcurrent.csv, your script works.

I found this URL on the page your script originally pointed to.
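
One way to catch this kind of mix-up early is to check what the server actually returned before handing it to pandas. The following is only a rough sketch under assumptions: the helper names `looks_like_html` and `fetch_csv_bytes` are hypothetical (they are not part of pandas or urllib), and the check is a simple heuristic based on the Content-Type header and the first bytes of the body, not a robust detector.

```python
import urllib.request


def looks_like_html(payload: bytes, content_type: str = "") -> bool:
    """Heuristic: True if the response appears to be a web page, not CSV."""
    if "html" in content_type.lower():
        return True
    # Fall back to inspecting the start of the body.
    head = payload[:512].lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


def fetch_csv_bytes(url: str) -> bytes:
    """Download url and raise ValueError if the body looks like HTML."""
    with urllib.request.urlopen(url) as response:
        content_type = response.headers.get("Content-Type", "")
        payload = response.read()
    if looks_like_html(payload, content_type):
        raise ValueError(f"{url} returned a web page, not a CSV file")
    return payload
```

The returned bytes can then be wrapped in `io.BytesIO` and passed to `pd.read_csv`, as in the question's script.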



This is not really an answer, just a note that the CBOE link has not been valid from 2020-DEC-07 through today (2020-DEC-23), and I am not sure whether the URL will come back. datahub.io offers data in a similar format, but it is not up to date; the free CHRIS data via Quandl is not up to date either. I have yet to find an official notice from CBOE stating that this URL will no longer be supported. I posted a similar question/finding on QuantConnect:

https://www.quantconnect.com/forum/discussion/7673/problem-pulling-cboe-vix-data-on-live-trading/p1

import pandas as pd
url='http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vixcurrent.csv'
df = pd.read_csv(url)
print(df.shape)
/usr/lib/python3.6/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    648 class HTTPDefaultErrorHandler(BaseHandler):
    649     def http_error_default(self, req, fp, code, msg, hdrs):
--> 650         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    651 
    652 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 404: NOT FOUND

The above URL from CBOE no longer seems to work.
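
Since a source can disappear like this, one option is to try several candidate URLs in order and use the first one that loads. This is a minimal sketch, assuming the loader raises a `urllib.error.URLError` (or its subclass `HTTPError`, as in the traceback above) on failure; `first_available` is a hypothetical helper name, and which URLs to list is left to the caller.

```python
import urllib.error


def first_available(urls, loader):
    """Return (url, result) for the first URL that loader can read.

    loader is any callable taking a URL, for example pandas.read_csv.
    """
    errors = []
    for url in urls:
        try:
            return url, loader(url)
        except urllib.error.URLError as exc:  # HTTPError is a subclass
            errors.append(f"{url}: {exc}")
    raise RuntimeError("no source could be loaded:\n" + "\n".join(errors))
```

With pandas this might be called as `url, df = first_available([cboe_url, backup_url], pd.read_csv)`, where both URL variables are placeholders.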

Outdated data can be obtained from datahub.io and Quandl:

url = 'https://datahub.io/zelima1/finance-vix/r/vix-daily.csv'
df = pd.read_csv(url)
print(df.shape)
print(df.Date)
(3488, 5)
0       2004-01-02
1       2004-01-05
2       2004-01-06
3       2004-01-07
4       2004-01-08
           ...    
3483    2017-11-01
3484    2017-11-02
3485    2017-11-03
3486    2017-11-06
3487    2017-11-07
Name: Date, Length: 3488, dtype: object
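
To flag a stale source like this one automatically, the newest date in the file can be compared against today's date. This is just a sketch, assuming the dates are ISO-formatted strings (YYYY-MM-DD) as in the output above; `is_stale` is a hypothetical helper and the 7-day threshold is an arbitrary choice.

```python
from datetime import date, timedelta


def is_stale(date_strings, today, max_age_days=7):
    """True if the newest ISO date (YYYY-MM-DD) is older than max_age_days."""
    latest = max(date.fromisoformat(s) for s in date_strings)
    return today - latest > timedelta(days=max_age_days)
```

For the DataFrame above, this could be called as `is_stale(df.Date, date.today())`.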

Quandl CHRIS VIX:

https://www.quandl.com/data/CHRIS/CBOE_VX1-S-P-500-Volatility-Index-VIX-Futures-Continuous-Contract-1-VX1-Front-Month
