I ran the following script (https://github.com/FXCMAPI/FXCMTickData/blob/master/TickData34.py) and added the following lines at the end to download the files:

    output_folder = '/Users/me/Documents/data/forex/'
    target_folder = os.path.join(output_folder, symbol, year)
    os.makedirs(target_folder, exist_ok=True)
    with open(os.path.join(target_folder, str(i) + '.csv'), 'wb') as outfile:
        outfile.write(data)

Then, I tried opening the file using pandas as follows:

    x = pd.read_csv('/Users/me/Documents/data/forex/EURUSD/2015/29.csv')

However, this is what I got:

    In [3]: x.info()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 2415632 entries, 0 to 2415631
    Data columns (total 3 columns):
    D             float64
    Unnamed: 1    float64
    Unnamed: 2    float64
    dtypes: float64(3)
    memory usage: 55.3 MB

    In [4]: x.dropna()
    Out[4]: 
    Empty DataFrame
    Columns: [D, Unnamed: 1, Unnamed: 2]
    Index: []

Why is the dataframe empty?

If I open the file on TextEdit, the first few lines actually look like this:

    DateTime,Bid,Ask
    07/19/2015 21:00:15.469,1.083,1.08332
    07/19/2015 21:00:16.949,1.08311,1.08332
    07/19/2015 21:00:16.955,1.08311,1.08338
  • The dataframe is not empty until you drop the nulls. You need to use parse_dates = 'DateTime' Commented Oct 13, 2017 at 3:20

2 Answers

Apparently, every character in your data is followed by the null character \x00. Strip these null bytes before writing the file, and things will work:

    outfile.write(data.replace(b'\x00',b''))
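
A minimal sketch of why this works, under the assumption that the payload is UTF-16 little-endian text (that would explain a \x00 after every ASCII character, but the question does not confirm the encoding). The sample bytes below are made up for illustration and stand in for the downloaded `data`:

```python
import csv
import io

# Hypothetical sample standing in for the downloaded bytes: ASCII text
# encoded as UTF-16 little-endian, so every character is followed by \x00.
sample_text = (
    "DateTime,Bid,Ask\n"
    "07/19/2015 21:00:15.469,1.083,1.08332\n"
    "07/19/2015 21:00:16.949,1.08311,1.08332\n"
)
data = sample_text.encode("utf-16-le")

# Option 1 (the fix above): strip the NUL bytes.
# This is only safe when the content is pure ASCII.
stripped = data.replace(b"\x00", b"")

# Option 2: decode as UTF-16 and re-encode as UTF-8 before writing.
# Assumes the bytes really are UTF-16 little-endian.
reencoded = data.decode("utf-16-le").encode("utf-8")

# Both options yield the same bytes for this ASCII-only sample,
# and the result parses as ordinary CSV.
rows = list(csv.reader(io.StringIO(stripped.decode("ascii"))))
print(rows[0])
print(rows[1])
```

If the UTF-16 assumption holds, decoding is the more general fix, since stripping null bytes would corrupt any non-ASCII character. `pd.read_csv` also accepts an `encoding` argument, so reading the original file with `encoding='utf-16'` may work without rewriting it.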

Thank you for providing a very concrete and reproducible problem.

I pasted your code and ran it on Windows, and pandas indeed read in nothing but 55 MB of null values.

But I think the problem is that pandas is not parsing the CSV file correctly, not that it cannot open the file.

However, none of the encodings listed in this answer worked when I tried them, so something may be wrong with the file as well.

I eventually made it work by opening the file in Excel and saving it as a different file, which pandas can then parse correctly.
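
Before guessing at encodings, it can help to look at the raw bytes of the file. The sketch below is one rough way to do that; `describe_bytes` is a hypothetical helper and its thresholds are a heuristic, not a real encoding detector. The sample bytes are made up; in practice the input would come from something like `open(path, 'rb').read(32)`.

```python
import codecs


def describe_bytes(head: bytes) -> str:
    """Rough heuristic (hypothetical helper): does this look like UTF-16?"""
    # A byte order mark at the start is the clearest sign of UTF-16.
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16 byte order mark"
    # ASCII text stored as UTF-16 has a NUL in roughly every second byte.
    if head and head.count(b"\x00") * 2 >= len(head) - 1:
        return "NUL-padded, likely UTF-16 without a BOM"
    return "no NUL padding detected"


# Made-up sample: the CSV header encoded as UTF-16 little-endian.
head = "DateTime,Bid,Ask\n".encode("utf-16-le")[:32]
print(repr(head))
print(describe_bytes(head))
```

If the file turns out to be NUL-padded, that would also explain why re-saving it from Excel helps, assuming Excel writes the copy in a different encoding.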
