Pandas read_excel

Question

I struggled for a few hours with reading an Excel file using pd.read_excel when the path is a website address. I figured out that the link doesn't point directly to the file but only triggers a download. Is there an easy way to solve this?

Part of code:

link_energy = 'http://unstats.un.org/unsd/environment/excel_file_tables/2013/Energy%20Indicators.xls'
df_energy = pd.read_excel(link_energy)

Error message:

XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\n\n\n<!DOC'

It's probably not a problem with pandas but my lack of skill in how to do it.

OK, I added the full URL, but it comes from the Coursera platform. Doesn't that explain anything?
– shep4rd, Commented Feb 14, 2018 at 18:40
I saw it and checked it, @jp_data_analysis; it's not the case here.
– shep4rd, Commented Feb 14, 2018 at 18:41

karol · Accepted Answer · 2018-02-14 20:16:38Z

1

For me, everything works as expected with the following code:

import pandas as pd
link_energy = 'http://unstats.un.org/unsd/environment/excel_file_tables/2013/Energy%20Indicators.xls'
df_energy = pd.read_excel(link_energy)
df_energy

without errors on the following env:

The version of the notebook server is: 5.2.2

The server is running on this version of Python:

Python 3.6.3 | packaged by conda-forge | (default, Nov 4 2017, 10:10:56) [GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]

Current Kernel Information:

Python 3.6.3 | packaged by conda-forge | (default, Nov 4 2017, 10:10:56)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.
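If the same code fails in another environment, the `<!DOC` in the question's error message suggests the server returned an HTML page rather than the workbook. One way to check is to download the bytes yourself and inspect them before handing them to pandas. This is only a minimal sketch, assuming the standard-library `urllib` is enough to fetch the file; `looks_like_html` and `read_excel_from_url` are hypothetical helper names, not part of pandas:

```python
import io
import urllib.request


def looks_like_html(payload: bytes) -> bool:
    """Return True if the payload starts like an HTML page, not a workbook."""
    head = payload[:512].lstrip().lower()
    return head.startswith((b"<!doc", b"<html"))


def read_excel_from_url(url: str, **kwargs):
    """Download `url` and parse the bytes with pd.read_excel.

    Hypothetical helper: raises ValueError when the server answered
    with an HTML page instead of a spreadsheet.
    """
    import pandas as pd  # imported here so looks_like_html works without pandas

    with urllib.request.urlopen(url) as response:
        payload = response.read()

    if looks_like_html(payload):
        raise ValueError("The URL returned an HTML page, not an Excel file")

    # pd.read_excel accepts file-like objects, so wrap the bytes in BytesIO.
    return pd.read_excel(io.BytesIO(payload), **kwargs)


# Usage (needs network access, pandas and xlrd):
# df_energy = read_excel_from_url(link_energy)
```

The explicit check turns the cryptic "Expected BOF record" message into a clearer error when the link leads to a web page.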

edited Feb 14, 2018 at 20:16

answered Feb 14, 2018 at 20:00


1 Comment

Marco Spinaci Over a year ago

I can confirm: after updating pandas to the latest version (I don't know whether that was needed) and installing xlrd (no need to import it), pd.read_excel works as expected.
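Since pandas relies on xlrd to read legacy .xls files, it can be worth checking which versions are installed before debugging further. A small sketch using only the standard library (Python 3.8 or later); `installed_version` is a hypothetical helper name:

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages mentioned in this thread.
for name in ("pandas", "xlrd"):
    print(name, installed_version(name) or "not installed")
```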

vanishka · Accepted Answer · 2018-02-14 18:56:38Z

0

I can't access the URL you posted.

However, if the file is really delimited text despite its .xls extension, pd.read_excel won't work and you need to use pd.read_csv instead:

import pandas as pd

df = pd.read_csv('https://cib.societegenerale.com/fileadmin/indices_feeds/CTA_Historical.xls')

Now you need to look at what the file contains: which separator is used, and whether any columns hold extra values that must be skipped in order to load and read the useful data.
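The inspection described above can be sketched roughly as follows. It assumes you already have the first bytes of the downloaded file; `sniff_format` and `guess_delimiter` are hypothetical helper names. The magic numbers are the standard signatures of OLE2 (.xls) and ZIP (.xlsx) containers, and the delimiter guess comes from the standard-library `csv.Sniffer`, which is a heuristic and can be wrong on unusual files:

```python
import csv

OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls container
ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx is a ZIP archive


def sniff_format(payload: bytes) -> str:
    """Classify the payload as 'xls', 'xlsx' or 'other' by its first bytes."""
    if payload.startswith(OLE2_MAGIC):
        return "xls"
    if payload.startswith(ZIP_MAGIC):
        return "xlsx"
    return "other"


def guess_delimiter(sample: str) -> str:
    """Guess the column separator of a text sample (heuristic)."""
    dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")
    return dialect.delimiter


# Example with an in-memory text sample:
print(sniff_format(b"a;b;c\n1;2;3\n"))
print(guess_delimiter("a;b;c\n1;2;3\n4;5;6\n"))
```

If `sniff_format` reports 'xls' or 'xlsx', pd.read_excel is the right reader; for 'other', the guessed delimiter can be passed to pd.read_csv through its `sep` argument.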

answered Feb 14, 2018 at 18:56


1 Comment

shep4rd Over a year ago

I added a URL that you can access. As I understand it, your answer would solve the problem only if my file were really a .csv file.
