Python iterating over excel files in a folder

Question

I want this script to open each Excel file in a folder and save it again as a .csv or .txt file. I'm pretty sure the problem is the iteration: I haven't coded the loop over the folder's contents correctly. I am new to Python, and I did manage to print the list of files in the folder using the commented-out line. Can someone please advise what needs to be fixed?

My error is: raise XLRDError('Unsupported format, or corrupt file: ' + msg)

from xlrd import open_workbook
import csv
import glob
import os
import openpyxl

cwd= os.getcwd()
print (cwd)

FileList = glob.glob('*.xlsx')
#print(FileList)

for i in FileList:
    rb = open_workbook(i)
    wb = copy(rb)
    wb.save('new_document.csv')

One quick way is to use pandas: import pandas as pd; pd.read_excel("<file_path>").to_csv("<output_path>", index=False)
– sushanth, Commented Jun 24, 2020 at 11:24
I haven't been able to get this to work, in part because I want to iterate over the contents of a folder.
– Unicorn_tech, Commented Jun 24, 2020 at 11:26
It looks like it should work. Can you run print(list(FileList)) and post the output?
– Umar.H, Commented Jun 24, 2020 at 11:28
This is what I got when I tried your code above:
```
File "C:\Users\gittel\AppData\Local\Continuum\anaconda3\lib\site-packages\xlrd\book.py", line 1272, in bof_error
    raise XLRDError('Unsupported format, or corrupt file: ' + msg)
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'Country,'
```
– Unicorn_tech, Commented Jun 24, 2020 at 11:30
For print(list(FileList)):
```
runfile('F:/Design/Projects/Website/OPtimized Photos/Agent Area/ip Info reports/June 22 reports/trying again.py', wdir='F:/Design/Projects/Website/OPtimized Photos/Agent Area/ip Info reports/June 22 reports')
['Algeria.xlsx', 'Angola.xlsx', 'Argentina.xlsx', 'Australia.xlsx', 'Bosnia Herzegovina.xlsx', 'Brazil.xlsx', 'Chile.xlsx', 'China.xlsx', 'Colombia.xlsx', 'Ecuador.xlsx', 'Egypt.xlsx', 'EU.xlsx', 'Hong Kong.xlsx', 'India.xlsx', 'Italy.xlsx', 'Kenya.xlsx', 'Lebanon.xlsx', 'Mexico.xlsx', 'Morocco.xlsx']
```
– Unicorn_tech, Commented Jun 24, 2020 at 11:31

David Duarte · Accepted Answer · 2020-06-24 12:35:39Z

1

I would just use:

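import glob
import os

import pandas as pd

# Collect every .xlsx file in the current working directory
file_list = glob.glob('*.xlsx')

for file in file_list:
    # Swap only the extension, so each workbook gets its own CSV name
    csv_name = os.path.splitext(os.path.basename(file))[0] + '.csv'
    # read_excel loads the first sheet by default
    pd.read_excel(file).to_csv(csv_name, index=False)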
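
The loop in the question saves every workbook to the same 'new_document.csv', so each pass overwrites the previous one. As a minimal sketch of the per-file naming idea, assuming the CSVs should sit next to their source files, the pairing of inputs and outputs can be separated from the conversion itself. The names plan_conversions and convert_all are hypothetical helpers, not part of pandas:

```python
from pathlib import Path


def plan_conversions(folder="."):
    """Pair every .xlsx file in `folder` with its own .csv output path."""
    pairs = []
    for src in sorted(Path(folder).glob("*.xlsx")):
        # with_suffix swaps only the final extension, so a name such as
        # "xlsx report.xlsx" becomes "xlsx report.csv".
        pairs.append((src, src.with_suffix(".csv")))
    return pairs


def convert_all(folder="."):
    """Write the first sheet of each workbook in `folder` to a CSV beside it."""
    import pandas as pd  # imported here so the planning step works without pandas

    for src, dst in plan_conversions(folder):
        pd.read_excel(src).to_csv(dst, index=False)
```

Printing plan_conversions() first is a cheap way to confirm that the iteration visits the files you expect before any file is written.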

answered Jun 24, 2020 at 12:35

David Duarte


Leo Qi · Accepted Answer · 2020-06-24 13:17:43Z

1

It appears that your error is caused by the Excel files themselves, not by your code.

Check that your files aren't also open in Excel at the same time.
Check that your files aren't encrypted.
Check that your version of xlrd supports the files you are reading.

Work through these checks in the order given. Any of them could have caused your error.
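
The traceback quoted in the comments (Expected BOF record; found b'Country,') suggests that at least one file starts with plain text rather than a workbook signature, for example a CSV that was saved with an .xlsx extension. A rough sketch of how one might check the first bytes of each file before parsing it follows. The helper name sniff_format and its return labels are assumptions made for illustration; the two signatures are the standard ZIP and OLE2 compound-file headers:

```python
ZIP_MAGIC = b"PK\x03\x04"  # .xlsx files are ZIP containers
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls or password-protected workbooks


def sniff_format(path):
    """Guess what a file really contains from its first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(ZIP_MAGIC):
        return "xlsx"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    # Anything else, such as text beginning with "Country,", is not a workbook
    return "other"
```

A file reported as "other" is probably already text and may only need renaming or reading with a CSV reader, while "ole2" points to an old .xls file or an encrypted workbook.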

answered Jun 24, 2020 at 13:17

Leo Qi
