I want this script to open each Excel file in a folder and save it again as a .csv or .txt file. I'm fairly sure the problem is the iteration: I haven't written the loop so that it iterates properly over the contents of the folder. I am new to Python, and I did manage to print the list of items in the folder using the commented-out line. Can someone please advise what needs to be fixed?

My error is: raise XLRDError('Unsupported format, or corrupt file: ' + msg)

from xlrd import open_workbook
import csv
import glob
import os
import openpyxl

cwd= os.getcwd()
print (cwd)

FileList = glob.glob('*.xlsx')
#print(FileList)

for i in FileList:
    rb = open_workbook(i)
    wb = copy(rb)
    wb.save('new_document.csv')

  • One quick way is to use pandas: import pandas as pd; pd.read_excel("<file_path>").to_csv("<output_path>", index=False) Commented Jun 24, 2020 at 11:24
  • I haven't been able to get this to work, in part because I want to iterate over the contents of a folder. Commented Jun 24, 2020 at 11:26
  • It looks like it should work. Can you print(list(FileList))? Commented Jun 24, 2020 at 11:28
  • This is when I tried your code above: ``` File "C:\Users\gittel\AppData\Local\Continuum\anaconda3\lib\site-packages\xlrd\book.py", line 1272, in bof_error raise XLRDError('Unsupported format, or corrupt file: ' + msg) XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'Country,' ``` Commented Jun 24, 2020 at 11:30
  • For print(list(FileList)): ``` runfile('F:/Design/Projects/Website/OPtimized Photos/Agent Area/ip Info reports/June 22 reports/trying again.py', wdir='F:/Design/Projects/Website/OPtimized Photos/Agent Area/ip Info reports/June 22 reports') ['Algeria.xlsx', 'Angola.xlsx', 'Argentina.xlsx', 'Australia.xlsx', 'Bosnia Herzegovina.xlsx', 'Brazil.xlsx', 'Chile.xlsx', 'China.xlsx', 'Colombia.xlsx', 'Ecuador.xlsx', 'Egypt.xlsx', 'EU.xlsx', 'Hong Kong.xlsx', 'India.xlsx', 'Italy.xlsx', 'Kenya.xlsx', 'Lebanon.xlsx', 'Mexico.xlsx', 'Morocco.xlsx'] ``` Commented Jun 24, 2020 at 11:31

2 Answers

I would just use:

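import glob
import os

import pandas as pd

# Collect every .xlsx file in the current working directory.
file_list = glob.glob('*.xlsx')

for file in file_list:
    # Swap only the extension, so an 'xlsx' elsewhere in the name is left alone.
    base = os.path.splitext(os.path.basename(file))[0]
    pd.read_excel(file).to_csv(base + '.csv', index=False)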
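If one workbook in the folder is unreadable, the loop above stops at the first exception without saying which file caused it. Here is a minimal sketch of a more forgiving loop, assuming pandas and an .xlsx engine such as openpyxl are installed. The names `excel_to_csv` and `convert_all` are hypothetical helpers made up for this illustration, not part of pandas.

```python
import os


def excel_to_csv(src, dst):
    """Convert one workbook to CSV (hypothetical helper)."""
    # Imported lazily; assumes pandas plus an .xlsx engine such as openpyxl.
    import pandas as pd
    pd.read_excel(src).to_csv(dst, index=False)


def convert_all(paths, convert=excel_to_csv):
    """Convert each path; return the outputs written and the files that failed."""
    converted, failed = [], []
    for src in paths:
        # Replace only the extension, never text elsewhere in the name.
        dst = os.path.splitext(src)[0] + ".csv"
        try:
            convert(src, dst)
            converted.append(dst)
        except Exception as exc:
            # Record the failure and keep going with the remaining files.
            failed.append((src, exc))
    return converted, failed
```

Calling `convert_all(glob.glob('*.xlsx'))` (after `import glob`) and then printing the second returned list should show exactly which files raised, along with each exception.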

Your error appears to be caused by the Excel files themselves, not by your code.

  • Check that your files aren't also open in Excel at the same time.
  • Check that your files aren't encrypted.
  • Check that your version of xlrd supports the files you are reading.

Work through these checks in the order given. Any of them could have caused your error.
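A quick way to narrow this down is to look at the first few bytes of each file. A genuine .xlsx file is a ZIP archive and starts with `PK`, while legacy .xls files and password-protected workbooks use the OLE2 container format. The traceback in the question's comments reports `found b'Country,'`, which suggests that at least one file is plain text with an .xlsx extension. The sketch below is only a heuristic, and `classify_header` and `classify_file` are hypothetical names chosen for this example.

```python
ZIP_MAGIC = b"PK\x03\x04"  # ZIP archive signature, used by .xlsx
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE2 container: legacy .xls or encrypted workbook


def classify_header(head):
    """Guess the container format from the leading bytes of a file."""
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    # Anything else, including plain CSV text, ends up here.
    return "other"


def classify_file(path):
    """Read the first 8 bytes of `path` and classify them."""
    with open(path, "rb") as fh:
        return classify_header(fh.read(8))
```

Running `classify_file` over every name returned by `glob.glob('*.xlsx')` and printing the ones that do not come back as `"zip"` should point to the files worth opening by hand.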
