6

I want to convert all the excel document(.xls) sheets into csv, If excel document has one sheet only then I am converting like as follow-

   wb = open_workbook(path1)
    sh = wb.sheet_by_name('Sheet1')
    csv_file = open(path2, 'w')
    wr = csv.writer(csv_file, quoting=csv.QUOTE_ALL)
    for rownum in range(sh.nrows):
        wr.writerow(sh.row_values(rownum))
    csv_file.close()

If my excel(.xls) document have more than one sheet i.e.('Sheet1', 'Sheet2', 'Sheet3', 'Sheet4') than how to convert all sheets into csv.

Any help would be appreciated.

1
  • for newbies like me, the solution needs "pip" to be installed >> sudo apt install pip. after this "pip install pandas", and after that "pip install openpyxl", then you are ok following the code written in answers. Commented Feb 26, 2022 at 7:30

7 Answers 7

9

My understanding is that you're trying to get one CSV file for each sheet.

You can obtain that by executing the following:

excel_file = 'data/excel_file.xlsx'
all_sheets = pd.read_excel(excel_file, sheet_name=None)
sheets = all_sheets.keys()

for sheet_name in sheets:
    sheet = pd.read_excel(excel_file, sheet_name=sheet_name)
    sheet.to_csv("data/%s.csv" % sheet_name, index=False)

If you actually want to concatenate all sheets to one CSV, they all need to have the same column names. You can concatenate all your CSV files into one by executing the following:

import glob
import os
all_files = glob.glob(os.path.join("data", "*.csv"))
df_from_each_file = (pd.read_csv(f, sep=',') for f in all_files)
df_merged = pd.concat(df_from_each_file, ignore_index=True)
df_merged.to_csv( "data/merged.csv")

Source for the second snippet

Sign up to request clarification or add additional context in comments.

1 Comment

You are reading the excel file multiple times on loop, which has severe overhead.
8

I am using python3.x in Anaconda environment and In my case file name is 'INDIA-WMS.xlsx' having 40 different sheets below code will create 40 different csv files named as sheet name of excel file, as 'key.csv'. Hope this will help your issue.

    import pandas as pd
    df = pd.read_excel('INDIA-WMS.xlsx', sheet_name=None)  
    for key in df.keys(): 
        df[key].to_csv('%s.csv' %key)

For example if you have different sheets like 'Sheet1', 'Sheet2', 'Sheet3' etc. then above code will create different csv file as 'Sheet1.csv', 'Sheet2.csv', 'Sheet3.csv'. Here 'key' is the sheet name of your excel workbook. If you want to use data content inside sheets you can use the for loop as for key, value in df.items():

2 Comments

Thanks @Ashu007 but I get "TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed" when trying the loop.
I had to change df.items() to df.keys() as per answer below by @sclark
4

wb.sheet_names() to get all the sheet names, and then loop it and dynamically put the name in the sheet_name

Comments

3

I followed the solution by Ashu007, but on Python3.9 and Pandas 1.2.0 I needed to change df.items() to df.keys() like so:

import pandas as pd
df = pd.read_excel('file_name.xlsx', sheet_name=None)  
for key in df.keys(): 
    df[key].to_csv('{}.csv'.format(key))

1 Comment

Thanks, I was wondering why Ashu007's code was not working. Thanks for the update.
2

You can try the below code this worked for me.

import pandas as pd
data = pd.read_excel('sample1.xlsx', sheet_name=None)

# loop through the dictionary and save csv
for sheet_name, df in data.items():
df.to_csv(f'{sheet_name}.csv')

Comments

1

I ran into a similar issue of trying to list multiple excel sheets within an excel file into one excel sheet before converting to .csv. Please note that the term 'PC' and 'PC_City.xlsx' are just labels of the precipitation data I am working with.

This is what worked for me:

import pandas as pd

excel_file = r'C:\Users\yourpath\PC_City.xlsx'
df = pd.read_excel(excel_file, sheetname=None)
xlsx = pd.ExcelFile(excel_file)
PC_sheets = []
for sheet in xlsx.sheet_names:
    PC_sheets.append(xlsx.parse(sheet))
    PC = pd.concat(PC_sheets)

PC.to_csv('PC_City.csv', encoding='utf-8', index=False)   

I am new to programming, so there may be a better way to go about this. Hope this helps.

Comments

0
import pandas as pd

df = pd.read_excel('data.xlsx', sheet_name=None)  
for key in df: 
   df[key].to_csv('%s.csv' %key)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.