I'm new to Python. I have an XML file that I can open easily in Excel, and I want to convert it to .xlsx or any compatible format so that I can read it with the openpyxl module. Is there any way to do this in Python? Any advice would be appreciated, thank you.

  • Not Python, but a command-line option: stackoverflow.com/questions/30349542/… Commented Sep 29, 2020 at 11:26
  • xlsx is XML - it's a ZIP package containing XML documents. Where did this XML file come from? Why not modify the code that generated this to create a proper xlsx file? Commented Sep 29, 2020 at 11:42
  • @TinNguyen how is this related to this question? The linked question is about using LibreOffice to convert documents. This question asks how to convert what looks like a 2003 XML Excel file into the current Excel format Commented Sep 29, 2020 at 11:44
  • @PanagiotisKanavos LibreOffice Calc is an open-source alternative to Microsoft Excel. It can open both Excel XML and Excel XLSX files and save them in different file formats. It also has a command-line interface for file conversion. I would link Microsoft Excel directly, but from my quick research it doesn't offer a CLI for that. Commented Sep 29, 2020 at 11:48
  • SpreadsheetML was used only briefly, between 2003 and 2006, as a stop-gap due to patent problems. It was replaced by xlsx in 2006. Very few libraries, if any, support it directly. Unless you have to read 15-year-old files, it's better to use a library like xlsxwriter to write proper xlsx files. Commented Sep 29, 2020 at 11:50
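The LibreOffice route mentioned in the comments can also be driven from Python. This is only a rough sketch: it assumes LibreOffice is installed and its `soffice` executable is on the PATH, and the helper names `build_convert_command` and `convert_to_xlsx` are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_convert_command(source, outdir="."):
    """Build the LibreOffice command line for a headless conversion to xlsx."""
    # Assumption: the executable is called "soffice" and is on the PATH.
    return ["soffice", "--headless", "--convert-to", "xlsx",
            "--outdir", str(outdir), str(source)]


def convert_to_xlsx(source, outdir="."):
    """Run the conversion and return the path where the .xlsx should appear."""
    subprocess.run(build_convert_command(source, outdir), check=True)
    return Path(outdir) / (Path(source).stem + ".xlsx")
```

Calling `convert_to_xlsx("Your_downloaded_file.xml")` should leave a `Your_downloaded_file.xlsx` in the current directory, provided LibreOffice recognises the input format.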

1 Answer

I had the same problem. This answer helped me: "Attempting to Parse an XLS (XML) File Using Python".

You can save your Excel-style XML as .xlsx with the Workbook.SaveAs method via win32com (Windows only, and Excel must be installed), then read it with pandas.read_excel:

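import os

import win32com.client
import pandas as pd

# Excel resolves relative paths against its own working directory,
# so pass absolute paths to Open and SaveAs.
original_file = os.path.abspath("Your_downloaded_file.xml")
output = os.path.abspath("Your_converted_file.xlsx")

xlApp = win32com.client.Dispatch("Excel.Application")
try:
    xlWbk = xlApp.Workbooks.Open(original_file)
    xlWbk.SaveAs(output, 51)  # 51 = xlOpenXMLWorkbook (.xlsx)
    xlWbk.Close(False)        # already saved, so no need to save again
finally:
    xlApp.Quit()              # always shut down the Excel process

output_df = pd.read_excel(output)
print(output_df.columns.tolist())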
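If Excel is not available (for example on Linux or macOS), the same conversion can be approximated in pure Python. The sketch below is not a complete converter. It assumes the file is Excel 2003 XML (SpreadsheetML), reads only cell values (no styles or formulas, and row-level `ss:Index` is ignored), and uses made-up helper names `read_spreadsheetml` and `write_xlsx`. Writing needs the third-party openpyxl package.

```python
import xml.etree.ElementTree as ET

# Namespace used by Excel 2003 XML (SpreadsheetML) files.
SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}


def read_spreadsheetml(xml_text):
    """Return {sheet_name: [[cell, ...], ...]} parsed from SpreadsheetML text."""
    root = ET.fromstring(xml_text)
    sheets = {}
    for ws in root.findall("ss:Worksheet", NS):
        rows = []
        for row in ws.findall("ss:Table/ss:Row", NS):
            values = []
            for cell in row.findall("ss:Cell", NS):
                # A 1-based ss:Index means some empty cells were skipped.
                index = cell.get(f"{{{SS}}}Index")
                if index is not None:
                    values.extend([None] * (int(index) - 1 - len(values)))
                data = cell.find("ss:Data", NS)
                if data is None:
                    values.append(None)
                elif data.get(f"{{{SS}}}Type") == "Number" and data.text:
                    values.append(float(data.text))
                else:
                    values.append(data.text)
            rows.append(values)
        sheets[ws.get(f"{{{SS}}}Name")] = rows
    return sheets


def write_xlsx(sheets, path):
    """Write the parsed sheets to an .xlsx file (expects at least one sheet)."""
    from openpyxl import Workbook  # third-party: pip install openpyxl

    wb = Workbook()
    wb.remove(wb.active)  # drop the default empty sheet
    for name, rows in sheets.items():
        ws = wb.create_sheet(title=name)
        for row in rows:
            ws.append(row)
    wb.save(path)


# Tiny hand-written sample, only to show the expected input shape.
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Sheet1">
    <Table>
      <Row><Cell><Data ss:Type="String">name</Data></Cell><Cell><Data ss:Type="String">qty</Data></Cell></Row>
      <Row><Cell><Data ss:Type="String">apple</Data></Cell><Cell><Data ss:Type="Number">3</Data></Cell></Row>
    </Table>
  </Worksheet>
</Workbook>"""
```

For a real file, something like `write_xlsx(read_spreadsheetml(open("Your_downloaded_file.xml", encoding="utf-8").read()), "Your_converted_file.xlsx")` should produce a workbook that openpyxl or pandas.read_excel can open. The encoding is a guess and may need adjusting.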
