I have a bunch of purchase orders in .html format that I need to extract data from and put into one simple Excel sheet. While I could use BeautifulSoup to do it, I would rather just use Excel's built-in converter, which already does a much better job, and then work with the Excel files directly. Is there a way to use Python to open the HTML documents and save them again as .xlsx? I tried using openpyxl, but it does not read HTML files.
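
To make it concrete, here is a rough sketch of the kind of thing I am after. It assumes Windows with Excel installed and the pywin32 package available, and it drives Excel itself through COM so that Excel's own HTML import is used. The helper names (`xlsx_path_for`, `convert_folder`) are made up for illustration, and I have not checked how it behaves on every kind of HTML file.

```python
from pathlib import Path

# Excel's xlOpenXMLWorkbook file-format constant, i.e. a plain .xlsx workbook.
XL_OPEN_XML_WORKBOOK = 51


def xlsx_path_for(html_path):
    """Return the absolute .xlsx path that sits next to the given .html file."""
    return Path(html_path).resolve().with_suffix(".xlsx")


def convert_folder(folder):
    """Open every .html file in `folder` with Excel and re-save it as .xlsx.

    Assumption: Windows, Excel installed, pywin32 installed.
    """
    # Imported here so the module still loads on machines without pywin32.
    import win32com.client

    excel = win32com.client.Dispatch("Excel.Application")
    excel.Visible = False
    excel.DisplayAlerts = False  # suppress prompts such as overwrite warnings
    try:
        for html_file in sorted(Path(folder).glob("*.html")):
            # Excel's COM interface expects absolute paths.
            workbook = excel.Workbooks.Open(str(html_file.resolve()))
            workbook.SaveAs(
                str(xlsx_path_for(html_file)),
                FileFormat=XL_OPEN_XML_WORKBOOK,
            )
            workbook.Close(False)  # already saved, so discard further changes
    finally:
        excel.Quit()  # always shut down the hidden Excel instance
```

Calling `convert_folder("purchase_orders")` (a hypothetical folder name) would then leave one .xlsx file next to each .html file, which openpyxl should be able to open afterwards.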