I have some Excel files with measurements exported from National Instruments' LabView. I'm trying to use pandas to edit the data, but calling read_excel on those files raises TypeError: expected <class 'openpyxl.styles.fills.Fill'>.

The strange part is that if I open a file by hand and click save, without changing anything, read_excel can suddenly read it. Unfortunately, there are too many files to resave by hand. Does anyone have any idea how to solve this problem? I've searched a lot and found nothing yet. Thanks!

Edit:

The code I'm using is the following.

import os

import pandas as pd

fname = 'C'  # All the files I want to open start with C
fextension = '.xlsx'
directory = 'D:/TEST_Raw'

df_list = []
for filename in os.listdir(directory):
    if filename.startswith(fname) and filename.endswith(fextension):
        # This call raises the TypeError on the files written by LabView
        df1 = pd.read_excel(os.path.join(directory, filename), header=0, index_col=None, engine='openpyxl')
        df_list.append(df1)

An example file is in this link. With this file the program fails with the error above, but if I open the file in Excel and save it, the program runs.

  • Can you please post the code you are using along with one of the Excel files? Commented Aug 3, 2021 at 19:28
  • I have edited the post with the file and the code :) Commented Aug 3, 2021 at 19:59
  • Thanks! Yes, I've been able to reproduce the error. I have not found a solution myself, but I have come across multiple people having the same problem. As a first guess, it seems to be a problem with how LabView creates the xlsx, so I would suggest contacting them first. I found someone suggesting that they might be using something like Apache POI to create the files with an invalid style. Commented Aug 3, 2021 at 20:36
  • If I come across something, I'll let you know. Sorry! Commented Aug 3, 2021 at 20:36

1 Answer

It seems the source file is corrupt to the point that the standard ways of opening it (e.g., pd.read_excel() or pd.ExcelFile()) do not work. If there are too many files to open and save manually, try a non-standard way of opening the file.

One idea is using the code from: https://blog.adimian.com/2018/09/04/fast-xlsx-parsing-with-python/ (there may be better ways out there).

I tested the sample file using the code from blog.adimian.com (see the Full Code section right at the bottom of the page) and it seems to work. However, the column names are missing and need to be set manually. If the column names are all the same you could loop this for all the files.

Example output: [screenshot not preserved]
