I have generated an Excel file, with the following columns (cf. minimal code below):

  • ID (col. A): set via hardcoded values (i.e. not via Excel formula)
  • Manually-set resolved (col. B): a column that should either be left blank or set TRUE by the user
  • Resolved (col. D): a column indicating whether "something" (represented by the row) has been "resolved"; it should be TRUE iff:
    • Manually-set resolved is TRUE
    • and/or Any in ID resolved (see directly below) is TRUE
  • Any in ID resolved (col. C): a column indicating whether any of the rows sharing the ID value have Resolved equal TRUE

Formulas are OR-based for column Resolved, MAXIFS-based for column Any in ID resolved (cf. minimal code below).

Example formulas (note: German locale for installed Excel - does not matter; English formula names are recognized):

  • Resolved cell: D2=ODER(B2=WAHR;C2=WAHR)
  • Any in ID resolved cell: C2=@MAXIFS($D$2:$D$11;$A$2:$A$11;A2)

Issue:

I would expect that at the beginning both the columns Resolved and Any in ID resolved are all FALSE and that values are auto-set by the formulas to TRUE as soon as the user sets the column Manually-set resolved to TRUE.

Unfortunately, that is not what happens - rather, when opening the generated Excel file, the cells all display #NAME? (cf. screenshot below), as if there was an issue with the formulas - but I cannot figure out what it is. And sometimes, when fiddling around with the generated Excel, 0 is displayed (which can be seen as representing FALSE) - but then column Resolved does not auto-update when Manually-set resolved is set to TRUE.

Screenshot of issue

Notes:

  • Excel 365 is used - version 2502, so MAXIFS can be used (is available)
  • Excel also warns about circularity - and yes, the columns Resolved and Any in ID resolved refer to each other - but it should NOT be an issue, since Manually-set resolved should break any such circularity issues via first setting Resolved to TRUE and based on that Any in ID resolved to TRUE.

Questions:

  • Why do the Resolved and Any in ID resolved columns display #NAME? rather than FALSE?
  • Why does the Resolved column not auto-update when the Manually-set resolved column is set to TRUE?

Minimal example code to generate a test Excel file:

from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.title = "Example"

# Column headers
ws['A1'] = "ID"
ws['B1'] = "MANUALLY_SET_RESOLVED"
ws['C1'] = "ANY_IN_ID_RESOLVED"
ws['D1'] = "RESOLVED"

num_rows = 10

# Populate example data for ID (e.g. 3 groups: 1,2,3 repeated)
for row in range(2, num_rows + 2):
    ws[f"A{row}"] = (row - 2) % 3 + 1  # Cycle through 1,2,3

    # MANUALLY_SET_RESOLVED starts empty (FALSE/blank)
    ws[f"B{row}"] = None

    # ANY_IN_ID_RESOLVED: MAXIFS over RESOLVED column for group
    ws[f"C{row}"].value = f'=MAXIFS($D$2:$D${num_rows + 1},$A$2:$A${num_rows + 1},A{row})'

    # RESOLVED: OR of MANUALLY_SET_RESOLVED and ANY_IN_ID_RESOLVED
    ws[f"D{row}"].value = f'=OR(B{row}=TRUE,C{row}=TRUE)'

wb.save("example_file.xlsx")

Update for @MGonet:

The MAXIFS appears to have issues somehow even with numbers rather than booleans - see this screenshot:

enter image description here

But when using MAXWENNS instead, the formula returns a result - not the correct one though, since it will always return 0 for any booleans (even TRUE). This indicates that there might be a localization issue, and that booleans might need to be converted to numbers via -- inside the formula:

enter image description here

Also, in such a case, without manually overwriting the Resolved column with booleans, MAXWENNS returns 0, Excel displays circularity warnings and visualizes the circularity with a blue-red double arrow (cf. screenshot), and setting Manually-set resolved to TRUE does NOT update Resolved, although it should.

enter image description here

Then, only when double-clicking into a Resolved cell and pressing Enter (but I want automatic recalculation) is the value set, and it is set to 0, which is incorrect, since it should be TRUE. I actually do not understand how =OR(TRUE;0) returns 0 rather than TRUE; that makes no sense to me. Apart from that, returning 0 for =NOT(0) is also plainly wrong:

enter image description here enter image description here

  • Instead of sharing a code to produce a file containing the data and formulas related to your problem, post these as code and markdown table, so we and future visitors with a similar problem can recognize their problem from the question. Commented Oct 11 at 7:11
  • @P.b: I added example formulas to my post - but I think having the actual code to reproduce creating the file is useful; of course I can also upload the generated Excel file to some cloud location and link it here if that helps. Commented Oct 11 at 13:10

2 Answers

Answer to the question about the #NAME? error:

The Excel function MAXIFS was introduced after the Office Open XML file format was published in 2007, so not all spreadsheet applications support it. To mark this, such functions are stored in the file with the prefix _xlfn.

In your case, the Excel version supports MAXIFS, but the function is not marked with the _xlfn prefix in the file because openpyxl writes the formula string directly into file storage. Excel flags this problem with @ in the formula string in the GUI, and your German GUI does not translate the name to MAXWENNS. That is why your German Excel returns the #NAME? error when it tries to evaluate the formula.

Since openpyxl writes directly into file storage, the prefix must be set in the formula string. The formula =...MAXIFS(... is wrong; the formula =..._xlfn.MAXIFS(... is correct.

So in your code example

...
ws[f"C{row}"].value = f'=_xlfn.MAXIFS($D$2:$D${num_rows + 1},$A$2:$A${num_rows + 1},A{row})'
...

will avoid the #NAME? errors.
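If several formulas are generated, the prefix can be added in one place instead of by hand. The following is only a sketch: add_xlfn_prefix and FUTURE_FUNCTIONS are hypothetical names, not part of openpyxl, and the function list is a small, non-exhaustive sample of functions added after Excel 2007. It uses plain string processing, so it would also rewrite a matching name inside a quoted text literal.

```python
import re

# Assumption: a small, non-exhaustive sample of functions that Excel
# stores with the _xlfn. prefix; extend it as needed.
FUTURE_FUNCTIONS = {"MAXIFS", "MINIFS", "IFS", "CONCAT", "TEXTJOIN", "SWITCH"}

# A function name: upper-case identifier directly followed by "(",
# not preceded by a word character or a dot (so "_xlfn.MAXIFS(" is skipped).
_FUNCTION_NAME = re.compile(r"(?<![\w.])([A-Z][A-Z0-9.]*)(?=\()")


def add_xlfn_prefix(formula: str) -> str:
    """Return the formula with _xlfn. added to the listed newer functions."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        return f"_xlfn.{name}" if name in FUTURE_FUNCTIONS else name

    return _FUNCTION_NAME.sub(replace, formula)


if __name__ == "__main__":
    print(add_xlfn_prefix("=MAXIFS($D$2:$D$11,$A$2:$A$11,A2)"))
    print(add_xlfn_prefix("=OR(B2=TRUE,C2=TRUE)"))
```

The returned string can then be assigned to a cell's .value as in the question's code. Applying the helper twice leaves an already prefixed formula unchanged.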


1 Comment

Amazing - you are absolutely correct. Now the main issue is that MAXIFS still does not process booleans correctly, but that is independent of the #NAME? issue.

The MAXIFS function looks for the maximum among the numbers and ignores the Boolean values.
I suggest that in column D you should put the formula:
=--B{row}
to convert Boolean values into numbers.
Then the result (in the form of numbers 0 or 1) will be in column C.
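To make the effect concrete, here is a rough Python model of the behavior described above. It is not Excel's implementation: maxifs_model is a hypothetical function that assumes MAXIFS only considers numeric cells and yields 0 when no numeric cell matches.

```python
def maxifs_model(max_range, criteria_range, criterion):
    """Model of MAXIFS as described above: Boolean cells are skipped."""
    numbers = [
        value
        for value, key in zip(max_range, criteria_range)
        # bool is a subclass of int in Python, so exclude it explicitly
        if key == criterion
        and isinstance(value, (int, float))
        and not isinstance(value, bool)
    ]
    # Assumption: with no numeric match the result is 0
    return max(numbers) if numbers else 0


ids = [1, 2, 1]
resolved = [True, False, True]

# Booleans are ignored, so the group with ID 1 yields 0
as_booleans = maxifs_model(resolved, ids, 1)

# After a conversion like "--", the same data yields 1
as_numbers = maxifs_model([int(v) for v in resolved], ids, 1)
```

In this model, as_booleans is 0 and as_numbers is 1, which matches the 0 results reported in the question for Boolean inputs.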

Added in reply to comments

I still don't understand why you use formulas with circular references. In addition, I see that Python has a problem with the relatively new MAXIFS function. Therefore, you can use the old SUMIFS function instead. In my opinion, it is necessary to convert the Boolean value in column B to a number. This would be the new column C.
If you just need information if any row with a given ID has been selected, then in column D you can use the formula:
=SUMIFS($C$2:$C$11,$A$2:$A$11,A2)>0
If you need separately information about whether any other row with the same ID has been selected, then in column D you can use:
=SUMIFS($C$2:$C$11,$A$2:$A$11,A2)>C2
and in column E, combine columns B and D:
=OR(B2,D2)

Or yet another variant with the COUNTIFS function. In this case, you don't need to convert Boolean values into numbers in a separate column.
If you just need information if any row with a given ID has been selected, then in column C you can use the formula:
=COUNTIFS($B$2:$B$11,TRUE,$A$2:$A$11,A2)>0
If you need separately information about whether any other row with the same ID has been selected, then in column C you can use:
=COUNTIFS($B$2:$B$11,TRUE,$A$2:$A$11,A2)>--B2
and in column D combine columns B and C:
=OR(B2,C2)
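For generating the file from Python, the first COUNTIFS variant can be built per row as plain strings. This is a sketch under assumptions: countifs_formula and any_in_id_resolved are hypothetical helper names, the column layout (A = ID, B = manually set flag) follows the question, and the second function is only a Python model of what the formula should evaluate to, not something Excel runs.

```python
def countifs_formula(row: int, first_row: int, last_row: int) -> str:
    """Non-circular 'any in ID resolved' formula for one row (column C)."""
    flags = f"$B${first_row}:$B${last_row}"
    ids = f"$A${first_row}:$A${last_row}"
    return f"=COUNTIFS({flags},TRUE,{ids},A{row})>0"


def any_in_id_resolved(ids, manual_flags):
    """Python model of the formula: TRUE if any row with the same ID is flagged."""
    flagged_ids = {i for i, flag in zip(ids, manual_flags) if flag is True}
    return [i in flagged_ids for i in ids]


if __name__ == "__main__":
    # Same ID pattern as in the question: 1, 2, 3 repeated
    ids = [(n % 3) + 1 for n in range(6)]
    manual = [None, True, None, None, None, None]
    print(countifs_formula(2, 2, 11))
    print(any_in_id_resolved(ids, manual))
```

Because the formula reads only columns A and B, it never refers back to its own result, so no circular reference arises. COUNTIFS predates the newer functions, so no _xlfn prefix should be needed here.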

7 Comments

I thought that MAXIFS would be able to process booleans (internally converting them to 0 and 1, respectively) - but, surprisingly, if I manually overwrite the cell values of column Resolved with 0, the MAXIFS still does not work, displaying #NAME?. I added a short section at the end of my post where you'll see a screenshot.
However, note that MAXWENNS(), i.e. the German variant appears to work - and NOT only with numbers, but also with booleans - see also post update.
Also very weird behavior that cell values for Resolved are re-calculated only upon selecting the cell formula and hitting Enter - and that a wrong result is displayed.
I use the Polish version of Excel, but without Python, and in my version the MAXIFS function definitely ignores Boolean values. These strange results you observe are a consequence of unresolved circular references in the worksheet. In these cases, formulas often return 0 regardless of the actual values.
You are correct - MAXIFS does NOT work with booleans, in the sense that it always returns 0 (even though it should sometimes return True or 1); I just checked this with some cells entered in a blank sheet - and updated my post accordingly. But this can be avoided via the -- operator as you mentioned. The circular reference issue is the one I actually don't understand.
I don't understand why you write your formulas with circular references when they can be avoided. Of course, you can agree to iterative calculations, but in this case it will not help when it comes to recalculating formulas, because in the case of cyclic references after editing, the formula recalculation starts from 0, and without editing – from the last calculated value.
How to avoid the circularity? Of course, we could exclude the cell itself from "Any in ID resolved" (basically sth. like "Any other in ID resolved"). ChatGPT had suggested to apply a MAX(FILTER(()) variant for that, e.g. =MAX(FILTER($D$2:$D$10, ($A$2:$A$10=$A2)*(ROW($A$2:$A$10)<>ROW(A2)))), but when generating the file from python, Excel removed the formulas. According to ChatGPT, "While Excel 365 and 2021+ support FILTER, .xlsx does not officially support “dynamic array/spilled” formulas as a flag in the file structure (yet), and openpyxl cannot set the "spilled array formula" pro
