I have an np.array with over 330,000 rows. I simply try to take the average of it and it returns NaN. Even if I try to filter out any potential NaN values in my array (there shouldn't be any anyway), average returns NaN. Am I doing something totally wacky?

My code is here:

average(ngma_heat_daily)
Out[70]: nan

average(ngma_heat_daily[ngma_heat_daily != nan])
Out[71]: nan

2 Answers

Try this:

>>> np.nanmean(ngma_heat_daily)

This function drops NaN values from your array before taking the mean.
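To see the difference on a small scale, here is a minimal sketch. The array `data` is a made-up stand-in for `ngma_heat_daily`, not the asker's data. A single NaN is enough to make the plain mean come out as NaN, while `np.nanmean` skips it:

```python
import numpy as np

# Hypothetical stand-in for ngma_heat_daily: a float array with one NaN in it.
data = np.array([1.0, 2.0, np.nan, 4.0])

# The plain mean sums every element, so the NaN propagates into the result.
plain_mean = np.mean(data)

# nanmean ignores NaN entries and averages the remaining three values.
nan_aware_mean = np.nanmean(data)

print(plain_mean)
print(nan_aware_mean)
```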

Edit: the reason that average(ngma_heat_daily[ngma_heat_daily != nan]) doesn't work is this:

>>> np.nan == np.nan
False

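According to the IEEE floating-point standard, NaN is not equal to itself! That means ngma_heat_daily != nan is True for every element, NaN or not, so the filter removes nothing. You could do this instead to implement the same idea: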

>>> average(ngma_heat_daily[~np.isnan(ngma_heat_daily)])

np.isnan, np.isinf, and similar functions are very useful for this type of data masking.
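As a rough illustration of that kind of masking, here is a short sketch on a made-up array (`data` and the other variable names are hypothetical). It also counts the NaN entries, which can help confirm whether an array that "shouldn't" contain any actually does:

```python
import numpy as np

# Hypothetical sample with one NaN and one infinity.
data = np.array([1.0, np.nan, 3.0, np.inf])

# Comparing against NaN is True everywhere, so this "filter" keeps all four values.
wrong_filter = data[data != np.nan]

# np.isnan builds a boolean mask; ~ inverts it to select the non-NaN values.
nan_mask = np.isnan(data)
right_filter = data[~nan_mask]

# np.isfinite excludes both NaN and +/-inf in one step.
finite_only = data[np.isfinite(data)]

# Summing the mask counts how many NaN values are present.
nan_count = int(nan_mask.sum())

print(wrong_filter.size, right_filter.size, finite_only.mean(), nan_count)
```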

1 Comment

That worked! Could you explain what the issue is with just taking the average? I have another array of 16k entries (produced from the same original source) that worked fine with the average() method.
Also, there is a function named nanmedian, which ignores NaN values when computing the median. Its signature is: numpy.nanmedian(a, axis=None, out=None, overwrite_input=False, keepdims=<no value>)
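A small sketch of how it behaves, using a made-up 2-D array (the name `data` is only for illustration), both over the whole array and along one axis:

```python
import numpy as np

# Hypothetical 2x3 sample with one NaN in each row.
data = np.array([[1.0, np.nan, 3.0],
                 [4.0, 5.0, np.nan]])

# The plain median returns NaN as soon as any NaN is present.
plain_median = np.median(data)

# Median of the non-NaN values 1, 3, 4, 5.
overall_median = np.nanmedian(data)

# axis=1 computes one median per row, ignoring the NaN in each.
row_medians = np.nanmedian(data, axis=1)

print(plain_median, overall_median, row_medians)
```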
