Perform numpy operation with None/NaN in array

Question

Is there a way to make this work? The array I'm working with contains None, which means that value should be ignored during processing. For example, I would like to normalize this array:

output = np.array([[1,2,None,4,5],[None,7,8,9,10]])
mu = np.mean(output, axis=(0,1), keepdims=True)
sd = np.std(output, axis=(0,1), keepdims=True)
normalized_output = (output - mu)/sd

Expected outcome:

array([[-1.5666989 , -1.21854359, None, -0.52223297, -0.17407766],
       [ None,  0.52223297,  0.87038828,  1.21854359,  1.5666989 ]])

Edit: As suggested, it is better to use NaN instead of None. How can I get this to work with NaN?

output = np.array([[1,2,np.nan,4,5],[np.nan,7,8,9,10]])
mu = np.mean(output, axis=(0,1), keepdims=True)
sd = np.std(output, axis=(0,1), keepdims=True)
normalized_output = (output - mu)/sd
print(normalized_output)
# array([[nan, nan, nan, nan, nan],
#        [nan, nan, nan, nan, nan]])

If you have None in your vector, this is a very bad sign: it means the values in the array are of type object, so none of the related computations are optimized. Consider using NaN values, which are native.
– Jérôme Richard, Commented Apr 25, 2021 at 15:00
Thanks for your input. I didn't know None is bad for vectors. I can use where to change it to NaN. Updated the question with your suggestion.
– Jingles, Commented Apr 25, 2021 at 15:22
If you want to keep your values as integers, use a masked array instead of NaN. I regard NaN as the result of a bad computation (0/0 for example), while a masked value indicates the absence of the value: two different things. NaN is often used for both, but that can lead to confusion.
– 9769953, Commented Apr 25, 2021 at 15:31
NaNs are also taken into account when calculating, for example, a mean value. There are special nanmean functions, but here, I think a masked array is more appropriate.
– 9769953, Commented Apr 25, 2021 at 15:32
Does this answer your question? NumPy: calculate averages with NaNs removed
– 9769953, Commented Apr 25, 2021 at 15:38
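
A minimal sketch of the None-to-NaN conversion discussed in the comments above, assuming the data starts out as nested Python lists containing None. Passing dtype=float to np.array stores each None as NaN, which avoids an object-dtype array. The names raw, as_object, and arr are illustrative only.

```python
import numpy as np

raw = [[1, 2, None, 4, 5], [None, 7, 8, 9, 10]]

# Without an explicit dtype, the None entries force the generic object dtype.
as_object = np.array(raw)
print(as_object.dtype)  # object

# With dtype=float, NumPy stores each None as NaN.
arr = np.array(raw, dtype=float)
print(arr.dtype)        # float64
print(np.isnan(arr))    # True only at the two positions that held None
```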

Oli · Accepted Answer · 2021-04-25 15:32:39Z

You can do calculations that skip over certain values by using NumPy masked arrays.

A function already exists to create a masked array that masks NaN values: ma.masked_invalid.

It can be used like so:

import numpy as np
from numpy import ma


output = ma.masked_invalid([[1,2,np.nan,4,5],[np.nan,7,8,9,10]])

mu = np.mean(output, axis=(0,1), keepdims=True)
sd = np.std(output, axis=(0,1), keepdims=True)
normalized_output = (output - mu)/sd
print(normalized_output)

Output (-- represents an invalid value):

[[-1.5461980716652028 -1.2206826881567392 -- -0.5696519211398116
  -0.24413653763134782]
 [-- 0.40689422938557973 0.7324096128940435 1.0579249964025073
  1.3834403799109711]]
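
If the data should stay integer, as one of the comments on the question suggests, a masked array can also be built from an explicit mask rather than from NaN. The following is a sketch under that assumption; the placeholder value 0 and the names raw, mask, data, masked, and plain are illustrative choices.

```python
import numpy as np
from numpy import ma

raw = [[1, 2, None, 4, 5], [None, 7, 8, 9, 10]]

# True marks the entries to ignore.
mask = np.array([[v is None for v in row] for row in raw])
# Put an arbitrary placeholder (0) where the value is missing; it is masked out.
data = np.array([[0 if v is None else v for v in row] for row in raw])

masked = ma.masked_array(data, mask=mask)
print(masked.dtype)   # an integer dtype; the exact width is platform-dependent

mu = masked.mean()    # statistics use only the 8 unmasked values
sd = masked.std()
normalized = (masked - mu) / sd

# Convert back to a plain float array with NaN in the masked positions.
plain = normalized.filled(np.nan)
print(plain)
```

The normalized result is necessarily floating point; only the input array keeps its integer dtype.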

Zalak Bhalani · Accepted Answer · 2021-04-25 15:34:56Z

You can use the np.nanmean and np.nanstd functions instead of np.mean and np.std:

output = np.array([[1,2,np.nan,4,5],[np.nan,7,8,9,10]])
mu = np.nanmean(output, axis=(0,1), keepdims=True)
sd = np.nanstd(output, axis=(0,1), keepdims=True)
normalized_output = (output - mu)/sd

You will get output like this:

array([[-1.54619807, -1.22068269,         nan, -0.56965192, -0.24413654],
       [        nan,  0.40689423,  0.73240961,  1.057925  ,  1.38344038]])

This differs from your expected output because np.nanmean and np.nanstd ignore the NaN values in the array, so the mean and standard deviation are computed over 8 elements instead of 10.
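
The difference can be checked numerically. In the sketch below, the array named full is an assumption: it fills the two missing positions with 3 and 6, which appears to be how the expected values in the question were produced, since normalizing it reproduces them.

```python
import numpy as np

output = np.array([[1, 2, np.nan, 4, 5], [np.nan, 7, 8, 9, 10]])
# Assumption: the question's expected values came from the complete 1..10 array.
full = np.arange(1, 11, dtype=float).reshape(2, 5)

# Statistics over the 8 non-NaN values: 5.75 and about 3.072.
print(np.nanmean(output), np.nanstd(output))
# Statistics over all 10 values: 5.5 and about 2.872.
print(full.mean(), full.std())

# Normalizing the complete array gives the values listed in the question.
print((full - full.mean()) / full.std())
```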

9769953 Over a year ago

Note that this changes the dtype of output from int64 to float64.
