
I have a list (of size 5) of NumPy arrays. The arrays inside the list all have different lengths. I need one array that holds the mean of the elements at each position and one array that holds the standard deviation at each position. Example:

[10, 10, 10, 10]
[ 8,  8,  8,  8, 8]
[12, 12, 12]

I want:

[10, 10, 10, 9, 8] and
[1.3, 1.3, 1.3, 1.1, 0]

(I made up the std devs)

Thanks in advance!

1 Answer

One way is to fill the empty places with NaNs, which gives a regular 2D array, and then use NumPy's NaN-aware functions, such as nanmean (computes the mean while skipping the NaNs), along the appropriate axis:

In [5]: import itertools
   ...: import numpy as np

# a is the input list of lists/arrays; row i of ar holds the i-th element of every array
In [48]: ar = np.array(list(itertools.zip_longest(*a, fillvalue=np.nan)))

In [49]: np.nanmean(ar,axis=1)
Out[49]: array([10., 10., 10.,  9.,  8.])

In [50]: np.nanstd(ar,axis=1)
Out[50]: array([1.63299316, 1.63299316, 1.63299316, 1.        , 0.        ])
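
For reference, the steps above can be put together as a standalone script. This is only a sketch: the list a below is a hypothetical input that mirrors the example in the question, and the names padded, means, and stds are illustrative.

```python
import itertools

import numpy as np

# Hypothetical input mirroring the example in the question.
a = [
    np.array([10, 10, 10, 10]),
    np.array([8, 8, 8, 8, 8]),
    np.array([12, 12, 12]),
]

# zip_longest groups the i-th elements of all arrays together and pads
# the shorter arrays with NaN, so row i holds every value at position i.
padded = np.array(list(itertools.zip_longest(*a, fillvalue=np.nan)))

# Reduce across each row (axis=1), ignoring the NaN padding.
means = np.nanmean(padded, axis=1)
stds = np.nanstd(padded, axis=1)  # population standard deviation (ddof=0)

print(means)
print(stds)
```

Here padded has shape (5, 3): one row per position and one column per input array, which is why the reduction runs along axis=1.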

Another way is to convert the list to a pandas DataFrame, so that the empty places are filled with NaNs, and then use DataFrame methods that skip the NaNs by default. Here ddof=0 is passed to std so that the result matches np.nanstd; pandas defaults to ddof=1:

In [16]: import pandas as pd

In [17]: df = pd.DataFrame(a)

In [18]: df.mean(0).values
Out[18]: array([10., 10., 10.,  9.,  8.])

In [19]: df.std(0,ddof=0).values
Out[19]: array([1.63299316, 1.63299316, 1.63299316, 1.        , 0.        ])
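
To see what the ddof argument changes, here is a NumPy-only sketch. It assumes the same hypothetical input as the question's example and builds the NaN-padded array by hand, with one row per input array (the same layout the DataFrame has), so the reduction runs along axis=0. The names max_len, padded, pop_std, and sample_std are illustrative.

```python
import warnings

import numpy as np

# Hypothetical input mirroring the example in the question.
a = [
    np.array([10, 10, 10, 10]),
    np.array([8, 8, 8, 8, 8]),
    np.array([12, 12, 12]),
]

# One row per input array, padded on the right with NaN.
max_len = max(len(row) for row in a)
padded = np.full((len(a), max_len), np.nan)
for i, row in enumerate(a):
    padded[i, :len(row)] = row

# ddof=0: divide by the number of non-NaN values (population std).
pop_std = np.nanstd(padded, axis=0, ddof=0)

# ddof=1: divide by the number of non-NaN values minus one (sample std).
# The last column holds a single value, so NumPy emits a RuntimeWarning
# and returns NaN there; the warning is silenced for this demonstration.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    sample_std = np.nanstd(padded, axis=0, ddof=1)

print(pop_std)
print(sample_std)
```

Which of the two is appropriate depends on whether the arrays are treated as the whole population or as a sample from it.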

1 Comment

Awesome - thanks! I didn't even think about NaNs being excluded from the mean/std dev.
