2

What is the most efficient way to find the mode of the non-zero elements in each row of a multi-dimensional array?

For example:

[
 [0.  0.4 0.6 0.  0.6 0.  0.6 0.  0.  0.6 0.  0.6 0.6 0.6 0.  0.  0.  0.6
     0.  0.  0.  0.  0.  0.  0.  0.  0.5 0.6 0.  0.  0.6 0.6 0.6 0.  0.  0.6
     0.6 0.6 0.  0.5 0.6 0.6 0.  0.  0.6 0.  0.6 0.  0.  0.6],
 [0.  0.1 0.2 0.1 0.  0.1 0.1 0.1 0.  0.1 0.  0.  0.  0.1 0.1 0.  0.1 0.1
 0.  0.1 0.1 0.1 0.  0.1 0.1 0.1 0.  0.1 0.2 0.  0.1 0.1 0.  0.1 0.1 0.1
 0.  0.2 0.1 0.  0.1 0.  0.1 0.1 0.  0.1 0.  0.1 0.  0.1]
]

Taken over all elements, the per-row mode of the above is [0, 0.1], but ideally we want to ignore the zeros and return [0.6, 0.1].

3
  • Possible duplicate of Most efficient way to find mode in numpy array Commented Feb 14, 2019 at 20:45
  • While Nick's solution works, this would be much simpler with pandas instead of numpy. Commented Feb 14, 2019 at 21:29
  • If you're open to using pandas, as @Griffin suggested, I'd be more than happy to write an answer as well... unless Griffin wants to do it first! Commented Feb 14, 2019 at 21:55

3 Answers

0

You can use the same method as in the linked question (mentioned in the comments by @yatu), but filter the array with numpy.nonzero() first.

np.nonzero returns the indices of the non-zero elements, so indexing the array with its result keeps only the non-zero values. If a is a numpy array:

a[np.nonzero(a)]

Example finding the mode (building off code from the other answer):

import numpy as np
from scipy import stats

a = np.array([
    [1, 0, 4, 2, 2, 7],
    [5, 2, 0, 1, 4, 1],
    [3, 3, 2, 0, 1, 1]]
)

def nonzero_mode(arr):
    return stats.mode(arr[np.nonzero(arr)]).mode

# map() is lazy in Python 3, so materialize it before printing
m = list(map(nonzero_mode, a))
print(m)

To print the mode of each row one at a time, you can also loop over the rows:

for row in a:
    print(nonzero_mode(row))
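
Since the exact return type of stats.mode differs between SciPy versions, here is a minimal per-row sketch that avoids it and counts values with collections.Counter instead. The helper name nonzero_mode_per_row and the choice to report an all-zero row as None are my own assumptions, not part of the answer above:

```python
from collections import Counter

import numpy as np

def nonzero_mode_per_row(arr):
    """Most frequent non-zero value of each row; None for an all-zero row."""
    modes = []
    for row in arr:
        # Keep only the non-zero entries of this row
        nonzero_values = row[np.nonzero(row)].tolist()
        if not nonzero_values:
            modes.append(None)  # assumption: report "no mode" as None
            continue
        value, _count = Counter(nonzero_values).most_common(1)[0]
        modes.append(value)
    return modes

a = np.array([
    [1, 0, 4, 2, 2, 7],
    [5, 2, 0, 1, 4, 1],
    [0, 0, 0, 0, 0, 0],
])
modes = nonzero_mode_per_row(a)
print(modes)  # [2, 1, None]
```

This loops in Python, so it is probably not the fastest option for large arrays, but it makes the per-row behaviour explicit.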

3 Comments

Have you tested this?
I get ModeResult(mode=array([1]), count=array([5]))
This applies the mode over the entire array of non-zero values, not each row individually.
0

Adapted from this answer, with the zero element removed before taking the mode:

import numpy as np

def mode(arr):
    """
    Function: mode, to find the mode of the non-zero elements of an array.
    ---
    Parameters:
    @param: arr, nd array, any.
    ---
    @return: the most frequent non-zero value (whatever int/float/etc) of this array.
    """
    vals,counts = np.unique(arr, return_counts=True)
    if 0 in vals:
        z_idx = np.where(vals == 0)
        vals   = np.delete(vals,   z_idx)
        counts = np.delete(counts, z_idx)
    index = np.argmax(counts)
    return vals[index]
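
np.unique flattens its input, so the function above returns one mode for the whole array. To get one mode per row, the same idea can be applied along axis 1 with np.apply_along_axis. A rough sketch, where nonzero_mode_1d is a hypothetical single-row variant of mode() and the data is a shortened version of the rows in the question:

```python
import numpy as np

def nonzero_mode_1d(row):
    # Hypothetical helper: same idea as mode() above, restricted to one row.
    vals, counts = np.unique(row[row != 0], return_counts=True)
    # np.argmax raises ValueError here if the row contains only zeros
    return vals[np.argmax(counts)]

a = np.array([
    [0.0, 0.4, 0.6, 0.0, 0.6, 0.5],
    [0.0, 0.1, 0.2, 0.1, 0.0, 0.1],
])
row_modes = np.apply_along_axis(nonzero_mode_1d, 1, a)
print(row_modes)  # [0.6 0.1]
```

When several values tie, this returns the smallest one, because np.unique returns the values sorted and np.argmax picks the first maximum.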

0

Inspired by this answer, you can replace the zeros with np.nan and call stats.mode with nan_policy='omit':

import numpy as np
from scipy import stats

a = np.array([
    [1, 0, 4, 2, 2, 7],
    [5, 2, 0, 1, 4, 1],
    [3, 3, 2, 0, 1, 1]]
)
# Replace zeros with NaN so that nan_policy='omit' ignores them
nonzero_a = np.where(a == 0, np.nan, a)
mode, count = stats.mode(nonzero_a, axis=1, nan_policy='omit')

This gives the following result.

mode:

masked_array(
  data=[[2.],
        [1.],
        [1.]],
  mask=False,
  fill_value=1e+20)

count:

masked_array(
  data=[[2.],
        [2.],
        [2.]],
  mask=False,
  fill_value=1e+20)

Note that if a row contains only zeros, every value along the counting axis becomes np.nan and the mode of that row is undefined.
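
One way to guard against that case is to find the all-NaN rows before calling stats.mode. A small sketch, assuming the same zero-to-NaN replacement as above (the variable name all_nan_rows is made up for illustration):

```python
import numpy as np

a = np.array([
    [1, 0, 4, 2, 2, 7],
    [0, 0, 0, 0, 0, 0],
    [3, 3, 2, 0, 1, 1],
])
nonzero_a = np.where(a == 0, np.nan, a)

# True for rows that contained only zeros, i.e. rows with no defined mode
all_nan_rows = np.isnan(nonzero_a).all(axis=1)
print(all_nan_rows.tolist())  # [False, True, False]
```

The mask can then be used to skip those rows or to fill their result with a placeholder of your choice.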
