0

I need to calculate a most frequent element in a matrix based on the neighbor values and itself. I found a generic_filter function which I used to calculate what I wanted. So here is how I can do this for a 2d array

arr = np.array([
    [1, 2, 4],
    [5, 6, 7],
    [2, 4, 4]
])

def most_frequent(arr):
    def most_frequent(val):
        return Counter(val).most_common(1)[0][0]

    footprint = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    return ndimage.generic_filter(arr, most_frequent, footprint=footprint, mode='constant')

print most_frequent(arr)

This returns me

[[0 0 0]
 [0 4 0]
 [0 0 0]]

Ignore the elements on the edge. As you see the middle element is 4 because this is a most frequent element among the neighbors and the value.


The big problem is that I need to do the same thing for a 3d matrix. So for a matrix like this

arr = np.array([
    [[1, 1], [2, 2], [4, 4]],
    [[5, 5], [6, 6], [7, 7]],
    [[2, 2], [4, 4], [4, 4]]
])

I expect to get [0, 0] everywhere and [4, 4] in the middle. This fails with RuntimeError('filter footprint array has incorrect shape.'). The worse thing that I have doubts that I can use generic_filter here because the docs say:

cval : scalar, optional Value to fill past edges of input if mode is ‘constant’.

So how can I solve my problem?

2
  • Does the 3D input array contain integers only? Commented Apr 7, 2016 at 9:09
  • One more question - You intend to run such a filter in a sliding manner across the entire 3D array that could be of any length and not just (3,3,2), right? Commented Apr 7, 2016 at 9:18

2 Answers 2

1

Here a fully vectored solution.

First make flattened neighborhoods:

(n,m,_)=M.shape 
(sn,sm,s2)=M.strides 
newshape=(n-2,m-2,9,2)
newstrides=(sn,sm,2*s2,s2)
neighborhoods=np.lib.stride_tricks.as_strided(M,newshape,newstrides)
"""
array([[[[1, 1],
         [2, 2],
         [4, 1],
         [1, 1],
         [5, 5],
         [6, 6],
         [7, 7],
         [2, 3],
         [2, 2]],

        [[2, 2],
         [4, 1],
         [1, 1],
         [5, 5],
         [6, 6],
         [7, 7],
         [2, 3],
         [2, 2],
         [4, 1]]]])
     """

Then you have to pack the two components to use np.unique which work with 1D arrays. assuming M.dtype is int32, you can do that by view:

packed_neighborhoods=np.ascontiguousarray(neighborhoods).view(int64)
In [5]: packed_neighborhoods.shape
Out[5]: (1, 2, 9, 1)

Now we define a function that take a 1D array and find the indice of the most frequent , based on np.unique:

def mostfreq(arr):
     _,index,counts=unique(arr, return_index=True, return_counts=True)
     return index[counts.argmax()]

Apply it on the good axis:

ind2=apply_along_axis(mostfreq,2,packed_neighborhoods).squeeze()

And there is the result, including other indices.

ind0,ind1=indices(neighborhoods.shape[:2])
print(neighborhoods[ind0,ind1,ind2])  
"""
[[[1 1]
  [4 1]]]
"""

But your solution has same performance at the moment ;(.

Sign up to request clarification or add additional context in comments.

6 Comments

wow, it looks like it work. I will take some time to investigate. How can you came up with the idea of changing a footprint in this way?
Because 'filter footprint array has incorrect shape.' . you have to add one dimension to fit the data ndim. Edited.
It looks like there is a problem there. For example for this array arr = np.array([ [[1, 1], [2, 2], [4, 1], [1, 1]], [[5, 5], [6, 6], [7, 7], [2, 3]], [[2, 2], [4, 1], [4, 4], [2, 3]] ]) one of the results is [2 1] but such element does not even exist in the array.
Yes , it's how I understand the problem . I thank you will work plane by plane, and each coordinate is for the corresponding plane.Apparently, you want to compare the vectors in the last dimension ?
I am looking for something that behaves like my solution. stackoverflow.com/a/36471245/1090562 But hopefully without these loops.
|
0

One way I have found how to achieve this is by doing something like

def most_frequent(M):
    x, y, _ = arr.shape
    res = np.zeros((x - 2, y - 2, 2))
    for i in xrange(1, x - 1):
        for j in xrange(1, y - 1):
            neighbors = [M[i - 1, j - 1], M[i - 1, j], M[i - 1, j + 1], M[i, j - 1], M[i, j], M[i, j + 1], M[i + 1, j - 1], M[i + 1, j], M[i + 1, j + 1]]
            res[i - 1, j - 1] = Counter([tuple(_) for _ in neighbors]).most_common(1)[0][0]

    return res

Still looking for a better solution (the one that does not involve my 2 loops).

1 Comment

Is M same as the input 3D array?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.