Well, there might be a way:
- create 6 offset arrays (left, right, up, down, front, back)
- combine these arrays into a (R-2, C-2, D-2, 6) 4D array
- sort the 4D array along the last axis (the one with size 6)
Now you have a 4D array where you can pick a sorted vector of neighbours for each cell. After that you can count the different neighbours by:
- applying np.diff along the 4th axis (the sorted one)
- counting the non-zero differences along the 4th axis
This will give you the number of different neighbours minus 1.
The first part is probably rather clear. If a cell has neighbours (1, 2, 4, 2, 2, 3), the neighbour vector is sorted into (1, 2, 2, 2, 3, 4). The difference vector is then (1, 0, 0, 1, 1), and the count of its non-zero elements ((np.diff(v) != 0).sum()) gives 3. So there are 4 unique neighbours.
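The same toy example in code, as a sanity check on a single vector:

import numpy as np

v = np.sort(np.array([1, 2, 4, 2, 2, 3]))  # array([1, 2, 2, 2, 3, 4])
print(np.diff(v))                          # [1 0 0 1 1]
print((np.diff(v) != 0).sum() + 1)         # 4 unique neighbours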
Of course, this method does not take the edges into account. You can solve that by padding the initial array by one cell in each direction with numpy.pad in mode 'reflect'. (That mode is actually the only one that is guaranteed not to introduce any new values into the neighbourhood; try it with a two-dimensional array to see why.)
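As a quick illustration with a toy two-dimensional array (note how the padded border only repeats values that are already inside):

import numpy as np

a = np.arange(9).reshape(3, 3)
print(np.pad(a, 1, mode='reflect'))
# [[4 3 4 5 4]
#  [1 0 1 2 1]
#  [4 3 4 5 4]
#  [7 6 7 8 7]
#  [4 3 4 5 4]]

For the corner cell 0, the padded neighbours above and to its left are 3 and 1, exact copies of its real neighbours below and to its right, so no new value enters its neighbourhood.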
Putting it all together:
import numpy as np
# create some fictional data
dat = np.random.randint(1, 8, (6, 7, 8))
# pad the data by 1
datp = np.pad(dat, 1, mode='reflect')
# create the neighbouring 4D array
neigh = np.concatenate((
    datp[2:, 1:-1, 1:-1, None], datp[:-2, 1:-1, 1:-1, None],
    datp[1:-1, 2:, 1:-1, None], datp[1:-1, :-2, 1:-1, None],
    datp[1:-1, 1:-1, 2:, None], datp[1:-1, 1:-1, :-2, None]), axis=3)
# sort the 4D array
neigh.sort(axis=3)
# calculate the number of unique neighbours per cell
usamples = (np.diff(neigh, axis=3) != 0).sum(axis=3) + 1
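A quick spot check of the slicing (not part of the solution): pick one interior cell and compare against the set of its six neighbours gathered by hand:

# spot check one interior cell against the plain definition
r, c, d = 2, 3, 4
manual = {dat[r-1, c, d], dat[r+1, c, d],
          dat[r, c-1, d], dat[r, c+1, d],
          dat[r, c, d-1], dat[r, c, d+1]}
assert usamples[r, c, d] == len(manual)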
The solution above is quite universal; it works with anything sortable. However, it consumes a lot of memory (six copies of the array) and is not a high-performance solution. If we are satisfied with a solution that only works in this special case (the values are very small integers), we can do some bit magic.
- create an array where every value v is represented as a bit mask 1 << v (1 = 00000010, 2 = 00000100, 3 = 00001000, etc.)
- OR the neighbouring arrays together
- count the number of bits in the ORed result by using a look-up table
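The idea on plain Python ints, with the same toy neighbours (1, 2, 4, 2, 2, 3) as before:

m = (1 << 1) | (1 << 2) | (1 << 4) | (1 << 2) | (1 << 2) | (1 << 3)
print(bin(m))             # 0b11110 -> bits 1 to 4 set
print(bin(m).count('1'))  # 4 unique neighbours

The same thing vectorized over the whole array: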
import numpy as np
# create a "number of ones" lookup table
no_ones = np.array([bin(i).count("1") for i in range(256)], dtype='uint8')
# create some fictional data
dat = np.random.randint(1, 8, (6, 7, 8))
# create a bit mask of the cells
datb = 1 << dat.astype('uint8')
# pad the data by 1
datb = np.pad(datb, 1, mode='reflect')
# OR the padded data together
ored = (datb[2:, 1:-1, 1:-1] |
        datb[:-2, 1:-1, 1:-1] |
        datb[1:-1, 2:, 1:-1] |
        datb[1:-1, :-2, 1:-1] |
        datb[1:-1, 1:-1, 2:] |
        datb[1:-1, 1:-1, :-2])
# get the number of unique neighbour values from the LUT
usamples = no_ones[ored]
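Both versions should of course agree when run on the same dat; a quick cross-check, recomputing the sort-based answer (np.stack used here for brevity) and comparing:

# cross-check against the sort-based version on the same data
datp = np.pad(dat, 1, mode='reflect')
neigh = np.stack((datp[2:, 1:-1, 1:-1], datp[:-2, 1:-1, 1:-1],
                  datp[1:-1, 2:, 1:-1], datp[1:-1, :-2, 1:-1],
                  datp[1:-1, 1:-1, 2:], datp[1:-1, 1:-1, :-2]), axis=3)
neigh.sort(axis=3)
assert np.array_equal((np.diff(neigh, axis=3) != 0).sum(axis=3) + 1, usamples)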
The performance difference is rather significant. The first version takes 2.57 s and the second version 283 ms on my machine with a 384 x 384 x 100 table (excluding creating the random data). This translates into 174 ns and 19 ns per cell, respectively.
This solution is, however, limited to cases where there is a reasonably small number of different (and known) values. If the number of different possible values grows above 64, the bit magic loses its charm. (Also, at around 20 different values the look-up part has to be split into more than one operation due to the memory consumption of the LUT. The LUT should fit into the CPU cache, otherwise it becomes slow.)
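To make the splitting concrete: once the masks no longer fit in 16 bits, a full LUT would need 2**32 entries, but a 16-bit table (64 k entries) still fits in cache, so the bit count of a wider mask can be assembled from two half-word lookups. A sketch, assuming ored is now a uint32 array of masks:

# hypothetical 32-bit masks: count bits via two 16-bit LUT lookups
no_ones16 = np.array([bin(i).count('1') for i in range(1 << 16)],
                     dtype='uint8')
usamples = no_ones16[ored & 0xFFFF] + no_ones16[ored >> 16]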
On the other hand, expanding the solution to use the full 26-neighbourhood is simple and quite fast.
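For instance, a sketch of the bit-mask version extended to the 26-neighbourhood, reusing dat, the padded datb and no_ones from the snippet above; the neighbour offsets are simply all 3 x 3 x 3 positions except the centre:

from itertools import product

# OR the bit masks of all 26 neighbours together
ored = np.zeros_like(dat, dtype='uint8')
for di, dj, dk in product((0, 1, 2), repeat=3):
    if (di, dj, dk) == (1, 1, 1):
        continue  # the centre is the cell itself, not a neighbour
    ored |= datb[di:di + dat.shape[0],
                 dj:dj + dat.shape[1],
                 dk:dk + dat.shape[2]]
usamples = no_ones[ored]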