How to compose permutations in numpy efficiently?

Question

I have bijections/permutations on a 3d numpy array given as two 3 tuples of numpy arrays. For example my 3d numpy array could look like this

arr = np.array([[[3, 2, 1], [5, 0, 5], [2, 0, 1]],
                [[3, 4, 5], [4, 2, 0], [0, 1, 1]],
                [[2, 0, 5], [1, 5, 1], [0, 5, 1]],
                [[4, 3, 0], [1, 3, 3], [3, 3, 3]],
                [[2, 4, 0], [2, 1, 0], [4, 4, 4]],
                [[4, 3, 2], [2, 4, 2], [5, 5, 5]]])

And a permutation on it like this:

a, b = ((np.array([5, 2, 2, 2, 1, 1, 1, 4, 4, 4, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0]),
         np.array([0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 2, 1, 0, 0, 0, 1, 2, 2, 2, 1]),
         np.array([2, 2, 1, 0, 0, 0, 0, 0, 1, 2, 2, 2, 0, 1, 2, 2, 2, 1, 0, 0])),
        (np.array([4, 5, 5, 5, 2, 2, 2, 1, 1, 1, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0]),
         np.array([0, 2, 1, 0, 0, 0, 0, 0, 1, 2, 0, 0, 2, 1, 0, 0, 0, 1, 2, 2]),
         np.array([2, 2, 2, 2, 2, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 2, 2, 2, 1])))

I can apply the map by doing arr[a] = arr[b].

My question is: is there an efficient way to compose two of those bijections? For example I want a function compose for which the two following statements are equivalent

c, d = compose((a, b), (a, b))
arr[c] = arr[d]

and

arr[a] = arr[b]
arr[a] = arr[b]

I assume that a and b are uniquely defined, i.e., the coordinate 3-tuples in a are unique. — Quang Hoang
– Quang Hoang, Commented Nov 10, 2021 at 16:00
That being said, I'll be surprised if there is a simple answer, since you are overriding the original array after each iteration, so it's not just coordinate (index) dependent operations. — Quang Hoang
– Quang Hoang, Commented Nov 10, 2021 at 16:02

Mad Physicist · Accepted Answer · 2021-11-11 15:34:20Z

2

The idea here is very much like any other resampler: for each destination location, find the corresponding source location. Let's start with a simplified case. Say we have a 1-D array and corresponding indices:

arr1 = np.arange(6)
a1 = np.array([3, 4])
b1 = np.array([2, 3])

It's much easier to visualize what happens when you make the assignments:

arr[a1] = arr[b1] # [0, 1, 2, 2, 3, 5]
arr[a1] = arr[b1] # [0, 1, 2, 2, 2, 5]

For a destination index that is the entire array (e.g., np.arange(arr.size)), the source index is just the application of the assignment to the destination.

There are a couple of ways of generalizing this to a 3D array. One way would be to make a (*arr.shapes, arr.ndim) array containing a meshgrid of all the indices. Another is simply to convert a and b into raveled (linear) indices into the raveled version of arr. I recommend going with the latter.

a_r = np.ravel_multi_index(a, arr.shape)
b_r = np.ravel_multi_index(b, arr.shape)

You can construct a source and destination index pair of the same size as arr.ravel(), and set the sources and destinations in it:

dest = np.arange(arr.size)
src = np.arange(arr.size)
src[a] = src[b]

I wrote out the last assignment "in full", although the first time you can just do src[a] = b. You can iterate the last assignment as many times as you want. In your particular example, do it a second time:

src[a] = src[b]

If you want, you can trim off the elements that are not modified by the assignment:

mask = dest != src
dest = dest[mask]
src = src[mask]

Finally, you can unravel the index back to the original shape:

c = np.unravel_index(dest, arr.shape)
d = np.unravel_index(src, arr.shape)

If you want to write a function that accepts an arbitrary number of input indices, it might look something like this:

def compose(*args, shape=None, sparse=True):
    if shape is None:
        get_max = lambda tup: np.array([arr.max() for arr in tup])
        for a, b in args:
            if shape is None:
                shape = np.maximum(get_max(a), get_max(b))
            else:
                shape = np.maximum(shape, get_max(a))
                shape = np.maximum(shape, get_max(b))
        shape = tuple(shape + 1)

    size = np.prod(shape)
    dest = np.arange(size)
    src = np.arange(size)
    for a, b in args:
        a = np.ravel_multi_index(a, shape)
        b = np.ravel_multi_index(b, shape)
        src[a] = src[b]

    if sparse:
        mask = dest != src
        dest = dest[mask]
        src = src[mask]
    return np.unravel_index(dest, shape), np.unravel_index(src, shape)

answered Nov 11, 2021 at 15:34

Mad Physicist

116k29 gold badges202 silver badges292 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Quang Hoang Over a year ago

While I believe your solution is correct, I somehow still have a tiny doubt in modifying the data in-place. I mean, mathematically, moving an entry to another is just a linear mapping, and composition of those should be again one of those. I'm just not in the right mind to verify all that. +1

Mad Physicist Over a year ago

@QuangHoang. I started with a solution that split the arrays into portions where source and destination overlapped, tried iterating through that, only to realize that it was unnecessary. I finally settled on this version after inspecting a lot of test cases with just np.arange(...). It seems to work every time, both conceptually and in practice. I totally feel the part about not being able to verify. I don't really think I can either :)

Collectives™ on Stack Overflow

How to compose permutations in numpy efficiently?

1 Answer 1

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related