Find minimum for every row in array with unique columns

Question

I need to find the row-wise minimum in an array where each minimum must stem from a unique column.

np.min(arr, axis=1) provides the row-wise minimum but might contain the same column several times.

For example, given:

a = np.array([
    [4, 5, 6],
    [1, 2, 3],
    [7, 8, 9]
])

np.min(a, axis=1) would output: [4, 1, 7]

All the returned minima stem from the first column, but with the constraint that each column can be used only once, the desired output would be [5, 1, 9] as the most optimal assignment:

1 is the lowest value in this example and is therefore assigned to the first column. 5 is then the best minimum that can be assigned to the second column (as the second row was already used).

The only idea I have right now is to implement this using some kind of recursion (which is most likely very time-intensive, right?).

Why would the desired output not be [4, 2, 9], or [6, 1, 8], or [5, 3, 7]? — mkrieger1
– mkrieger1, Commented Aug 9, 2021 at 22:46
Those would not be the most optimal assignments. E.g. 1 is the lowest value in this example and is therefore assigned to the first column (instead of 2 or 3). 5 is then the best minimum that can be assigned to the second column (as the second row was already used). — Grotexo
– Grotexo, Commented Aug 9, 2021 at 22:55
I'm still not entirely clear about the problem statement. I know that (1) you want an array containing the minimum value for each row, except each column can only be used once; (2) the goal is to find the smallest solution; (3) you describe picking the first number by picking the smallest in the grid ("First pick the 1 in the first row, since 1 is the lowest value in the grid [...]"). But what about the input [[4, 5, 6],[1, 2, 3],[7, 101, 100]]? If you always pick the smallest number in the grid, you end up with the solution [5,1,100], but the answer [2,6,7] has a smaller sum. — Oli
– Oli, Commented Aug 10, 2021 at 1:23

Woodford · Accepted Answer · 2021-08-10 17:19:11Z

It appears that what you're looking for are N minimum values where the row and column index for each value is unique (assuming an NxN matrix). If we tag each value in the matrix with it's initial coordinates, we can rearrange them without losing the ability to tell where they came from. I'm not sure there's a slick way to sort with a custom key in numpy, so here's a vanilla Python solution with no recursion or backtracking required:

def idx_matrix(matrix):
    # return 1-D array of matrix values in (row_idx, col_idx, val) format
    return [(r, c, val) for r, row in enumerate(matrix)
                        for c, val in enumerate(row)]

def find_minima(indexed_vals, limit=0):
    # return array of indexed matrix values whose row and col indexes are unique
    minima = []
    rows = set()
    cols = set()
    for row, col, val in indexed_vals:
        if row not in rows and col not in cols:
            minima.append((row, col, val))
            if limit and len(minima) == limit:
                # optional optimization if you want to break off early
                # after you've found a value for every row
                break
            rows.add(row)
            cols.add(col)
    return minima

def sort_by_val(indexed_vals):
    # return indexed_vals sorted by original matrix value
    return sorted(indexed_vals, key=lambda x: x[2])

def sort_by_row(indexed_vals):
    # return indexed_vals sorted by row index
    return sorted(indexed_vals, key=lambda x: x[0])

def strip_indices(indexed_vals):
    # return a 1-D array with row and col index removed
    return [v[2] for v in indexed_vals]

def find_minima_by_row(matrix):
    # put it all together
    indexed = idx_matrix(matrix)
    indexed = sort_by_val(indexed)
    minima = find_minima(indexed)
    minima = sort_by_row(minima)
    return strip_indices(minima)

matrix = [[4, 5, 6],
          [1, 2, 3],
          [7, 8, 9]]
results = find_minima_by_row(matrix)
print(f'{results=}')

matrix = [[20, 17,  5, 13, 19],
          [11, 22,  8,  4,  9],
          [ 0, 10,  2, 16, 23],
          [ 1, 24, 21, 15, 14],
          [ 3, 12,  6,  7, 18]]
results = find_minima_by_row(matrix)
print(f'{results=}')

results=[5, 1, 9]
results=[5, 4, 0, 14, 12]

This ran in ~4 seconds with a 2000x2000 matrix on my workstation. You could do the sorts in place to be a little more space efficient.

I also see no reason why this wouldn't work if there are repeated values in the input.

Tashi · Accepted Answer · 2021-10-20 14:49:41Z

0

An easier way (may not be faster) to do the above using numpy and if you know the upper limit of the array values:

# matrix is a numpy 2-D array, k is some arbirtary large value
mapping = [None]*matrix.shape[0] # row-column mapping for minimum value
tmp = matrix.copy()
for i in range(matrix.shape[0]):
        u, v = divmod(tmp.argmin(), tmp.shape[1])
        mapping[u]=v
        tmp[u,:]=k
        tmp[:,v]=k

answered Oct 20, 2021 at 14:49

Tashi

1181 silver badge6 bronze badges

Collectives™ on Stack Overflow

Find minimum for every row in array with unique columns

2 Answers 2

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related