2

I need to find the row-wise minimum in an array where each minimum must stem from a unique column.

np.min(arr, axis=1) provides the row-wise minimum but might contain the same column several times.

For example, given:

a = np.array([
    [4, 5, 6],
    [1, 2, 3],
    [7, 8, 9]
])

np.min(a, axis=1) would output: [4, 1, 7]

All the returned minima stem from the first column, but with the constraint that each column can be used only once, the desired output would be [5, 1, 9] as the most optimal assignment:

1 is the lowest value in this example and is therefore assigned to the first column. 5 is then the best minimum that can be assigned to the second column (as the second row was already used).

The only idea I have right now is to implement this using some kind of recursion (which is most likely very time-intensive, right?).

13
  • 1
    Why would the desired output not be [4, 2, 9], or [6, 1, 8], or [5, 3, 7]? Commented Aug 9, 2021 at 22:46
  • Those would not be the most optimal assignments. E.g. 1 is the lowest value in this example and is therefore assigned to the first column (instead of 2 or 3). 5 is then the best minimum that can be assigned to the second column (as the second row was already used). Commented Aug 9, 2021 at 22:55
  • How large (number of elements) are your arrays going to be? Commented Aug 9, 2021 at 23:26
  • On average around 400x400, rarely up to 2000x2000 Commented Aug 9, 2021 at 23:38
  • 1
    I'm still not entirely clear about the problem statement. I know that (1) you want an array containing the minimum value for each row, except each column can only be used once; (2) the goal is to find the smallest solution; (3) you describe picking the first number by picking the smallest in the grid ("First pick the 1 in the first row, since 1 is the lowest value in the grid [...]"). But what about the input [[4, 5, 6],[1, 2, 3],[7, 101, 100]]? If you always pick the smallest number in the grid, you end up with the solution [5,1,100], but the answer [2,6,7] has a smaller sum. Commented Aug 10, 2021 at 1:23

2 Answers 2

1

It appears that what you're looking for are N minimum values where the row and column index for each value is unique (assuming an NxN matrix). If we tag each value in the matrix with it's initial coordinates, we can rearrange them without losing the ability to tell where they came from. I'm not sure there's a slick way to sort with a custom key in numpy, so here's a vanilla Python solution with no recursion or backtracking required:

def idx_matrix(matrix):
    # return 1-D array of matrix values in (row_idx, col_idx, val) format
    return [(r, c, val) for r, row in enumerate(matrix)
                        for c, val in enumerate(row)]

def find_minima(indexed_vals, limit=0):
    # return array of indexed matrix values whose row and col indexes are unique
    minima = []
    rows = set()
    cols = set()
    for row, col, val in indexed_vals:
        if row not in rows and col not in cols:
            minima.append((row, col, val))
            if limit and len(minima) == limit:
                # optional optimization if you want to break off early
                # after you've found a value for every row
                break
            rows.add(row)
            cols.add(col)
    return minima

def sort_by_val(indexed_vals):
    # return indexed_vals sorted by original matrix value
    return sorted(indexed_vals, key=lambda x: x[2])

def sort_by_row(indexed_vals):
    # return indexed_vals sorted by row index
    return sorted(indexed_vals, key=lambda x: x[0])

def strip_indices(indexed_vals):
    # return a 1-D array with row and col index removed
    return [v[2] for v in indexed_vals]

def find_minima_by_row(matrix):
    # put it all together
    indexed = idx_matrix(matrix)
    indexed = sort_by_val(indexed)
    minima = find_minima(indexed)
    minima = sort_by_row(minima)
    return strip_indices(minima)

matrix = [[4, 5, 6],
          [1, 2, 3],
          [7, 8, 9]]
results = find_minima_by_row(matrix)
print(f'{results=}')

matrix = [[20, 17,  5, 13, 19],
          [11, 22,  8,  4,  9],
          [ 0, 10,  2, 16, 23],
          [ 1, 24, 21, 15, 14],
          [ 3, 12,  6,  7, 18]]
results = find_minima_by_row(matrix)
print(f'{results=}')
results=[5, 1, 9]
results=[5, 4, 0, 14, 12]

This ran in ~4 seconds with a 2000x2000 matrix on my workstation. You could do the sorts in place to be a little more space efficient.

I also see no reason why this wouldn't work if there are repeated values in the input.

Sign up to request clarification or add additional context in comments.

Comments

0

An easier way (may not be faster) to do the above using numpy and if you know the upper limit of the array values:

# matrix is a numpy 2-D array, k is some arbirtary large value
mapping = [None]*matrix.shape[0] # row-column mapping for minimum value
tmp = matrix.copy()
for i in range(matrix.shape[0]):
        u, v = divmod(tmp.argmin(), tmp.shape[1])
        mapping[u]=v
        tmp[u,:]=k
        tmp[:,v]=k

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.