17

How can I get the second minimum value from each column? I have this array:

A = [[72 76 44 62 81 31]
     [54 36 82 71 40 45]
     [63 59 84 36 34 51]
     [58 53 59 22 77 64]
     [35 77 60 76 57 44]]

I wish to have output like:

A = [54 53 59 36 40 44]
1
  • second minimum per column? Commented Mar 11, 2020 at 12:24

6 Answers 6

12

Try this, in just one line:

[sorted(i)[1] for i in zip(*A)]

in action:

In [12]: A = [[72, 76, 44, 62, 81, 31], 
    ...:      [54 ,36 ,82 ,71 ,40, 45], 
    ...:      [63 ,59, 84, 36, 34 ,51], 
    ...:      [58, 53, 59, 22, 77 ,64], 
    ...:      [35 ,77, 60, 76, 57, 44]] 

In [18]: [sorted(i)[1] for i in zip(*A)]
Out[18]: [54, 53, 59, 36, 40, 44]

zip(*A) transposes your list of lists, so the columns become rows.
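To see the transposition on its own, here is a minimal sketch. The 2x3 list `rows` is made up for illustration and is not from the question:

```python
# Hypothetical 2x3 input, used only to illustrate the transpose.
rows = [[1, 2, 3],
        [4, 5, 6]]

# zip(*rows) unpacks the rows as separate arguments and pairs up
# their i-th elements, so each resulting tuple is one column.
columns = list(zip(*rows))
print(columns)  # [(1, 4), (2, 5), (3, 6)]
```

Note that zip stops at the shortest row, so this assumes all rows have the same length.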

And if you have duplicate values, for example:

In [19]: A = [[72, 76, 44, 62, 81, 31], 
    ...:  [54 ,36 ,82 ,71 ,40, 45], 
    ...:  [63 ,59, 84, 36, 34 ,51], 
    ...:  [35, 53, 59, 22, 77 ,64],   # 35
    ...:  [35 ,77, 50, 76, 57, 44],]  # 35

If you need to skip both 35s, you can use set():

In [29]: [sorted(list(set(i)))[1] for i in zip(*A)]
Out[29]: [54, 53, 50, 36, 40, 44]
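One caveat with the set() variant: a column whose values are all equal has only one distinct value, so indexing with [1] would raise an IndexError. A possible guard is sketched below; the helper name `second_distinct_min` and its `default` parameter are my own, not part of the answer above:

```python
def second_distinct_min(column, default=None):
    """Return the second smallest distinct value, or `default` if there is none."""
    distinct = sorted(set(column))
    return distinct[1] if len(distinct) > 1 else default


# Same array as in the duplicate example above (two 35s in the first column).
A = [[72, 76, 44, 62, 81, 31],
     [54, 36, 82, 71, 40, 45],
     [63, 59, 84, 36, 34, 51],
     [35, 53, 59, 22, 77, 64],
     [35, 77, 50, 76, 57, 44]]

result = [second_distinct_min(col) for col in zip(*A)]
print(result)                          # [54, 53, 50, 36, 40, 44]
print(second_distinct_min([7, 7, 7]))  # None
```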

8

Operations on numpy arrays should be done with numpy functions, so look at this one:

np.sort(A, axis=0)[1, :]
Out[61]: array([54, 53, 59, 36, 40, 44])
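If a full sort of every column is more work than you need, np.partition may be worth trying. This is a sketch of an alternative, not something benchmarked here. Like np.sort, it counts duplicates as separate values:

```python
import numpy as np

A = np.array([[72, 76, 44, 62, 81, 31],
              [54, 36, 82, 71, 40, 45],
              [63, 59, 84, 36, 34, 51],
              [58, 53, 59, 22, 77, 64],
              [35, 77, 60, 76, 57, 44]])

# kth=1 guarantees that, in each column, index 1 holds the value a full
# sort would put there; the remaining rows are in unspecified order.
second_mins = np.partition(A, 1, axis=0)[1]
print(second_mins)  # [54 53 59 36 40 44]
```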

2 Comments

As far as I know, this has to be the best solution: it keeps everything in numpy. I think the lambda must slow down the heapq.nsmallest solution, so it seems best to keep everything fast in numpy.
Unless you have duplicate minimum values ...
5

You can use heapq.nsmallest:

from heapq import nsmallest

[nsmallest(2, e)[-1] for e in zip(*A)]

Output (for the A in the question):

[54, 53, 59, 36, 40, 44]

I added a simple benchmark to compare the performance of the different solutions already posted:

import heapq
from heapq import nsmallest
from random import randint

import numpy as np
from simple_benchmark import BenchmarkBuilder


b = BenchmarkBuilder()

@b.add_function()
def MehrdadPedramfar(A):
    return [sorted(i)[1] for i in zip(*A)]

@b.add_function()
def NicolasGervais(A):
    return np.sort(A, axis=0)[1, :]

@b.add_function()
def imcrazeegamerr(A):
    rotated = zip(*A[::-1])

    result = []
    for arr in rotated:
        # sort each 1d array from min to max
        arr = sorted(list(arr))
        # add the second minimum value to result array
        result.append(arr[1])

    return result

@b.add_function()
def Daweo(A):
    return np.apply_along_axis(lambda x:heapq.nsmallest(2,x)[-1], 0, A)

@b.add_function()
def kederrac(A):
    return [nsmallest(2, e)[-1] for e in zip(*A)]


@b.add_arguments('Number of rows/cols (A is a square matrix)')
def argument_provider():
    for exp in range(2, 18):
        size = 2**exp
        yield size, [[randint(0, 1000) for _ in range(size)] for _ in range(size)]

r = b.run()
r.plot()

Using zip with the sorted function is the fastest solution for small 2D lists, while using zip with heapq.nsmallest turns out to be the best on big 2D lists.

2 Comments

Just a wild thought: can these results be impacted by the fact that you generated numbers that are not numpy dtypes? Also, won't the built in randint return a list instead of an array?
Is iterating over the rows the only way to do this on an np.matrix? Is there a faster alternative?
1

I hope I understood your question correctly, but either way here's my solution. I'm sure there is a more elegant way of doing this, but it works:

A = [[72,76,44,62,81,31]
 ,[54,36,82,71,40,45]
 ,[63,59,84,36,34,51]
 ,[58,53,59,22,77,64]
 ,[35,77,50,76,57,44]]

# rotate the array 90 degrees so that each column becomes a row
rotated = zip(*A[::-1])

result = []
for arr in rotated:
    # sort each 1d array from min to max
    arr = sorted(list(arr))
    # add the second minimum value to result array
    result.append(arr[1])
print(result)

0

Assuming that A is a numpy.array (if so, please consider adding the numpy tag to your question), you can use apply_along_axis as follows:

import heapq
import numpy as np
A = np.array([[72, 76, 44, 62, 81, 31],
              [54, 36, 82, 71, 40, 45],
              [63, 59, 84, 36, 34, 51],
              [58, 53, 59, 22, 77, 64],
              [35, 77, 60, 76, 57, 44]])
second_mins = np.apply_along_axis(lambda x:heapq.nsmallest(2,x)[-1], 0, A)
print(second_mins)  # [54 53 59 36 40 44]

Note that I used heapq.nsmallest because it does only as much work as needed to find the 2 smallest elements, unlike sorted, which does a complete sort.
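As a quick sanity check of that claim, here is a small sketch on a single column (the first column of A, copied by hand) showing that both approaches agree on the two smallest values:

```python
from heapq import nsmallest

# First column of A from the question.
column = [72, 54, 63, 58, 35]

# nsmallest(2, ...) returns the two smallest values in ascending order,
# so its last element is the second minimum.
two_smallest = nsmallest(2, column)
print(two_smallest)        # [35, 54]
print(sorted(column)[:2])  # [35, 54]
```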

0
>>> import numpy as np
>>> A = np.arange(30).reshape(5,6).tolist()
>>> A
[[0, 1, 2, 3, 4, 5], 
 [6, 7, 8, 9, 10, 11], 
 [12, 13, 14, 15, 16, 17], 
 [18, 19, 20, 21, 22, 23],
 [24, 25, 26, 27, 28, 29]]

Updated: use set to drop duplicates, and transpose the list using zip(*A):

>>> [sorted(set(items))[1] for items in zip(*A)]
[6, 7, 8, 9, 10, 11]

Old: second minimum item in each row:

>>> [sorted(set(items))[1] for items in A]
[1, 7, 13, 19, 25]

1 Comment

Isn't that getting the second item in each row rather than column?
