I have an array [1,2,3,4,5,6,10,100,200]. I want to remove the two largest numbers (the outliers) from the array. The result should be [1, 2, 3, 4, 5, 6, 10].

I tried this, but it's not working. Can anyone help me, please?

import numpy

arr = [1,2,3,4,5,6,10,100,200]

elements = numpy.array(arr)

mean = numpy.mean(elements, axis=0)
sd = numpy.std(elements, axis=0)

# keep only the values within 2 standard deviations of the mean
final_list = [x for x in arr if (x > mean - 2 * sd)]
final_list = [x for x in final_list if (x < mean + 2 * sd)]
print(final_list)
  • Why are you using the mean/standard deviation if you simply want to filter based on the second maximum? Commented Jan 15, 2024 at 10:46

1 Answer

If you want to remove all items greater than or equal to the N-th largest value (here, the second largest), use np.partition and boolean indexing:

import numpy as np

elements = np.array([1,2,3,4,5,6,10,100,200])

N = 2
out = elements[elements < np.partition(elements, -N)[-N]]
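To make the one-liner easier to follow, here is a minimal sketch that splits it into its two steps. The names `threshold` and `mask` are only illustrative and are not part of the original snippet.

```python
import numpy as np

elements = np.array([1, 2, 3, 4, 5, 6, 10, 100, 200])
N = 2

# np.partition puts the value that would sit at index -N in sorted order
# at that position, so indexing [-N] yields the N-th largest value.
threshold = np.partition(elements, -N)[-N]

# Boolean mask: True for every item strictly below the threshold.
mask = elements < threshold
out = elements[mask]

print(threshold)
print(out)
```

Because the comparison is strict, every item equal to the threshold is dropped as well, which matters once the data contains ties.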

If you want to remove exactly the N largest items, even when a tie means more than N items are at or above the threshold, use argsort + argpartition instead:

N = 2
out = elements[np.argsort(np.argpartition(elements, -N))<elements.shape[0]-N]
# variant
# out = elements[np.argsort(np.argpartition(-elements, N))>=N]

Output:

array([ 1,  2,  3,  4,  5,  6, 10])
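The argsort + argpartition expression is dense, so the following sketch unpacks it under the same input. The intermediate names `order`, `position` and `keep` are assumptions introduced for illustration only.

```python
import numpy as np

elements = np.array([1, 2, 3, 4, 5, 6, 10, 100, 200])
N = 2

# Indices that partition the array: the last N entries of `order`
# point at the N largest items (in no guaranteed order).
order = np.argpartition(elements, -N)

# argsort of a permutation gives its inverse: position[i] is the slot
# that original index i occupies inside `order`.
position = np.argsort(order)

# Keep every item whose slot lies before the last N slots.
keep = position < elements.shape[0] - N
out = elements[keep]

print(out)
```

Since the mask is applied to the original array, the surviving items keep their original order, and exactly N items are removed regardless of duplicates.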

Difference in behavior when there is a tie:

# elements
array([  1,   2,   3,   4,   5,   6, 100, 100, 200,  10])

# elements[elements < np.partition(elements, -N)[-N]]
array([ 1,  2,  3,  4,  5,  6, 10])

# elements[np.argsort(np.argpartition(elements, -N))<elements.shape[0]-N]
array([  1,   2,   3,   4,   5,   6, 100,  10])
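If both behaviors are needed in the same code base, one option is to wrap them in small helpers. This is only a sketch: the function names `drop_ge_nth_largest` and `drop_n_largest` are hypothetical, and both assume 1 <= n <= len(values).

```python
import numpy as np


def drop_ge_nth_largest(values, n):
    """Drop every item >= the n-th largest value (all tied items go too)."""
    values = np.asarray(values)
    threshold = np.partition(values, -n)[-n]
    return values[values < threshold]


def drop_n_largest(values, n):
    """Drop exactly n items, the n largest, keeping the original order."""
    values = np.asarray(values)
    position = np.argsort(np.argpartition(values, -n))
    return values[position < values.shape[0] - n]


tied = np.array([1, 2, 3, 4, 5, 6, 100, 100, 200, 10])
print(drop_ge_nth_largest(tied, 2))  # 200 and both 100s are dropped
print(drop_n_largest(tied, 2))       # 200 and only one 100 are dropped
```

The first helper may remove more than n items when the n-th largest value is duplicated, while the second always removes exactly n.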
2 Comments

I have 2 integers. How can I concatenate them with a space, e.g. 65 3?
@babe_engineer I don't understand your question
