112

Is there a simple way of replacing all negative values in an array with 0?

I'm having a complete block on how to do it using a NumPy array.

E.g.

a = array([1, 2, 3, -4, 5])

I need to return

[1, 2, 3, 0, 5]

a < 0 gives:

[False, False, False, True, False]

This is where I'm stuck - how to use this array to modify the original array.

6 Answers

163

You are halfway there. Try:

In [4]: a[a < 0] = 0

In [5]: a
Out[5]: array([1, 2, 3, 0, 5])
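
A minimal sketch of what the mask assignment does, assuming NumPy is imported as `np` (the session above uses a bare `array`, presumably from a star import). The assignment changes `a` in place, so a copy is taken first in case the original values are still needed; the names `mask` and `original` are only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, -4, 5])
original = a.copy()        # keep the untouched values for comparison

mask = a < 0               # boolean array, True where the element is negative
a[mask] = 0                # assign 0 at every True position, in place

print(mask)
print(a)
print(original)
```

Writing `a[a < 0] = 0` is the same thing with the mask built inline.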

1 Comment

A benchmark and a faster solution (according to my benchmark) can be seen in my answer.
101

Try numpy.clip:

>>> import numpy
>>> a = numpy.arange(-10, 10)
>>> a
array([-10,  -9,  -8,  -7,  -6,  -5,  -4,  -3,  -2,  -1,   0,   1,   2,
         3,   4,   5,   6,   7,   8,   9])
>>> a.clip(0, 10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

You can clip only the bottom half with clip(0).

>>> a = numpy.array([1, 2, 3, -4, 5])
>>> a.clip(0)
array([1, 2, 3, 0, 5])

You can clip only the top half with clip(max=n), which is much better than my previous suggestion of passing NaN as the first parameter and using out to coerce the type:

>>> a.clip(max=2)
array([ 1,  2,  2, -4,  2])

Another interesting approach is to use where:

>>> numpy.where(a <= 2, a, 2)
array([ 1,  2,  2, -4,  2])

Finally, consider aix's answer. I prefer clip for simple operations because it's self-documenting, but his answer is preferable for more complex operations.
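
As a small sketch of the clip method, assuming NumPy is imported as `np`: by default clip returns a new array, and passing `out=a` writes the result back into the same array, which may be handy when an in-place update is wanted. The variable names are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3, -4, 5])

lower_only = a.clip(min=0)   # new array, negatives raised to 0
upper_only = a.clip(max=2)   # new array, values above 2 lowered to 2

# Passing out=a stores the result in a instead of allocating a new array.
a.clip(min=0, out=a)

print(lower_only)
print(upper_only)
print(a)
```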

6 Comments

a.clip(0) would suffice since the OP just wants to replace negative values. a.clip(0, 10) would exclude anything above 10.
@Hiett - I just tried it and clip will take one. First is assumed min.
Must be a version issue with numpy. Here's my output: (Pdb) np.clip(w,0) *** TypeError: clip() takes at least 3 arguments (2 given) - whereas: (Pdb) np.clip(w,0,1e6) array([[ 0. , 0.605]])
@Hiett, what version of numpy? Did you try the clip method of a? The built-in function numpy.clip gives me the same error, but the method does not.
yeh if you call it that way round it seems to work, e.g. p w.clip(0) array([[ 0. , 0.605]]) - how queer?
10

Another minimalist Python solution without using numpy:

[0 if i < 0 else i for i in a]

No need to define any extra functions.

a = [1, 2, 3, -4, -5.23, 6]
[0 if i < 0 else i for i in a]

yields:

[1, 2, 3, 0, 0, 6]
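
Where the condition sits in the comprehension matters. A short sketch of the two placements, plus a variant using the built-in max (the variable names are made up for this example):

```python
a = [1, 2, 3, -4, -5.23, 6]

# Conditional expression before `for`: every element is kept, negatives become 0.
replaced = [0 if i < 0 else i for i in a]

# `if` after `for`: this is a filter, so only elements passing the test are kept.
negatives_only = [i for i in a if i < 0]

# The built-in max performs the same replacement for each element.
replaced_with_max = [max(i, 0) for i in a]

print(replaced)
print(negatives_only)
print(replaced_with_max)
```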

4 Comments

that is nice - i was wondering what the syntax would be to put the if statement inside the list comprehension - i was going wrong by sticking it after the for loop and only then getting two values back, e.g. [0, 0] for your example list
I did the same when I originally learned about list comprehension and was trying out different things to test my understanding - it seemed more intuitive to put it after the for loop for me too. Now, though, this does :) Putting it before the for applies it to every element of the list, putting it after, means only if the condition is met does it go into the resulting list.
@Hiett It's just using the ternary operator (i < 0 ? 0 : i in C) inside a list comprehension. Put brackets in to make it clearer [(0 if i < 0 else i) for i in a]. Putting the if after is using the filter part of the list expression construct. [(i) for i in a if i < 0] will only return a list of items that are less than zero.
NumPy is powerful because it does a lot of the computation in compiled C code and is thus faster. Comparing this method to the others, I find it is almost 10 times slower. So while intuitive and easy to read, this is definitely not for computationally intensive work.
4

And yet another possibility:

In [2]: a = array([1, 2, 3, -4, 5])

In [3]: where(a<0, 0, a)
Out[3]: array([1, 2, 3, 0, 5])
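
A brief sketch, assuming NumPy is imported as `np`, showing that where builds a new array and leaves the input untouched; it works the same way on a 2-D array of floats. The array contents are invented for the example.

```python
import numpy as np

a = np.array([[1.5, -2.0], [-0.5, 4.0]])

# Where the condition is True take 0, otherwise take the element of a.
result = np.where(a < 0, 0, a)

print(result)
print(a)       # the input array still holds its negative values
```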


3

Benchmark using numpy:

%%timeit
a = np.random.random(1000) - 0.5
b = np.maximum(a,0)
# 18.2 µs ± 204 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%%timeit
a = np.random.random(1000) - 0.5
a[a < 0] = 0
# 19.6 µs ± 304 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%%timeit
a = np.random.random(1000) - 0.5
b = np.where(a<0, 0, a)
# 21.1 µs ± 134 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit
a = np.random.random(1000) - 0.5
b = a.clip(0)
# 37.7 µs ± 124 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Surprisingly, np.maximum beat @NPE's answer.
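
The IPython cells above time the array creation together with the operation. A rough sketch of how the same four approaches could be compared outside IPython with the standard timeit module, creating the data once. The function names are hypothetical, the mask version pays for an extra copy so that each run starts from the same data, and timings depend on the machine and NumPy version, so none are quoted here.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
base = rng.random(1000) - 0.5      # roughly half of the values are negative


def with_maximum(x):
    return np.maximum(x, 0)


def with_mask(x):
    y = x.copy()                   # copy so the shared input is never modified
    y[y < 0] = 0
    return y


def with_where(x):
    return np.where(x < 0, 0, x)


def with_clip(x):
    return x.clip(0)


candidates = [with_maximum, with_mask, with_where, with_clip]
number = 2_000                     # calls per measurement

for func in candidates:
    seconds = timeit.timeit(lambda: func(base), number=number)
    print(f"{func.__name__}: {seconds / number * 1e6:.2f} µs per call")
```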


Caveats:

  1. a[a < 0] = 0 is faster than np.where() but is not supported by numba. Either way, np.maximum() is the fastest that I found.

  2. np.maximum() is different from np.max() and np.amax(): np.maximum() compares element-wise and can compare an array with a single value, whereas np.max() and np.amax() return the largest element of a single array.
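
The difference in point 2 can be sketched as follows, assuming NumPy is imported as `np`. Since np.maximum is a ufunc it also accepts `out=`, which allows an in-place update; the variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, -4, 5])

elementwise = np.maximum(a, 0)   # compares each element with 0, returns an array
largest = np.max(a)              # reduces the whole array to a single value

# Writing the result back into a avoids allocating a second array.
np.maximum(a, 0, out=a)

print(elementwise)
print(largest)
print(a)
```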


2

Here's a way to do it in Python without NumPy. Create a function that returns what you want and use a list comprehension, or the map function.

>>> a = [1, 2, 3, -4, 5]

>>> def zero_if_negative(x):
...   if x < 0:
...     return 0
...   return x
...

>>> [zero_if_negative(x) for x in a]
[1, 2, 3, 0, 5]

>>> list(map(zero_if_negative, a))  # in Python 3, map returns an iterator, so wrap it in list()
[1, 2, 3, 0, 5]

1 Comment

had gone down this route but thought there must be an easier, more matlab less python way to do it with numpy (as i was using arrays rather than lists anyway). clip is perfect
