How can I write a NumPy function where Set and Numbers are parallel arrays, and each value in Numbers is multiplied by Lval if the element of Set at the same index is 'L', and by Uval otherwise? I am essentially trying to convert the vanilla Python code below to a NumPy version.

Set = np.array(['U' 'L' 'U' 'U' 'L' 'U' 'L' 'L' 'U' 'L' 'U' 'L' 'L' 'L' 'L'])
Numbers = np.array([ 52599  52599  53598 336368 336875 337466 338292 356587 357474 357763 358491 358659 359041 360179 360286])
Lval = 30
Uval = 10

Vanilla Python Version

Val = []
for x in Set:
   if Set[x] == 'U':
       calc = Numbers[x] * Uval
       Val.append(calc)
   else:
       calc = Numbers[x] * Uval
       Val.append(calc)
  • I'm pretty sure that this code will raise a syntax error. Commented Jan 31, 2021 at 6:38
  • Please always provide a minimal reproducible example. Don't be lazy Commented Jan 31, 2021 at 6:39
  • Look up numpy.where. Commented Jan 31, 2021 at 6:40
  • BTW your Python version is wrong. x is a string, it would throw a type error with Set[x] and Numbers[x], iterating over a container gives you the elements of the container, not the indices... Commented Jan 31, 2021 at 6:56
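
The problems raised in these comments can be illustrated with a corrected pure-Python loop. This is only a sketch of what the question appears to intend: it assumes the array literals were meant to be comma-separated, and that the else branch should multiply by Lval (as the question text describes) rather than Uval.

```python
import numpy as np

Set = np.array(['U', 'L', 'U', 'U', 'L', 'U', 'L', 'L', 'U', 'L', 'U', 'L', 'L', 'L', 'L'])
Numbers = np.array([52599, 52599, 53598, 336368, 336875, 337466, 338292, 356587,
                    357474, 357763, 358491, 358659, 359041, 360179, 360286])
Lval = 30
Uval = 10

Val = []
# zip pairs each label with the number at the same position,
# so no index lookup (Set[x], Numbers[x]) is needed.
for label, number in zip(Set, Numbers):
    if label == 'U':
        Val.append(number * Uval)
    else:
        # Assumption: the 'L' branch uses Lval, per the question text.
        Val.append(number * Lval)
```

Iterating with zip avoids the original mistake of using the elements of Set as indices.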

1 Answer

You are looking for numpy.where:

Edit: moved the multiplication outside np.where so that it is done only once, as suggested by @fountainhead in the comments.

Numbers * np.where(Set == 'U', Uval, Lval)

From the documentation:

numpy.where(condition[, x, y])

Return elements chosen from x or y depending on condition.

So, for example:

>>> import numpy as np
>>> Set = np.array(['U', 'L', 'U', 'U', 'L', 'U', 'L', 'L', 'U', 'L', 'U', 'L', 'L', 'L', 'L'])
>>> Numbers = np.array([ 52599,  52599,  53598, 336368, 336875, 337466, 338292, 356587, 357474, 357763, 358491, 358659, 359041, 360179, 360286])
>>> Lval = 30
>>> Uval = 10
>>> Numbers * np.where(Set == 'U', Uval, Lval)
array([  525990,  1577970,   535980,  3363680, 10106250,  3374660,
       10148760, 10697610,  3574740, 10732890,  3584910, 10759770,
       10771230, 10805370, 10808580])

One caveat: this uses a fair amount of extra memory, since you have to create the boolean array Set == 'U' to pass as the condition argument of numpy.where, plus an intermediate array of Uval and Lval values (and potentially other arrays, if you pass arrays as the x and y arguments of numpy.where).
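
If the extra memory matters, one possible way to drop one of the temporaries is to reuse the array of factors as the output buffer. This is a minimal sketch, not part of the original answer: the names mask, factors and result are illustrative, and the dtype is pinned to int64 on both sides so that the in-place multiplication needs no cast.

```python
import numpy as np

Set = np.array(['U', 'L', 'U', 'U', 'L', 'U', 'L', 'L', 'U', 'L', 'U', 'L', 'L', 'L', 'L'])
Numbers = np.array([52599, 52599, 53598, 336368, 336875, 337466, 338292, 356587,
                    357474, 357763, 358491, 358659, 359041, 360179, 360286],
                   dtype=np.int64)
Lval = 30
Uval = 10

# Boolean temporary: still needed as the condition for np.where.
mask = Set == 'U'

# Array of per-element factors (Uval where the label is 'U', Lval otherwise).
factors = np.where(mask, np.int64(Uval), np.int64(Lval))

# Write the product back into `factors` instead of allocating a new array.
np.multiply(Numbers, factors, out=factors)
result = factors
```

The boolean mask is still allocated, so this only saves one array of the same size as Numbers.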

Despite all the unnecessary intermediates, it is still quite fast:

>>> Set = np.repeat(Set, 1000)
>>> Numbers = np.repeat(Numbers, 1000)
>>> import timeit
>>> timeit.timeit("Numbers * np.where(Set == 'U', Uval, Lval)", "from __main__ import np, Set, Numbers, Lval, Uval", number=10000)
1.067108618999555

The equivalent Python:

>>> setlist = Set.tolist()
>>> numberlist = Numbers.tolist()
>>> timeit.timeit("[n*Uval if s =='U' else n*Lval for s, n in zip(setlist, numberlist)]", "from __main__ import setlist, numberlist, Lval, Uval", number=10000)
10.844363432000023
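
The comparison above can also be run as a self-contained script. This is a sketch under a few assumptions: both arrays are repeated 1000 times so their shapes match, the helper names with_numpy and with_python are made up for illustration, and the repetition count is reduced to 100 so that it finishes quickly. Absolute timings will differ from machine to machine.

```python
import timeit

import numpy as np

Lval = 30
Uval = 10

# Repeat both arrays so that they keep the same length.
Set = np.repeat(np.array(['U', 'L', 'U', 'U', 'L', 'U', 'L', 'L',
                          'U', 'L', 'U', 'L', 'L', 'L', 'L']), 1000)
Numbers = np.repeat(np.array([52599, 52599, 53598, 336368, 336875, 337466,
                              338292, 356587, 357474, 357763, 358491, 358659,
                              359041, 360179, 360286]), 1000)

setlist = Set.tolist()
numberlist = Numbers.tolist()


def with_numpy():
    # Vectorised version from the answer.
    return Numbers * np.where(Set == 'U', Uval, Lval)


def with_python():
    # Pure-Python equivalent operating on plain lists.
    return [n * Uval if s == 'U' else n * Lval for s, n in zip(setlist, numberlist)]


# Check that both versions agree before timing them.
results_match = with_numpy().tolist() == with_python()

t_numpy = timeit.timeit(with_numpy, number=100)
t_python = timeit.timeit(with_python, number=100)
print(f"NumPy:  {t_numpy:.4f} s")
print(f"Python: {t_python:.4f} s")
```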

5 Comments

Wouldn't it be simpler and more efficient to do Numbers * np.where(Set == 'L', Lval, Uval), and let broadcasting take care of everything? There'd be fewer multiplications.
@fountainhead yes, absolutely
@fountainhead interestingly, I didn't see a significant speed increase, but it is more elegant nonetheless
Hello there, thanks for the code. I have another issue relating to the np.where function. If you could take a look at it, I would appreciate it: stackoverflow.com/questions/65999759/…
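
The two formulations discussed in these comments can be compared side by side. This is a sketch: the per_branch form is an assumption about what the answer looked like before the edit (the multiplication done in both branches), and the variable names are illustrative.

```python
import numpy as np

Set = np.array(['U', 'L', 'U', 'U', 'L', 'U', 'L', 'L', 'U', 'L', 'U', 'L', 'L', 'L', 'L'])
Numbers = np.array([52599, 52599, 53598, 336368, 336875, 337466, 338292, 356587,
                    357474, 357763, 358491, 358659, 359041, 360179, 360286])
Lval = 30
Uval = 10

# Multiplication inside both branches: two full-size products are computed,
# and np.where then picks one element from each pair.
per_branch = np.where(Set == 'U', Numbers * Uval, Numbers * Lval)

# Multiplication factored out: np.where picks a scalar factor per element,
# then a single elementwise product is computed.
factored = Numbers * np.where(Set == 'L', Lval, Uval)

same = np.array_equal(per_branch, factored)
print(same)
```

Both give the same values; the factored form performs one array multiplication instead of two.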
