Code that spends most of its execution time on two independent function evaluations is an obvious candidate for using two CPUs. I know how to do it in Python with multiprocessing, but only with the idiom if __name__ == '__main__': added to the entry point of the program. Is there a simpler way in modern Python (3.8.3 at time of writing)? Nothing there seems suitably simple.
Requirements: no change to the calling code, and no separate file. It's fine to import some helper, but I would prefer not to pip install anything.
Example application and benchmark, in the field of cryptography:
def rsacrt(p,q,dp,dq,qi,x):
    # most of the time is spent in the following two lines
    u = pow(x,dp,p)
    v = pow(x,dq,q)
    return (u-v)*qi%p*q+v
# test and benchmark the above
import time
e,p,q = 3, 5**3528+12436, 7**2918+27562
n,dp,dq,qi = p*q, pow(e,-1,p-1), pow(e,-1,q-1), pow(q,-1,p)
x = 42
t = time.time()
y = rsacrt(p,q,dp,dq,qi,x)
t = time.time()-t
if pow(y,e,n)!=x: print("# wrongo, spasmoid!")
print("duration of rsacrt:",int(t*1000),"ms")
The operation shown is the one bottleneck in RSA signature generation and RSA decryption. Parameters are deliberately high (16384-bit RSA, rather than the usual 2048-bit), so the execution time is on the order of seconds, with more than 98% spent in the first two pow calls. This is meant to illustrate a real-life case where parallel execution matters, not as an example of how to do RSA: there are fast alternatives to pow, and this code lacks side-channel protection.
Note: This code requires a version of Python where pow can compute the modular inverse. That includes Python 3.8.x. Try it online!
Addition: The code that works under Python 3 is sizably larger, see this other Try it online!
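For reference, the guarded approach the question alludes to might look like the sketch below. It is an illustration under stated assumptions, not the asker's code: the names rsacrt_serial and rsacrt_parallel are hypothetical, and the toy parameters (e = 3, p = 11, q = 17) are chosen only so the check runs instantly. One pow is handed to a worker process through concurrent.futures.ProcessPoolExecutor while the other runs in the calling process.

```python
from concurrent.futures import ProcessPoolExecutor

def rsacrt_serial(p, q, dp, dq, qi, x):
    # Same computation as rsacrt in the question, kept as a reference.
    u = pow(x, dp, p)
    v = pow(x, dq, q)
    return (u - v) * qi % p * q + v

def rsacrt_parallel(p, q, dp, dq, qi, x):
    # Hypothetical variant: one pow runs in a worker process while the
    # other runs in the calling process.
    with ProcessPoolExecutor(max_workers=1) as executor:
        # The built-in pow is pickled by name, so no helper function
        # has to be importable from the main module.
        future_u = executor.submit(pow, x, dp, p)
        v = pow(x, dq, q)
        u = future_u.result()
    return (u - v) * qi % p * q + v

if __name__ == '__main__':
    # Toy parameters for a quick correctness check (assumption: e = 3,
    # p = 11, q = 17); substitute the benchmark's values to measure speed.
    p, q, dp, dq, qi, x = 11, 17, 7, 11, 2, 42
    assert rsacrt_parallel(p, q, dp, dq, qi, x) == rsacrt_serial(p, q, dp, dq, qi, x)
    print("parallel result matches serial result")
```

The guard matters with the spawn start method (the default on Windows and macOS), where each worker re-imports the main module; without it, the top-level benchmark would run again in the child. With the fork start method the child does not re-import the module, so the guard is not strictly required there, but relying on that is platform-dependent.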
Pool object with its map method looks exactly suited to your needs.
[...] multiprocessing, and hate how it requires changes all over the code. I did not try Pool/map, but looking at this answer it is not good at CPU-bound tasks, and that's also the impression I get from the docs. I've heard of a "global interpreter lock". I'm out of my comfort zone, that's why I ask.
[...] ValueError: pow() 2nd argument cannot be negative when 3rd argument specified in line n,dp,dq.... Can you fix your example values? Also, I fear only two executions of pow will make it hard to pay back the overhead of spawning additional threads/processes/... . Can you quantify "most"? On my machine, running any parallel processing adds (one-time) overhead in the ms range. So unless you show the loop that calls rsacrt often, there is nothing to be gained at this level.
[...] if __name__ == '__main__':? It is only needed at the top level of a module.
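The Pool suggestion and the overhead concern raised in the comments can be explored together with a rough sketch like the following. It assumes multiprocessing.Pool with starmap, a hypothetical function name rsacrt_pool, and toy parameters (e = 3, p = 11, q = 17); the timings it prints are only indicative, since worker startup may not be fully finished when Pool() returns. Note that passing the pool in as an argument does change the calling code, which the question wanted to avoid; the point here is only to separate one-time startup cost from per-call cost.

```python
import time
from multiprocessing import Pool

def rsacrt_pool(pool, p, q, dp, dq, qi, x):
    # Hypothetical variant: both pow calls are dispatched to an existing
    # pool; starmap unpacks each tuple into the arguments of pow.
    u, v = pool.starmap(pow, [(x, dp, p), (x, dq, q)])
    return (u - v) * qi % p * q + v

if __name__ == '__main__':
    # Toy parameters (assumption); substitute the benchmark's values
    # to see whether the parallel speedup outweighs the overhead.
    e, p, q, dp, dq, qi, x = 3, 11, 17, 7, 11, 2, 42
    t0 = time.perf_counter()
    with Pool(processes=2) as pool:
        t1 = time.perf_counter()
        y = rsacrt_pool(pool, p, q, dp, dq, qi, x)
        t2 = time.perf_counter()
    assert pow(y, e, p * q) == x
    print("pool creation:", round((t1 - t0) * 1000), "ms")
    print("one call on the existing pool:", round((t2 - t1) * 1000), "ms")
```

If rsacrt is called many times, creating the pool once and reusing it amortizes the startup cost across calls. Because the workers are separate processes, each has its own interpreter and its own global interpreter lock, which is why a process pool can help with CPU-bound work where threads in CPython generally would not.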