0

I have a function

def dist_to_center(ra_center,dec_center):
    # finding theta
    cos_ra = np.cos(ra_center-var1['ra'])
    cos_dec = np.cos(dec_center-var1['dec'])
    sin_dec = np.sin(dec_center)*np.sin(var1['dec'])

    theta = np.arccos((cos_ra*cos_dec)+sin_dec*(1-cos_ra))
    numerator = theta*comoving_dist
    denominator = 1+var1['zcosmo']

    # THE FINAL CALCULATED DISTANCE TO CENTRE
    dist_to_center = (numerator/denominator) 
    return dist_to_center

I want to make use of my processors, so I am using multiprocess pool like this:

if __name__ == '__main__':
    pool = Pool(processes=6)
    pool.map(dist_to_center, ra_center, dec_center) #calling the function with it's inputs
    pool.close()
    pool.join()

The code seems to be proper and is working, but only 1 processor is running instead of the 6 I have called. What am I doing wrong here?

7
  • What are ra_center.shape and dec_center.shape? Commented Oct 9, 2014 at 11:57
  • They both have the same values, i.e 443726 Commented Oct 9, 2014 at 11:58
  • You're saying that ra_center and dec_center are one-dimensional arrays when you call pool.map()? Commented Oct 9, 2014 at 12:01
  • @JohnZwinck ra_center and dec_center are numpy array's which are inputs for my function. Should I not call them in pool.map?? Because I thought pool.map takes the function and then it's inputs Commented Oct 9, 2014 at 12:03
  • Sure, but you're telling us you have only a single pair of arrays. So you would need to slice the arrays into multiple parts to calculate in parallel--the Pool has no idea how to divide your 440K element arrays into subparts properly. Commented Oct 9, 2014 at 12:05

1 Answer 1

1

You are passing a pair of one-dimensional arrays to the Pool. You need to slice the arrays yourself to make the Pool understand how to process them efficiently. For example:

def dist_to_center_mapper(arrays):
    return dist_to_center(arrays[0], arrays[1])

ra = np.split(ra_center, 6)
dec = np.split(dec_center, 6)
pool = Pool(processes=6)
pool.map(dist_to_center_mapper, zip(ra, dec)) 

I think the "mapper" function is required because Pool.map() takes only a single iterable of arguments. So we zip together the two lists of array slices so they get doled out together to the multiple processes. Note that you could split the arrays into more pieces than the number of processes if you want, if some pieces may take different amounts of time etc.

Sign up to request clarification or add additional context in comments.

5 Comments

So I just need to create a new function called dist_to_center_mapper and call that function in pool.map to make it work?
Something like that, because I think Pool.map wants to call a function taking a single argument. You could also edit your existing function to take a single argument. The argument it receives will be a tuple containing two sub-arrays, so you can see why my "mapper" function does what it does.
I tried the above by splitting the arrays, but still only one processor seems to be running! I would like to mention that all the arrays inside my function are numpy arrays with shape 443726. My function just does some math to give me a distance. Can I just say def dist_to_center()
Why don't you add a print statement to the top of dist_to_center() and see if it gets called in parallel or not.
I have opened a chat room, would it be possible for you to join, I can explain exactly what I am doing in the functino

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.