
I am trying to learn to use Python's multiprocessing module. To do so, I am solving problems from ProjectEuler.net, as they are good candidates for parallel processing. The code below is based on problem 14, which asks you to find the longest Collatz chain that starts below 1,000,000. My initial solution was to fling off 1,000,000 processes and try to solve them -- that was a disaster. My second attempt (the solution below) breaks the problem down into numProcs chunks.

My question is: are there ways to make this run better? Is there a way to do this without breaking the problem down into chunks beforehand, i.e. is there a way to task the processors dynamically? In plain English, how do I say "Oooh! Processor 1 finished its problem; let's give it another," rather than "Here is your total workload, Processor 1"?

def run2(j):
    """
    A multicore implementation of the Collatz sequence.
    j - the largest number whose Collatz chain we need to check
    """
    from multiprocessing import Process, Value, cpu_count

    def bazinga(nums, maxChain, biggest):
        """
        Generates the length of the Collatz chain for each
        number in list 'nums', determines the longest chain
        and the number that produces it, and writes them to
        the shared-memory values 'maxChain' and 'biggest'.
        """
        for i in nums:
            a = len(generateChain(i, []))  # generateChain is defined elsewhere
            with maxChain.get_lock():  # make the compare-and-update atomic
                if a >= maxChain.value:
                    biggest.value = i
                    maxChain.value = a

    numProcs = cpu_count()

    maxChainSize = Value('I', 0)
    largestChainProducer = Value('I', 0)

    # deal the numbers 1..j-1 round-robin into one bucket per process
    numList = [[] for _ in range(numProcs)]
    for i in range(1, j):
        numList[i % numProcs].append(i)

    ps = []
    for i in range(numProcs):
        p = Process(target=bazinga, args=(numList[i], maxChainSize, largestChainProducer))
        p.start()
        ps.append(p)
    for p in ps:
        p.join()

    print('%s has a chain of length %s' % (largestChainProducer.value, maxChainSize.value))
1 Answer

Try using queues to pass jobs to the running processes and to collect the results back. Each running process will pop its next job off the queue whenever it is ready for more work.

There are some good examples in the Python multiprocessing docs.
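A minimal sketch of that pattern, applied to the Collatz problem: workers pull numbers off a shared job queue as soon as they finish the previous one, and a `None` sentinel per worker shuts the pool down. The names `worker`, `collatz_length`, and `run` are illustrative, not anything from the docs.

```python
import multiprocessing as mp

def collatz_length(n):
    """Length of the Collatz chain starting at n (counting n and the final 1)."""
    length = 1
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        length += 1
    return length

def worker(jobs, results):
    """Pull numbers off the job queue until a None sentinel arrives."""
    while True:
        n = jobs.get()
        if n is None:          # sentinel: no more work
            break
        results.put((collatz_length(n), n))

def run(limit):
    """Find the number below `limit` with the longest Collatz chain."""
    jobs = mp.Queue()
    results = mp.Queue()
    procs = [mp.Process(target=worker, args=(jobs, results))
             for _ in range(mp.cpu_count())]
    for p in procs:
        p.start()
    for n in range(1, limit):  # each worker grabs the next number when ready
        jobs.put(n)
    for _ in procs:
        jobs.put(None)         # one sentinel per worker
    # drain the results before joining, so no worker blocks on a full queue
    length, number = max(results.get() for _ in range(1, limit))
    for p in procs:
        p.join()
    return number, length

if __name__ == '__main__':
    number, length = run(10000)
    print('%s has a chain of length %s' % (number, length))
```

Note that putting every number on the queue individually adds IPC overhead; `multiprocessing.Pool` (e.g. `pool.imap_unordered` with its `chunksize` argument) implements this same dynamic job-distribution pattern at a higher level and batches the queue traffic for you.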
