2

I'm trying to solve Problem 8 in project euler with multi-threading technique in python.

Find the greatest product of five consecutive digits in the 1000-digit number. The number can be found here.

My approach is to generate product from chunks of 5 from the original list and repeat this process 5 times, each with the starting index shifted one to the right.

Here is my thread class

class pThread(threading.Thread):
    def __init__(self, l):
        threading.Thread.__init__(self)
        self.l = l
        self.p = 0

    def run(self):

        def greatest_product(l):
        """
        Divide the list into chunks of 5 and find the greatest product
        """
            def product(seq):
                return reduce(lambda x,y : x*y, seq)

            def chunk_product(l, n=5):
                for i in range(0, len(l), n):
                    yield product(l[i:i+n])

            result = 0
            for p in chunk_product(num):
                result = result > p and result or p 

            return result

        self.p = greatest_product(self.l)

When I try to create 5 threads to cover all 5-digit chunks in my original list, the manual approach below gives the correct answer, with num being the list of single-digit numbers that I parse from the text:

thread1 = pThread(num)
del num[0]
thread2 = pThread(num)
del num[0]
thread3 = pThread(num)
del num[0]
thread4 = pThread(num)
del num[0]
thread5 = pThread(num)

thread1.start()
thread2.start()
thread3.start()
thread4.start()
thread5.start()

thread1.join()
thread2.join()
thread3.join()
thread4.join()
thread5.join()

def max(*args):
    result = 0
    for i in args:
        result = i > result and i or result
    return result

print max(thread1.p, thread2.p, thread3.p, thread4.p, thread5.p)

But this doesn't give the correct result:

threads = []
for i in range(0, 4):
    tmp = num[:]
    del tmp[0:i+1]
    thread = pThread(tmp)
    thread.start()
    threads.append(thread)

for i in range(0, 4):
    threads[i].join()

What did I do wrong here? I'm very new to multithreading so please be gentle.

6
  • 1
    Tips: rewrite your code without using del, it will be much easier to understand why it didn't work, and why your original code also isn't correct. Also, given the GIL, this code is unlikely to be run any faster than a single threaded version. Commented Feb 15, 2013 at 1:37
  • I've rewritten the code without del, and yeah, it breaks both versions. I'll study it again later today. Thank you for your suggestion. Commented Feb 15, 2013 at 1:47
  • 1
    threading this is seriously overkill. you can do it in 1 line with max(reduce(op.mul, n_list[i:i+5]) for i in xrange(1000)) Commented Feb 15, 2013 at 2:08
  • Thank you for the one-liner. Still, I'd like to know why deleting an element in a list gives a different result than passing a slice from the same list. But yeah, thanks to @LieRyan now I think it has nothing to do with thread. Commented Feb 15, 2013 at 2:30
  • 1
    On a side note if you're trying to make it threaded to use more than 1 core on a multicore cpu, you want to use the multiprocessing module, not threading. Commented Feb 15, 2013 at 2:55

2 Answers 2

4

There are 3 problems:

  1. The first is that the "manual" approach does not give the correct answer. It just happens that the correct answer to the problem is at the offset 4 from the start of your list. You can see this by using:

    import operator as op
    print max(reduce(op.mul, num[i:i+5]) for i in range(1000))
    for k in range(5):
        print max(reduce(op.mul, num[i:i+5]) for i in range(k, 1000, 5))
    

    One problem with your "manual" approach is that the threads share the num variable, each has the same list. So when you do del num[0], all threadX.l are affected. The fact that you consistently get the same answer is due to the second problem.

  2. The line

    for p in chunk_product(num):
    

    should be:

    for p in chunk_product(l):
    

    since you want to use the parameter of function greatest_product(l) and not the global variable num.

  3. In the second method you only spawn 4 threads since the loops range over [0, 1, 2, 3]. Also, you want to delete the values tmp[0:i] and not tmp[0:i+1]. Here is the code:

    threads = []
    for i in range(5):
        tmp = num[:]
        del tmp[0:i]
        thread = pThread(tmp)
        thread.start()
        threads.append(thread)
    
    for i in range(5):
        threads[i].join()
    
    print len(threads), map(lambda th: th.p, threads)
    print max(map(lambda th: th.p, threads))
    
Sign up to request clarification or add additional context in comments.

3 Comments

@JasonWhite That is the intended functionality, since each thread loops over the list with a different offset. In your example (with 2 threads), thread 1 will compute products for [1,2] [3,4] [5,6] [7,8] [9,10] and thread 2 will compute products for [2,3] [4,5] [6,7] [8,9]. This is because thread 2 has self.l = [2,3,4,5,6,7,8,9,10]. I think that this is what the OP wanted to implement.
Ah I see what you're saying. Deleted my previous comment as I read it wrong ;P
THANK YOU SOOOOOO MUCH. Do you happen to know how to award bounty? I really want to give you 100 rep for your answer.
1

I took a stab at this mainly to get some practice multiprocessing, and to learn how to use argparse.

This took around 4-5 gigs of ram just in case your machine doesn't have a lot.

python euler.py -l 50000000 -n 100 -p 8

Took 5.836833333969116 minutes
The largest product of 100 consecutive numbers is: a very large number

If you type python euler.py -h at the commandline you get:

usage: euler.py [-h] -l L [L ...] -n N [-p P]

Calculates the product of consecutive numbers and return the largest product.

optional arguments:
  -h, --help    show this help message and exit
  -l L [L ...]  A single number or list of numbers, where each # is seperated
                by a space
  -n N          A number that specifies how many consecutive numbers should be
                multiplied together.
  -p P          Number of processes to create. Optional, defaults to the # of
                cores on the pc.        

And the code:

"""A multiprocess iplementation for calculation the maximum product of N consecutive
numbers in a given range (list of numbers)."""

import multiprocessing
import math
import time
import operator
from functools import reduce
import argparse

def euler8(alist,lenNums):
    """Returns the largest product of N consecutive numbers in a given range"""
    return max(reduce(operator.mul, alist[i:i+lenNums]) for i in range(len(alist)))

def split_list_multi(listOfNumbers,numLength,threads):
    """Split a list into N parts where N is the # of processes."""
    fullLength = len(listOfNumbers)
    single = math.floor(fullLength/threads)
    results = {}
    counter = 0
    while counter < threads:
        if counter == (threads-1):
            temp = listOfNumbers[single*counter::]
            if counter == 0:
                results[str(counter)] = listOfNumbers[single*counter::]
            else:
                prevListIndex = results[str(counter-1)][-int('{}'.format(numLength-1))::]
                newlist = prevListIndex + temp
                results[str(counter)] = newlist
        else:
            temp = listOfNumbers[single*counter:single*(counter+1)]
            if counter == 0:
                newlist = temp
            else:
                prevListIndex = results[str(counter-1)][-int('{}'.format(numLength-1))::]
                newlist = prevListIndex + temp
            results[str(counter)] = newlist
        counter += 1
    return results,threads

def worker(listNumbers,number,output):
    """A worker. Used to run seperate processes and put the results in the queue"""
    result = euler8(listNumbers,number)
    output.put(result)

def main(listOfNums,lengthNumbers,numCores=multiprocessing.cpu_count()):
    """Runs the module.
    listOfNums must be a list of ints, or single int
    lengthNumbers is N (an int) where N is the # of consecutive numbers to multiply together
    numCores (an int) defaults to however many the cpu has, can specify a number if you choose."""

    if isinstance(listOfNums,list):
        if len(listOfNums) == 1:
            valuesToSplit = [i for i in range(int(listOfNums[0]))]
        else:
            valuesToSplit = [int(i) for i in listOfNums]
    elif isinstance(listOfNums,int):
        valuesToSplit = [i for i in range(listOfNums)]
    else:
        print('First arg must be a number or a list of numbers')

    split = split_list_multi(valuesToSplit,lengthNumbers,numCores)
    done_queue = multiprocessing.Queue()
    jobs = []
    startTime = time.time()

    for num in range(split[1]):
        numChunks = split[0][str(num)]
        thread = multiprocessing.Process(target=worker, args=(numChunks,lengthNumbers,done_queue))
        jobs.append(thread)
        thread.start()

    resultlist = []
    for i in range(split[1]):
        resultlist.append(done_queue.get())

    for j in jobs:
        j.join()

    resultlist = max(resultlist)
    endTime = time.time()
    totalTime = (endTime-startTime)/60
    print("Took {} minutes".format(totalTime))

    return print("The largest product of {} consecutive numbers is: {}".format(lengthNumbers, resultlist))            

if __name__ == '__main__':
    #To call the module from the commandline with arguments
    parser = argparse.ArgumentParser(description="""Calculates the product of consecutive numbers \
    and return the largest product.""")
    parser.add_argument('-l', nargs='+', required=True,
                       help='A single number or list of numbers, where each # is seperated by a space')
    parser.add_argument('-n', required=True, type=int,
                        help = 'A number that specifies how many consecutive numbers should be \
                        multiplied together.')
    parser.add_argument('-p', default=multiprocessing.cpu_count(), type=int,
                       help='Number of processes to create. Optional, defaults to the # of cores on the pc.')
    args = parser.parse_args()
    main(args.l, args.n, args.p)

1 Comment

Thank you and please don't delete this answer. I'll get back to you once I can understand everything you wrote.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.