Priority Queue: parallel processing

Question

I am trying to solve this algorithmic problem about priority queue on Coursera but the grader on their website keeps saying that my program failed with a time limit exceeded error. The fact is that when I run it on my PC with huge input (5000 threads, 100000 jobs), it works smoothly and prints the correct result in no more than 1 second.

This is the problem description:

enter image description here

This is the link to my code on Github: https://gist.github.com/giantonia/3ddbacddc7bd58b220ab592f802d9602

Any help appreciated !

you're asking for global code reviewing on something that does not answer a given problem statement, and as code example you're giving a full snippet. This is out of scope for stackoverflow, as it makes an answer hard to write, and your question too broad. As an advice follow the given constraints and set ulimit to 512MB. — zmo
– zmo, Commented Dec 16, 2016 at 15:14

Daniel kim · Accepted Answer · 2016-12-16 18:18:45Z

The weakest point of your code is below,

while len(jobs) > 0:
    if threads[0][1] <= time:
      ...
    else:
        time += 1

This loop will be executed along with the time, not the number of jobs have to be done. It requires O(MAX_T) cost! Too slow!

This is my solution regarding this problem. It requires O(N + MlgN)).

The idea is quite simple.

Construct priority_queue along with the earliest time to finish.
Pick next_thread from priority_queue and update its the time to finish the job.
Insert it into priority queue

Here is the code,

# python3

def parent_key_cal(key):
    if key % 2 == 0:
        parent_key = key//2
    else:
        parent_key = (key - 1)//2
    return parent_key

def swap(alist, key1, key2):
    temp = alist[key1]
    alist[key1] = alist[key2]
    alist[key2] = temp

def return_min_key(alist, parent, left, right, criteria):

    min_value = parent
    if alist[parent][criteria] > alist[left][criteria]:
        min_value = left
        if right != -1 and alist[min_value][criteria] > alist[right][criteria]:
            min_value = right
    elif alist[parent][criteria] < alist[left][criteria]:
        if right != -1 and alist[min_value][criteria] > alist[right][criteria]:
            min_value = right

    return min_value

def shift_up(alist, key):

    while key > 1:

        parent = parent_key_cal(key)
        if alist[parent][1] != alist[key][1]:
            if alist[parent][1] > alist[key][1]:
                swap(alist, parent, key)
                key = parent
            else:
                break
        else:
            if alist[parent][0] > alist[key][0]:
                swap(alist, parent, key)
                key = parent
            else:
                break

def shift_down(alist, key):

    if 2*key >= len(alist):
        return

    parent = key
    left = 2*key
    right = 2*key + 1

    if right >= len(alist):

        if (alist[parent] == alist[left]) == True:
            min_value = return_min_key(alist, parent, left, -1, 0)
        else:
            min_value = return_min_key(alist, parent, left, -1, 1)

    else:

        if (alist[parent] == alist[left] == alist[right]) == True:
            min_value = return_min_key(alist, parent, left, right, 0)
        else:
            min_value = return_min_key(alist, parent, left, right, 1)

    if min_value != parent:
        swap(alist, parent, min_value)
        shift_down(alist, min_value)     


def min_heap(alist):
    # Index 0 element is dummy. minimum element's index is 1
    min = alist[1]
    alist.pop(1)

    # Maintain heap structure
    parent_last_element = parent_key_cal(len(alist)-1)
    for key in reversed(range(1, parent_last_element + 1)):
        shift_down(alist, key)

    return min

def heap_insert(alist, value):
    alist.append(value)
    shift_up(alist, len(alist)-1)

line1 = input().split()
n = int(line1[0])
m = int(line1[1])
jobs = list(map(int, input().split()))
threads = []
for i in range(n):
    threads.append([i, 0])

# Insert dummy element to make heap calculation easier
threads.insert(0,[-1,-1])

record = []
# O(M)
while len(jobs) > 0:
    # Allocate a job to a thread and record it this moment
    # "threads" is min_heap along with time to finish a allocated job. 0 -> thread order, 1 ->  time to finish the job
    next_thread = min_heap(threads)  # O(lgN)
    record.append([next_thread[0], next_thread[1]])  

    # Updated poped thread as much as time to finish the next job
    next_thread[1] += jobs.pop(0) 

    # Insert this into min_heap
    heap_insert(threads, next_thread)

for i in range(len(record)):
    print(str(record[i][0]) + ' ' + str(record[i][1]))

kraskevich · Accepted Answer · 2016-12-16 16:57:44Z

Firstly, I'll recommend to run the solution on the maximum test locally (that is n = 100000 and m = 100000) (yeah, 5000 and 100000 is big test, but do you stop there? Why don't you use the maximum possible test case?).

Secondly, there at least two flaws in your solution:

It increases the time by one instead of jumping to the next event:
```
while len(jobs) > 0:
    if threads[0][1] <= time:
        record.append([threads[0][0], time])
        ...
    else:
        time += 1
```
It requires O(MAX_T) operations. That's too much if the maximum time is 10^9.
jobs.pop(0) might work in O(n) (it depends on the python implementation, but if it works like C++ vector, which is the case for many interpreters), which yields O(n^2) operations in the worst case. That's too much, too.

There might be other slow parts in your solution (I saw these two immediately, so I wrote just about them).

I'd recommend you to redesign the algorithm, prove that it's fast enough (hint: it should be something like O((n + m) log n)) and only after that implement it.

Collectives™ on Stack Overflow

Priority Queue: parallel processing

2 Answers 2

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related