
I am generating negative pairs from positive pairs, and I would like to speed up the process by using all cores of the CPU. On a single core it takes almost five days, running day and night.

I want to convert the code below to use multiprocessing. I do not yet have a "positives_negatives.csv" file.

if Path("positives_negatives.csv").exists():
    df = pd.read_csv("positives_negatives.csv")
else:
    for combo in tqdm(itertools.combinations(identities.values(), 2), desc="Negatives"):
        for cross_sample in itertools.product(combo[0], combo[1]):
            negatives = negatives.append(pd.Series({"file_x": cross_sample[0], "file_y": cross_sample[1]}).T,
                                         ignore_index=True)
    negatives["decision"] = "No"
    negatives = negatives.sample(positives.shape[0])
    df = pd.concat([positives, negatives]).reset_index(drop=True)
    df.to_csv("positives_negatives.csv", index=False)
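Incidentally, much of the runtime is likely the `DataFrame.append` inside the double loop: each call copies the entire frame, and the method was removed in pandas 2.0. Collecting plain tuples and building one DataFrame at the end avoids that cost. A minimal sketch on toy data (the identity names and file names here are made up):

```python
import itertools
import pandas as pd

# Toy stand-in for the real identities dict: identity -> list of files
identities = {"a": ["a1.jpg", "a2.jpg"], "b": ["b1.jpg", "b2.jpg"]}

rows = []
for combo in itertools.combinations(identities.values(), 2):
    # accumulate plain tuples instead of appending to a DataFrame
    rows.extend(itertools.product(combo[0], combo[1]))

# build the frame once, after the loop
negatives = pd.DataFrame(rows, columns=["file_x", "file_y"])
print(len(negatives))  # 1 identity pair x 2 x 2 files = 4 rows
```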

Modified code

def multi_func(iden, negatives):
    for combo in tqdm(itertools.combinations(iden.values(), 2), desc="Negatives"):
        for cross_sample in itertools.product(combo[0], combo[1]):
            negatives = negatives.append(pd.Series({"file_x": cross_sample[0], "file_y": cross_sample[1]}).T,
                                         ignore_index=True)

Used

if Path("positives_negatives.csv").exists():
    df = pd.read_csv("positives_negatives.csv")
else:
    with concurrent.futures.ProcessPoolExecutor() as executor:
        secs = [5, 4, 3, 2, 1]
        results = executor.map(multi_func(identities, negatives), secs)

    negatives["decision"] = "No"
    negatives = negatives.sample(positives.shape[0])
    df = pd.concat([positives, negatives]).reset_index(drop=True)
    df.to_csv("positives_negatives.csv", index=False)
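A note on the attempt above: `executor.map(multi_func(identities, negatives), secs)` calls `multi_func` immediately in the parent process and hands its return value (`None`) to `map`, so nothing runs in parallel. `executor.map` expects the callable itself plus an iterable of arguments. A minimal sketch of the correct pattern, using a toy `square` function:

```python
from concurrent.futures import ProcessPoolExecutor

def square(n):
    return n * n

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        # pass the function itself, not the result of calling it;
        # the executor invokes square(2), square(3), square(4) in worker processes
        results = list(executor.map(square, [2, 3, 4]))
    print(results)  # [4, 9, 16]
```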
  • Your best bet would be to break up the work into subgroups, then use multiprocessing from there. Commented Jan 30, 2021 at 4:59
  • If possible, please give me an example related to the "else" clause. Commented Jan 30, 2021 at 5:02
  • Not really... Start with this maybe? Commented Jan 30, 2021 at 5:09
  • Actually, I already tried it twice; it never works. Commented Jan 30, 2021 at 6:11

1 Answer


The best way is to use the ProcessPoolExecutor class together with a separate worker function. You can achieve it like this:

Libraries

from concurrent.futures.process import ProcessPoolExecutor
from pathlib import Path
from os import cpu_count
import itertools

import more_itertools
import pandas as pd
from tqdm import tqdm

def compute_cross_samples(x):
    # Cartesian product of two identities' file lists -> one DataFrame of negative pairs
    return pd.DataFrame(itertools.product(*x), columns=["file_x", "file_y"])

Modified code

if Path("positives_negatives.csv").exists():
    df = pd.read_csv("positives_negatives.csv")
else:
    frames = []
    with ProcessPoolExecutor() as pool:
        # take cpu_count() combinations at a time from identities.values()
        for combos in tqdm(more_itertools.ichunked(itertools.combinations(identities.values(), 2), cpu_count())):
            # compute the cross products of each chunk in parallel
            frames.extend(pool.map(compute_cross_samples, combos))
    # one concat at the end is far cheaper than appending inside the loop
    # (DataFrame.append was removed in pandas 2.0)
    negatives = pd.concat(frames, ignore_index=True)

    negatives["decision"] = "No"

    negatives = negatives.sample(positives.shape[0])
    df = pd.concat([positives, negatives]).reset_index(drop=True)
    df.to_csv("positives_negatives.csv", index=False)
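As a sanity check, the per-pair computation can be exercised sequentially on toy data (the identities and file names below are hypothetical) before wiring in the pool; the parallel version should produce the same rows:

```python
import itertools
import pandas as pd

# Toy identities: two files per identity
identities = {
    "alice": ["a1.jpg", "a2.jpg"],
    "bob":   ["b1.jpg", "b2.jpg"],
    "carol": ["c1.jpg", "c2.jpg"],
}

def compute_cross_samples(pair):
    # Cartesian product of two identities' file lists -> negative pairs
    return pd.DataFrame(itertools.product(*pair), columns=["file_x", "file_y"])

# Sequential equivalent of the pool.map step, for illustration
frames = [compute_cross_samples(c)
          for c in itertools.combinations(identities.values(), 2)]
negatives = pd.concat(frames, ignore_index=True)
print(len(negatives))  # 3 identity pairs x (2 x 2) products = 12 rows
```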

3 Comments

Can you add some more about this? i.e., what sort of speedup did you see?
Additionally, you should be able to mark your own answer as the Answer with the check to its left after 2-ish days!
@ti7 Yes. Later, I will add more details
