
I am using multiprocessing and generating a pandas DataFrame in each process. I would like to merge them together and output the data. The following strategy almost seems to work, but when reading the data back with pd.read_csv() only the first name is used as a column header.

from multiprocessing import Process, Lock
import pandas as pd

def foo(name, lock):
    d = {f'{name}': [1, 2]}
    df = pd.DataFrame(data=d)

    lock.acquire()
    try:
        df.to_csv('output.txt', mode='a')
    finally:
        lock.release()

if __name__ == '__main__':
    lock = Lock()

    processes = []
    for name in ['bob', 'steve']:
        p = Process(target=foo, args=(name, lock))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
  • Were you expecting the columns to be concatenated horizontally? CSV files don't do that. You might consider using a multiprocessing.Queue to pass your end result back to the originating process, and leave the master process in charge of combining things. Commented Oct 22, 2021 at 20:20
  • @TimRoberts that is a great solution; then I can just combine the dataframes and write them out at the same time. Makes sense. Commented Oct 22, 2021 at 20:22

1 Answer


You can use multiprocessing.Pool:

import multiprocessing
import pandas as pd

def foo(name):
    d = {f'{name}': [1, 2]}
    df = pd.DataFrame(data=d)
    return df

if __name__ == '__main__':
    data = ['bob', 'steve']
    with multiprocessing.Pool(2) as pool:
        # pool.map returns the DataFrames in the same order as the inputs
        data = pool.map(foo, data)
    # axis=1 concatenates the one-column frames side by side
    pd.concat(data, axis=1).to_csv('output.csv')

Output:

>>> pd.concat(data, axis=1)
   bob  steve
0    1      1
1    2      2
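Written this way, the merged frame also round-trips cleanly through pd.read_csv, which was the problem in the question. A minimal sketch of the round trip (no multiprocessing needed to demonstrate it; passing index=False is an assumption to keep the file free of an unnamed index column):

```python
import pandas as pd

# Build the same merged frame the Pool version produces
df = pd.concat([pd.DataFrame({'bob': [1, 2]}),
                pd.DataFrame({'steve': [1, 2]})], axis=1)
df.to_csv('output.csv', index=False)  # index=False skips the index column

back = pd.read_csv('output.csv')
print(back)
#    bob  steve
# 0    1      1
# 1    2      2
```

Because the file is written once with a single header row, both column names survive the read, unlike the append-per-process approach from the question.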