Try this code, using multiprocessing:
import multiprocessing

def f(x):
    return x * x

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

if __name__ == '__main__':
    n_core = multiprocessing.cpu_count()
    p = multiprocessing.Pool(processes=n_core)
    data = range(0, 8)
    subsets = chunks(data, n_core)
    subset_results = []
    for subset in subsets:
        subset_results.append(p.map(f, subset))
    print(subset_results)
In your case, a chunks function that could work for you is the following:
def chunks_series(s):
    subsets = []
    for i in range(s.max() + 1):
        subset = s[s == i]
        subsets.append(subset.values)
    return subsets
subsets = chunks_series(df['col1'])
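As a quick sanity check, here is the function on a hypothetical DataFrame whose `col1` holds small non-negative integer labels (the data is made up for illustration); it returns one array per distinct value:

```python
import pandas as pd

# hypothetical example data: 'col1' holds small non-negative integer labels
df = pd.DataFrame({'col1': [0, 1, 0, 2, 1, 0]})

def chunks_series(s):
    subsets = []
    for i in range(s.max() + 1):
        subset = s[s == i]          # boolean mask: rows equal to label i
        subsets.append(subset.values)
    return subsets

subsets = chunks_series(df['col1'])
print([list(sub) for sub in subsets])  # [[0, 0, 0], [1, 1], [2]]
```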
Or you can do everything in the same loop:
n_core = multiprocessing.cpu_count()
p = multiprocessing.Pool(processes=n_core)
s = df['col1']
subset_results = []
for i in range(s.max() + 1):
    subset = s[s == i]
    subset_results.append(p.map(f, subset))
I preferred to introduce a chunks function, even though it brings no advantage in your particular case, to make the code clearer and more generalizable.
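One caveat: the loop over `range(s.max() + 1)` assumes the labels are consecutive integers starting at 0. If that does not hold, pandas' own `groupby` performs the same split in one pass without that assumption; a sketch on hypothetical data with a gap in the labels:

```python
import pandas as pd

# hypothetical example data, including a gap in the labels (no 1 or 2)
df = pd.DataFrame({'col1': [0, 3, 0, 3]})

# groupby yields one (label, sub-series) pair per distinct value,
# so missing labels simply produce no group (and no empty subsets)
subsets = [group.values for _, group in df['col1'].groupby(df['col1'])]
print([list(sub) for sub in subsets])  # [[0, 0], [3, 3]]
```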