
I have several data frames and need to do the same thing to all of them.

I'm currently doing this:

df1=df1.reindex(newindex)
df2=df2.reindex(newindex)
df3=df3.reindex(newindex)
df4=df4.reindex(newindex)

Is there a neater way of doing this?

Maybe something like

df=[df1,df2,df3,df4]

for d in df:
     d=d.reindex(newindex)
    Your proposed solution seems a good option (though I wouldn't call the list "df"). Or put them in a dictionary with more meaningful names. (Commented Dec 21, 2018 at 12:47)

2 Answers


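Yes, your idea is close, but assigning to d inside the loop only rebinds the loop variable; neither the list nor the original DataFrames change. Collect the reindexed DataFrames in a new list with a list comprehension instead: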

dfs = [df1,df2,df3,df4]
dfs_new = [d.reindex(newindex) for d in dfs]
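
To make the difference concrete, here is a minimal sketch comparing the loop from the question with the list comprehension. The two small frames and their values are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data, not from the original question.
df1 = pd.DataFrame({'a': [4, 5, 6]})
df2 = pd.DataFrame({'a': [7, 8, 9]})
newindex = [2, 1, 0]

dfs = [df1, df2]

# reindex returns a new DataFrame; assigning it to d only rebinds
# the loop variable, so the list still holds the original frames.
for d in dfs:
    d = d.reindex(newindex)
loop_result = dfs[0].index.tolist()

# The comprehension keeps the objects that reindex returns.
dfs_new = [d.reindex(newindex) for d in dfs]
comp_result = dfs_new[0].index.tolist()

print(loop_result)   # [0, 1, 2]
print(comp_result)   # [2, 1, 0]
```

The loop runs without error, which is why this kind of mistake is easy to miss.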

As @Joe Halliwell suggested (thank you), you can unpack the result straight back into the original variable names:

df1, df2, df3, df4 = [d.reindex(newindex) for d in dfs]

Or, as @roganjosh suggested, you can create a dictionary of DataFrames:

dfs = [df1,df2,df3,df4]
names = ['a','b','c','d']

dfs_new_dict = {name: d.reindex(newindex) for name, d in zip(names, dfs)}

And then select each DataFrame by key:

print (dfs_new_dict['a'])
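
If the frames already live in a dictionary, the same comprehension can rebuild it in one step. This is only a sketch: the keys 'scaled' and 'shifted' and the values are hypothetical, chosen just to show more meaningful names.

```python
import pandas as pd

# Hypothetical frames and names, for illustration only.
frames = {
    'scaled': pd.DataFrame({'a': [40, 50, 60]}),
    'shifted': pd.DataFrame({'a': [14, 15, 16]}),
}
newindex = [2, 1, 0]

# Rebuild the dictionary with every value reindexed.
frames = {name: d.reindex(newindex) for name, d in frames.items()}

# Loop over all frames by name.
for name, d in frames.items():
    print(name, d['a'].tolist())
```

Keeping the frames in a dictionary means later operations can be applied the same way, without listing each variable again.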

Sample:

import pandas as pd

df = pd.DataFrame({'a':[4,5,6]})
df1 = df * 10
df2 = df  + 10
df3 = df - 10
df4 = df / 10
dfs = [df1,df2,df3,df4]
print (dfs)
[    a
0  40
1  50
2  60,     a
0  14
1  15
2  16,    a
0 -6
1 -5
2 -4,      a
0  0.4
1  0.5
2  0.6]

newindex = [2,1,0]
df1, df2, df3, df4 = [d.reindex(newindex) for d in dfs]
print (df1)
print (df2)
print (df3)
print (df4)
    a
2  60
1  50
0  40
    a
2  16
1  15
0  14
   a
2 -4
1 -5
0 -6
     a
2  0.6
1  0.5
0  0.4

Or:

newindex = [2,1,0]
names = ['a','b','c','d']
dfs_new_dict = {name: d.reindex(newindex) for name, d in zip(names, dfs)}

print (dfs_new_dict['a'])
print (dfs_new_dict['b'])
print (dfs_new_dict['c'])
print (dfs_new_dict['d'])

Comments

nice one. Thank you. I'll accept once timer has expired
Note that you can assign back to individual variables via df1, df2, df3, df4 = dfs_new if needed.
@JoeHalliwell - Thank you.
Hmm, it's not actually working for some reason. No error, but all the DataFrames still have their original index...
didn't change the "newindex", from my example. Thanks guys

If you have a lot of large dataframes, you can use multiple threads. I suggest using the pathos module (can be installed using pip install pathos):

from pathos.multiprocessing import ThreadPool

# create a thread pool with the max number of threads
tPool = ThreadPool()

# apply the same function to each df
# the function applies to your list of dataframes
newDFs = tPool.map(lambda df: df.reindex(newindex), dfs)
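
If you would rather not install an extra package, a similar pattern can be sketched with the standard library's concurrent.futures. Note this swaps pathos for ThreadPoolExecutor; the sample frames and the helper reindex_one are hypothetical, and whether threads actually speed up reindexing is something to measure on your own data.

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

# Hypothetical sample data, for illustration only.
base = pd.DataFrame({'a': [4, 5, 6]})
dfs = [base * 10, base + 10, base - 10]
newindex = [2, 1, 0]

def reindex_one(d):
    """Return a reindexed copy of a single DataFrame."""
    return d.reindex(newindex)

with ThreadPoolExecutor() as pool:
    # map yields results in the same order as the inputs.
    new_dfs = list(pool.map(reindex_one, dfs))

df1, df2, df3 = new_dfs
print(df1['a'].tolist())   # [60, 50, 40]
```

Because the results come back in input order, they can be unpacked into the original variable names just like the list comprehension version.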
