How to combine multiple dataframes using for loop?

Question

I am trying to combine multiple columns so that each column starts at a specific index, after the previous one. For example, as you can see in the code below, I have 15 sets of data, from df20 to df90. In the code, I merge dataframe i, followed by another one starting from index = 1,000.

I want my output to be df20, followed by df25 starting at index=1000, then df30 starting at index=2000, then df35 starting at index=3000, and so on. I expected to see all 15 columns, but I only get one column in my output.

I have tried the code below, but it doesn't seem to work. Please help.

dframe = [df20, df25, df30, df35, df40, df45, df50, df55, df60, df65, df70, df75, df80, df85, df90]
for i in dframe:
  a = i.merge((i).set_index((i).index+1000), how='outer', left_index=True, right_index=True)

print(a)

Output:

                      df90_x              df90_y
0                     0.000757                      NaN
1                     0.001435                      NaN
2                     0.002011                      NaN
3                     0.002497                      NaN
4                     0.001723                      NaN
...                        ...                      ...
10995                      NaN             1.223000e-12
10996                      NaN             1.305000e-12
10997                      NaN             1.809000e-12
10998                      NaN             2.075000e-12
10999                      NaN             2.668000e-12

[11000 rows x 2 columns]

Expected Output:

                      df20                 df25                  df30
0                     0.000757             0                     0
1                     0.001435             0                     0
2                     0.002011             0                     0
3                     0.002497             0                     0
4                     0.001723             0                     0
...                  ...                   ...                   ...
1000                                      1.223000e-12           0
1001                                      1.305000e-12           0
1002                                      1.809000e-12           0
1003                                      2.668000e-12           0
...                                                              ...
2000                                                             0.1234
2001                                                             0.4567
2002                                                             0.8901
2003                                                             0.2345

That is doing what merge is expected to do. Try pd.concat(dframe, axis=1)
– ThePyGuy, Commented Aug 12, 2021 at 6:17
What would you like the output to be? Why is the output you got wrong? Explaining this would help answer a lot of questions I have about the format of your data.
– Marijn van Vliet, Commented Aug 12, 2021 at 6:22
@MarijnvanVliet I want my output to be df20, followed by df25 starting at index=1000, then df30 starting at index=2000, then df35 starting at index=3000.
– Kim Yejun, Commented Aug 12, 2021 at 6:34
@ThePyGuy I want my output to be df20, followed by df25 starting at index=1000, then df30 starting at index=2000, then df35 starting at index=3000.
– Kim Yejun, Commented Aug 12, 2021 at 6:35
If it's only the index you are concerned about, you can use pd.concat. Post a small sample from the dataframes, and also add the expected output for the sample data. Please take a look at "How to ask" and "How to make good pandas example".
– ThePyGuy, Commented Aug 12, 2021 at 6:37
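
A minimal sketch of why the loop in the question ends with a single frame: `a` is reassigned on every pass, so only the last iteration survives, and that iteration merges the last frame with a shifted copy of itself. The three-row frames and the shift of 3 are made-up stand-ins for the real data and the shift of 1000.

```python
import pandas as pd

# Hypothetical stand-ins for df20, df25, df30 (not the real data).
df20 = pd.DataFrame({"df20": [0.1, 0.2, 0.3]})
df25 = pd.DataFrame({"df25": [1.1, 1.2, 1.3]})
df30 = pd.DataFrame({"df30": [2.1, 2.2, 2.3]})
dframe = [df20, df25, df30]

for i in dframe:
    # Each pass merges one frame with a shifted copy of itself and
    # overwrites `a`, so the results of earlier passes are discarded.
    a = i.merge(i.set_index(i.index + 3), how="outer",
                left_index=True, right_index=True)

# Only the last frame is left, duplicated under the _x/_y suffixes.
print(list(a.columns))
print(a.shape)
```

This matches the df90_x / df90_y columns in the output above: nothing from df20 to df85 is ever kept.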

Mahdi F. · Accepted Answer · 2021-08-12 09:27:07Z

1

You can try this code if you want the number of dataframes (num_dataframe) and their length (len_dataframe) to be variables:

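import pandas as pd
import random

dframe = []
num_dataframe = 3
len_dataframe = 5

for i in range(num_dataframe):
    # Column i holds random integers; its index covers rows
    # i*len_dataframe .. (i+1)*len_dataframe - 1, so the frames never overlap.
    dframe.append(pd.DataFrame({i: [random.randrange(1, 50) for _ in range(len_dataframe)]},
                               index=range(i*len_dataframe, (i+1)*len_dataframe)))

# Place the frames side by side; rows missing from a frame become NaN.
result = pd.concat(dframe, axis=1)

# fillna returns a new frame, so keep the result.
result = result.fillna(0)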

For your question, where you want 20 dataframes of length 1000 each, you can try this:

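import numpy as np
import pandas as pd

dframe = []
num_dataframe = 20
len_dataframe = 1000

for i in range(num_dataframe):
    # Column i holds random floats in [0, 1); frame i occupies rows
    # i*len_dataframe .. (i+1)*len_dataframe - 1.
    dframe.append(pd.DataFrame({i: np.random.random(len_dataframe)},
                               index=range(i*len_dataframe, (i+1)*len_dataframe)))

# Place the frames side by side; rows missing from a frame become NaN.
result = pd.concat(dframe, axis=1)

# fillna returns a new frame, so keep the result.
result = result.fillna(0)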

As you mentioned in the comment, I edited the post and added this code:

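dframe = [df20, df25, df30, df35, df40, df45, df50, df55, df60, df65, df70, df75, df80, df85, df90]

# axis=0 stacks the frames vertically: rows are appended and columns are unioned,
# so each row has a value in one column only and NaN in the others.
result = pd.concat(dframe, axis=0)

# fillna returns a new frame, so keep the result.
result = result.fillna(0)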
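
One way to get the layout in the question's expected output, sketched under the assumption that each frame has a default 0..n-1 index and a single, uniquely named column: shift the index of the k-th frame by k * offset, then concatenate along axis=1. The small frames below and the names `offset` and `shifted` are illustrative and not part of the original post; with the real data, `offset` would be 1000.

```python
import pandas as pd

# Hypothetical stand-ins for df20, df25, df30 (not the real data).
df20 = pd.DataFrame({"df20": [0.1, 0.2, 0.3, 0.4]})
df25 = pd.DataFrame({"df25": [1.1, 1.2, 1.3, 1.4]})
df30 = pd.DataFrame({"df30": [2.1, 2.2, 2.3, 2.4]})
dframe = [df20, df25, df30]

offset = 2  # assumption for this toy example; 1000 in the question

# Frame k keeps its values but its index starts at k * offset.
shifted = [df.set_index(df.index + k * offset) for k, df in enumerate(dframe)]

# Align on the index, one column per frame; gaps become NaN, then 0.
result = pd.concat(shifted, axis=1).sort_index().fillna(0)
print(result)
```

Because the frames may overlap (here each has 4 rows but the offset is 2), several columns can hold real values on the same row; rows outside a frame's range show 0.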

edited Aug 12, 2021 at 9:27

answered Aug 12, 2021 at 7:53


6 Comments

Kim Yejun Over a year ago

So how exactly do I input my own dataframes? I see those are only random numbers. Can you enlighten me on this part, please?

Mahdi F. Over a year ago

@KimYejun, I edited the post and added code as you requested; maybe this helps you.

Kim Yejun Over a year ago

I have tried your code, but the data inside the dataframes was not shown. Everything is just zero :(

Mahdi F. Over a year ago

@KimYejun, I posted three code blocks. Which code block did you run?

Kim Yejun Over a year ago

Yes, I ran everything. The first code gives the output I want, but the values are random. The third code is supposed to give me the values of my dataframes from df20 to df90, but when I ran it, it only shows zeros, not the actual values in my dataframes :( I'm sorry, maybe I'm just really not good with coding.


Baron Legendre · Accepted Answer · 2021-09-02 17:34:19Z

1

Please refer to the official documentation.

Concat multiple dataframes

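import pandas as pd

# Three frames with different columns and non-overlapping indexes.
df1 = pd.DataFrame(
    {
        "A": ["A0", "A1", "A2", "A3"]
    },
    index=[0, 1, 2, 3]
)
df2 = pd.DataFrame(
    {
        "B": ["B4", "B5"]
    },
    index=[4, 5]
)
df3 = pd.DataFrame(
    {
        "C": ["C6", "C7", "C8", "C9", "C10"]
    },
    index=[6, 7, 8, 9, 10]
)
# axis=1 aligns on the index and puts the columns side by side.
result = pd.concat([df1, df2, df3], axis=1)
print(result)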

Output:

      A    B    C
0    A0  NaN  NaN
1    A1  NaN  NaN
2    A2  NaN  NaN
3    A3  NaN  NaN
4   NaN   B4  NaN
5   NaN   B5  NaN
6   NaN  NaN   C6
7   NaN  NaN   C7
8   NaN  NaN   C8
9   NaN  NaN   C9
10  NaN  NaN  C10

Import file into a list via looping

Method 1: put all the filenames into a list by hand, then read each file:

filenames = ['sample_20.csv', 'sample_25.csv', 'sample_30.csv', ...]
dataframes = [pd.read_csv(f) for f in filenames]

Method 1-1: if you have lots of files, generate the list of names instead of typing it:

# range excludes its stop value, so use 95 to include sample_90.csv
filenames = ['sample_{}.csv'.format(i) for i in range(20, 95, 5)]
dataframes = [pd.read_csv(f) for f in filenames]

Method 2: let glob collect every matching filename:

from glob import glob
# glob returns the matches in arbitrary order
filenames = glob('sample*.csv')
dataframes = [pd.read_csv(f) for f in filenames]
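
Since glob returns paths in arbitrary order and a plain `sorted` compares strings (so `sample_100.csv` would come before `sample_20.csv`), the order of the resulting list may need fixing when each frame is later tied to a position. A minimal sketch of sorting by the number in the name, assuming the `sample_<number>.csv` pattern from the examples above; `file_number` is a hypothetical helper, not a pandas or glob function.

```python
import os
import re


def file_number(path):
    """Return the first integer in the file's base name (hypothetical helper)."""
    match = re.search(r"\d+", os.path.basename(path))
    if match is None:
        raise ValueError(f"no number found in {path!r}")
    return int(match.group())


# Example names, in an order glob might plausibly return them.
filenames = ["sample_100.csv", "sample_25.csv", "sample_20.csv"]

# Sort numerically rather than alphabetically.
filenames = sorted(filenames, key=file_number)
print(filenames)
```

The sorted list can then be passed to the same `pd.read_csv` list comprehension as in the methods above.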

edited Sep 2, 2021 at 17:34

answered Aug 12, 2021 at 7:02


5 Comments

Kim Yejun Over a year ago

Thank you very much for this answer. I have actually tried something similar, but I have a lot of dataframes to concat, around a thousand or more, so I was trying to figure out how to do it with a for loop instead.

Baron Legendre Over a year ago

I updated the post to cover looping over a list of dataframes.

Baron Legendre Over a year ago

Perhaps you need to arrange the index of each dataframe this way first, making sure no indexes overlap, and then concat everything in one go.

Kim Yejun Over a year ago

Sorry, but can you enlighten me on the list looping?

Baron Legendre Over a year ago

I updated the post with how to loop files into a list. After this step, you might need to deal with the index of every single dataframe in the dataframes list.
