
I am having trouble combining a list of DataFrames in Python. I have an unknown number of DataFrames, which are collected in a for loop as follows:

appendDataFrames.append(df)

These DataFrames have 5 columns that are always the same: |static_1|static_2|static_3|static_4|static_5|. After those 5 columns there can be anywhere from 5 to 400 additional columns. I don't know the column set or the column names beforehand, but some column names can occur in more than one of the DataFrames.

Now I want to create one overall DataFrame that contains these 5 static columns followed by all the other columns. If a sub-DataFrame has no values for a column, that column should contain NaN in its rows.

For example:

>>> import pandas as pd
>>> pd.__version__
u'0.17.1'
>>> df1 = pd.DataFrame([[1, 2, 1, 2, 1, 8.59, 7.64], [1, 2, 1, 2, 1, 10.5, 17.64]],
...                    columns=['static_1', 'static_2', 'static_3', 'static_4', 'static_5',
...                    'c1', 'c2'])
>>> df1
   static_1  static_2  static_3  static_4  static_5     c1     c2
0         1         2         1         2         1   8.59   7.64
1         1         2         1         2         1  10.50  17.64
>>> df2 = pd.DataFrame([[3, 4, 3, 4, 3, 100.56, 1.58], [3, 4, 3, 4, 3, 0.50, 1.68]],
...                    columns=['static_1', 'static_2', 'static_3', 'static_4', 'static_5',
...                    'c1', 'c3'])
>>> df2
   static_1  static_2  static_3  static_4  static_5      c1    c3
0         3         4         3         4         3  100.56  1.58
1         3         4         3         4         3    0.50  1.68

Now I want to merge, concat, append, join, or whatever it takes to get a superset of all the combined DataFrames, resultDf, like this:

>>> resultDf
   static_1  static_2  static_3  static_4  static_5      c1     c2    c3
0         1         2         1         2         1    8.59   7.64   NaN
1         1         2         1         2         1   10.50  17.64   NaN
2         3         4         3         4         3  100.56    NaN  1.58
3         3         4         3         4         3    0.50    NaN  1.68

Thanks in advance!


2 Answers


For a complete answer, pay attention to the index of the final DataFrame. With ignore_index=False, each input keeps its original index:

pd.concat([df1,df2], ignore_index=False)

In the result, the index labels of each input are repeated (0, 1, 0, 1), which is the problem with the index column.

To resolve this issue, pass ignore_index=True:

pd.concat([df1,df2], ignore_index=True)

The final result has a fresh index running from 0 to 3.
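Because the screenshots are not available here, the following is a minimal sketch of the difference between the two calls, using the df1 and df2 from the question. The variable names kept and renumbered are only illustrative.

```python
import pandas as pd

static_cols = ['static_1', 'static_2', 'static_3', 'static_4', 'static_5']

# Same sample data as in the question.
df1 = pd.DataFrame([[1, 2, 1, 2, 1, 8.59, 7.64], [1, 2, 1, 2, 1, 10.5, 17.64]],
                   columns=static_cols + ['c1', 'c2'])
df2 = pd.DataFrame([[3, 4, 3, 4, 3, 100.56, 1.58], [3, 4, 3, 4, 3, 0.50, 1.68]],
                   columns=static_cols + ['c1', 'c3'])

# ignore_index=False keeps each frame's own row labels, so they repeat.
kept = pd.concat([df1, df2], ignore_index=False)

# ignore_index=True discards the old labels and numbers the rows from 0.
renumbered = pd.concat([df1, df2], ignore_index=True)

print(list(kept.index))        # [0, 1, 0, 1]
print(list(renumbered.index))  # [0, 1, 2, 3]
```

In both cases, columns that are missing from one input (c2 and c3 here) are filled with NaN for that input's rows.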


You just need to use pd.concat(), like this:

>>> dataframes = []
>>> dataframes.append(df1)
>>> dataframes.append(df2)
>>> dataframes
[   static_1  static_2  static_3  static_4  static_5     c1     c2
0         1         2         1         2         1   8.59   7.64
1         1         2         1         2         1  10.50  17.64,    static_1  static_2  static_3  static_4  static_5      c1    c3
0         3         4         3         4         3  100.56  1.58
1         3         4         3         4         3    0.50  1.68]
>>> dataframes[0]
   static_1  static_2  static_3  static_4  static_5     c1     c2
0         1         2         1         2         1   8.59   7.64
1         1         2         1         2         1  10.50  17.64
>>> dataframes[1]
   static_1  static_2  static_3  static_4  static_5      c1    c3
0         3         4         3         4         3  100.56  1.58
1         3         4         3         4         3    0.50  1.68
>>> result = pd.concat(dataframes, ignore_index=True)
>>> result
       c1     c2    c3  static_1  static_2  static_3  static_4  static_5
0    8.59   7.64   NaN         1         2         1         2         1
1   10.50  17.64   NaN         1         2         1         2         1
2  100.56    NaN  1.58         3         4         3         4         3
3    0.50    NaN  1.68         3         4         3         4         3
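Note that in the output above the columns come out in alphabetical order, so the static columns end up last, whereas the question asks for them first. Whether pd.concat sorts columns that differ between inputs depends on the pandas version, so one option is to reorder explicitly afterwards. The following is a sketch only; the helper name concat_static_first is hypothetical and not part of pandas.

```python
import pandas as pd

static_cols = ['static_1', 'static_2', 'static_3', 'static_4', 'static_5']


def concat_static_first(frames, static_cols):
    """Concatenate frames row-wise and put the static columns first.

    Hypothetical helper for illustration; assumes every frame contains
    all of the static columns.
    """
    result = pd.concat(frames, ignore_index=True)
    # Everything that is not a static column, in the order concat produced.
    other_cols = [c for c in result.columns if c not in static_cols]
    return result[static_cols + other_cols]


# Same sample data as in the question.
df1 = pd.DataFrame([[1, 2, 1, 2, 1, 8.59, 7.64], [1, 2, 1, 2, 1, 10.5, 17.64]],
                   columns=static_cols + ['c1', 'c2'])
df2 = pd.DataFrame([[3, 4, 3, 4, 3, 100.56, 1.58], [3, 4, 3, 4, 3, 0.50, 1.68]],
                   columns=static_cols + ['c1', 'c3'])

resultDf = concat_static_first([df1, df2], static_cols)
print(resultDf)
```

The same call works for the list built up in the loop, for example concat_static_first(dataframes, static_cols).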
