I have two dataframes that look like this:

df1=
   A   B   
1  A1  B1
2  A2  B2
3  A3  B3

df2 = 
   A   C
4  A4  C4
5  A5  C5

I would like to append df2 to df1, like so:

   A   B   
1  A1  B1
2  A2  B2
3  A3  B3
4  A4  NaN
5  A5  NaN

(Note: I've edited the dataframes so that not all the columns in df1 are necessarily in df2)

Whether I use concat or append, the resulting dataframe has a column called "C" with NaN in the first three rows. I just want to keep the two original columns of df1, with the new values appended. Is there a way to concatenate the dataframes without having to drop the extra column afterwards?

  • Can you edit to show what you want the final dataframe to look like, given the above example? I'm having a hard time visualizing it. Commented Jul 5, 2016 at 14:00
  • Sorry. I've edited the question. Commented Jul 5, 2016 at 14:02

2 Answers

You can first select the subset of df2's columns that you want to append:

print (df2[['A']])
    A
4  A4
5  A5

print (pd.concat([df1, df2[['A']]]))
    A    B
1  A1   B1
2  A2   B2
3  A3   B3
4  A4  NaN
5  A5  NaN

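The same result with DataFrame.append (deprecated in pandas 1.4 and removed in 2.0, so prefer pd.concat on current versions):

print (df1.append(df2[['A']]))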
    A    B
1  A1   B1
2  A2   B2
3  A3   B3
4  A4  NaN
5  A5  NaN

If df2 also has a column B, select both columns:

print (df2[['A','B']])
    A   B
4  A4  B4
5  A5  B5

print (pd.concat([df1, df2[['A','B']]]))
    A   B
1  A1  B1
2  A2  B2
3  A3  B3
4  A4  B4
5  A5  B5

Or:

print (df1.append(df2[['A','B']]))
    A   B
1  A1  B1
2  A2  B2
3  A3  B3
4  A4  B4
5  A5  B5
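
For reference, the column-subset approach above can be run end to end as in the sketch below. It assumes the frames from the question (df2 has columns A and C); the constructor calls are my reconstruction of those frames, not code from the original post.

```python
import pandas as pd

# Frames rebuilt from the question; index values 1..5 are copied from it.
df1 = pd.DataFrame({"A": ["A1", "A2", "A3"], "B": ["B1", "B2", "B3"]},
                   index=[1, 2, 3])
df2 = pd.DataFrame({"A": ["A4", "A5"], "C": ["C4", "C5"]},
                   index=[4, 5])

# Keep only the df2 columns that should survive, then concatenate.
# Column C never reaches the result, so nothing has to be dropped afterwards.
result = pd.concat([df1, df2[["A"]]])
print(result)
```

Column B is filled with NaN for the rows that come from df2, because df2[["A"]] has no such column.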

EDIT by comment:

If df1 and df2 do not have the same columns, select the columns they share with intersection:

print (df1)
    A   B  D
1  A1  B1  R
2  A2  B2  T
3  A3  B3  E

print (df2)
    A   B   C
4  A4  B4  C4
5  A5  B5  C5

print (df1.columns.intersection(df2.columns))
Index(['A', 'B'], dtype='object')

print (pd.concat([df1, df2[df1.columns.intersection(df2.columns)]]))
    A   B    D
1  A1  B1    R
2  A2  B2    T
3  A3  B3    E
4  A4  B4  NaN
5  A5  B5  NaN
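
The intersection step can be wrapped in a small helper, sketched below. The function name append_shared_columns is hypothetical (it is not part of pandas), and the frames are rebuilt from the printed output above.

```python
import pandas as pd


def append_shared_columns(base, extra):
    """Append the rows of `extra` to `base`, keeping only base's columns.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Columns present in both frames, in the order they appear in `base`.
    shared = base.columns.intersection(extra.columns)
    # Columns that exist only in `base` are filled with NaN for the new rows.
    return pd.concat([base, extra[shared]])


df1 = pd.DataFrame({"A": ["A1", "A2", "A3"],
                    "B": ["B1", "B2", "B3"],
                    "D": ["R", "T", "E"]}, index=[1, 2, 3])
df2 = pd.DataFrame({"A": ["A4", "A5"],
                    "B": ["B4", "B5"],
                    "C": ["C4", "C5"]}, index=[4, 5])

out = append_shared_columns(df1, df2)
print(out)
```

Because only the shared columns of df2 are passed to pd.concat, this works whether or not every column of df1 is present in df2.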
5 Comments

Thanks. That works for the original way I asked the question. What should I do if not all the columns of df1 are in df2?
Then you get NaN in the columns that do not match, as with df1.append(df2). Do you need to append in some other way?
The result you got looks perfectly good, but then you would have to do this for every column that's in df1 but not df2.
If all columns of df1 are in df2, use print (pd.concat([df1, df2[df1.columns]]))
If not, you can find the common columns with intersection: print (df1.columns.intersection(df2.columns))

Actually the solution is in an obscure corner of this page. Here's the code to use:

pd.concat([df1,df2],join_axes=[df1.columns])
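
Note that the join_axes argument was deprecated in pandas 0.25 and removed in 1.0, so the line above fails on current versions. A rough equivalent, assuming the frames from the question, is to concatenate first and then reindex the columns to those of df1:

```python
import pandas as pd

# Frames rebuilt from the question (my reconstruction, for illustration).
df1 = pd.DataFrame({"A": ["A1", "A2", "A3"], "B": ["B1", "B2", "B3"]},
                   index=[1, 2, 3])
df2 = pd.DataFrame({"A": ["A4", "A5"], "C": ["C4", "C5"]},
                   index=[4, 5])

# Concatenate all columns, then keep only df1's columns, in df1's order.
result = pd.concat([df1, df2]).reindex(columns=df1.columns)
print(result)
```

Unlike selecting the columns of df2 up front, this builds the intermediate frame with column C before discarding it, which may matter for very wide frames.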
