I have two pandas data-frames that I would like to merge together, but not in the way that I've seen in the examples I've been able to find. I have a set of "old" data and a set of "new" data that for two data frames that are equal in shape with the same column names. I do some analysis and determine that I need to create third dataset, taking some of the columns from the "old" data and some from the "new" data. As an example, lets say I have these two datasets:
df_old = pd.DataFrame(np.zeros([5,5]),columns=list('ABCDE'))
df_new = pd.DataFrame(np.ones([5,5]),columns=list('ABCDE'))
which are simply:
A B C D E
0 0.0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0 0.0
3 0.0 0.0 0.0 0.0 0.0
4 0.0 0.0 0.0 0.0 0.0
and
A B C D E
0 1.0 1.0 1.0 1.0 1.0
1 1.0 1.0 1.0 1.0 1.0
2 1.0 1.0 1.0 1.0 1.0
3 1.0 1.0 1.0 1.0 1.0
4 1.0 1.0 1.0 1.0 1.0
I do some analysis and find that I want to replace columns B and D. I can do that in a loop like this:
replace = dict(A=False,B=True,C=False,D=True,E=False)
df = pd.DataFrame({})
for k,v in sorted(replace.items()):
df[k] = df_new[k] if v else df_old[k]
This gives me the data that I want:
A B C D E
0 0.0 1.0 0.0 1.0 0.0
1 0.0 1.0 0.0 1.0 0.0
2 0.0 1.0 0.0 1.0 0.0
3 0.0 1.0 0.0 1.0 0.0
4 0.0 1.0 0.0 1.0 0.0
but, this honestly seems a bit clunky, and I'd imagine that there is a better way to use pandas to do this. Plus, I'd like to preserve the order of my columns which may not be in alphabetical order like this example dataset, so sorting the dictionary may not be the way to go, although I could probably pull the column names from the data set if need be.
Is there a better way to do this using some of Pandas merge functionality?