I have two Pandas Dataframe and I want to be able to create the Result column (yellow) by matching column1 values in first df with the appropriate values from the second df. Column1 values in df one are all references to columns names in the second df.
-
Try to pivot your first data frame to get same structure as the second data frame. Once you've achieved this, you can do a simple merge.pansen– pansen2017-03-07 20:45:43 +00:00Commented Mar 7, 2017 at 20:45
-
@pansen I thought about that but I want to keep the format of the first df, I'm now trying to pivot the second dfobabs– obabs2017-03-07 20:51:07 +00:00Commented Mar 7, 2017 at 20:51
-
You can create a temporary df for merging only while keeping your original df. However, reshaping your second df is totally fine as well. In this case, melt is your friend :-).pansen– pansen2017-03-07 20:54:55 +00:00Commented Mar 7, 2017 at 20:54
-
@pansen amazing melt is exactly what I was looking for thanksobabs– obabs2017-03-07 20:59:39 +00:00Commented Mar 7, 2017 at 20:59
Add a comment
|
2 Answers
As mentioned in the comments, melt and merge are a good approach here:
import pandas as pd
df1 = pd.DataFrame({'Date': ['12/30/2016', '12/30/2016', '1/31/2017', '1/31/2017'],
'Column1': ['APB', 'UPB', 'APB', 'UPB']})
df2 = pd.DataFrame({'Date': ['12/30/2016', '1/31/2017', '2/28/2017', '3/31/2017'],
'APB': [117, 112.8, 112.37, 112.23],
'UPB': [67.925, 67.865, 66.717, 66.939]})
melted = pd.melt(df2, id_vars="Date", var_name="Column1", value_name="Result")
merged = df1.merge(melted, on=["Date", "Column1"])
print(merged)
Column1 Date Result
0 APB 12/30/2016 117.000
1 UPB 12/30/2016 67.925
2 APB 1/31/2017 112.800
3 UPB 1/31/2017 67.865
Comments
You can use melt and set_index on your df2
df1 = pd.DataFrame({'Date': ['12/30/2016', '12/30/2016', '1/31/2017', '1/31/2017'], 'col1': ['APB', 'UPB', 'APB', 'UPB']})
df2 = pd.DataFrame({'Date': ['12/30/2016', '1/31/2017', '2/28/2017', '3/31/2017'], 'APB': [117, 112.8, 112.37, 112.23], 'UPB': [67.9, 67.8, 66.7, 66.9]})
df2 = pd.melt(df2, id_vars='Date', value_vars=['APB', 'UPB'])
df2['Date'] = pd.to_datetime(df2['Date'])
df2.sort_values(by = 'Date').set_index('Date')
This gives you
variable value
Date
2016-12-30 APB 117.00
2016-12-30 UPB 67.90
2017-01-31 APB 112.80
2017-01-31 UPB 67.80
2017-02-28 APB 112.37
Now you can merge the two dataframes,
df1 = df1.merge(df2, left_on = 'col1', right_on = 'variable').drop_duplicates().drop('variable', axis = 1).sort_values(by = 'Date')
That gives you
col1 Date value
0 APB 2016-12-30 117.00
8 UPB 2016-12-30 67.90
1 APB 2017-01-31 112.80
9 UPB 2017-01-31 67.80
2 APB 2017-02-28 112.37
10 UPB 2017-02-28 66.70
3 APB 2017-03-31 112.23
11 UPB 2017-03-31 66.90

