0

I have two Pandas Dataframe and I want to be able to create the Result column (yellow) by matching column1 values in first df with the appropriate values from the second df. Column1 values in df one are all references to columns names in the second df.

enter image description here

enter image description here

4
  • Try to pivot your first data frame to get same structure as the second data frame. Once you've achieved this, you can do a simple merge. Commented Mar 7, 2017 at 20:45
  • @pansen I thought about that but I want to keep the format of the first df, I'm now trying to pivot the second df Commented Mar 7, 2017 at 20:51
  • You can create a temporary df for merging only while keeping your original df. However, reshaping your second df is totally fine as well. In this case, melt is your friend :-). Commented Mar 7, 2017 at 20:54
  • @pansen amazing melt is exactly what I was looking for thanks Commented Mar 7, 2017 at 20:59

2 Answers 2

1

As mentioned in the comments, melt and merge are a good approach here:

import pandas as pd

df1 = pd.DataFrame({'Date': ['12/30/2016', '12/30/2016', '1/31/2017', '1/31/2017'], 
                    'Column1': ['APB', 'UPB', 'APB', 'UPB']})

df2 = pd.DataFrame({'Date': ['12/30/2016', '1/31/2017', '2/28/2017', '3/31/2017'], 
                   'APB': [117, 112.8, 112.37, 112.23], 
                   'UPB': [67.925, 67.865, 66.717, 66.939]})

melted = pd.melt(df2, id_vars="Date", var_name="Column1", value_name="Result")
merged = df1.merge(melted, on=["Date", "Column1"])

print(merged)

  Column1        Date   Result
0     APB  12/30/2016  117.000
1     UPB  12/30/2016   67.925
2     APB   1/31/2017  112.800
3     UPB   1/31/2017   67.865
Sign up to request clarification or add additional context in comments.

Comments

1

You can use melt and set_index on your df2

df1 = pd.DataFrame({'Date': ['12/30/2016', '12/30/2016', '1/31/2017', '1/31/2017'], 'col1': ['APB', 'UPB', 'APB', 'UPB']})
df2 = pd.DataFrame({'Date': ['12/30/2016', '1/31/2017', '2/28/2017', '3/31/2017'], 'APB': [117, 112.8, 112.37, 112.23], 'UPB': [67.9, 67.8, 66.7, 66.9]})



df2 = pd.melt(df2, id_vars='Date', value_vars=['APB', 'UPB'])
df2['Date'] = pd.to_datetime(df2['Date'])
df2.sort_values(by = 'Date').set_index('Date')

This gives you

    variable    value
Date        
2016-12-30  APB 117.00
2016-12-30  UPB 67.90
2017-01-31  APB 112.80
2017-01-31  UPB 67.80
2017-02-28  APB 112.37

Now you can merge the two dataframes,

df1 = df1.merge(df2, left_on = 'col1', right_on = 'variable').drop_duplicates().drop('variable', axis = 1).sort_values(by = 'Date')

That gives you

    col1    Date    value
0   APB 2016-12-30  117.00
8   UPB 2016-12-30  67.90
1   APB 2017-01-31  112.80
9   UPB 2017-01-31  67.80
2   APB 2017-02-28  112.37
10  UPB 2017-02-28  66.70
3   APB 2017-03-31  112.23
11  UPB 2017-03-31  66.90

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.