0

I have a dataframe df_1 like below :

|  A       |  B        |
+----------+-----------+
|A         |  120.0    |
|D         |  2.50     |
|N         |  1.00     |
|N         |  0.50     |
|D         |  1.50     |
|A         |  240.0    |
+----------+-----------+

And I have another dataframe df_2 like below :

|  J        |  K       |  L        | M         |
+-----------+----------+-----------+-----------+
|  0.50     |   4.1    |   70.0    |  55.0     |
|  0.75     |   6.7    |   80.0    |  66.0     |
|  1.00     |   8.2    |   90.0    |  95.0     |
|  1.50     |   9.6    |   120.0   |  186.0    |
|  2.50     |   10.7   |   240.0   |  188.0    |
+-----------+----------+-----------+-----------+

The goal is to create a new column called Z in the df_1 where in we lookup the values of column B (df_1) in df_2 based on the condition in column A (df_1).For example, in column Z as show below , the value of A is 186.0 which is obtained by looking up the value 120.0 (in df_1) in the columns L and M in df_2.Similarly, the values for D,N are obtained by looking in the columns J and K respectively. The final dataframe looks like below

|  A       |  B        |  Z      |
+----------+-----------+----------
|A         |  120.0    | 186.0   |
|D         |  2.50     | 10.7    |
|N         |  1.00     | 8.2     |
|N         |  0.50     | 4.1     |
|D         |  1.50     | 9.6     |
|A         |  240.0    | 188.0   |
+----------+-----------+---------+

So, how can the above be achieved in Pandas similar to how we can achieve in Excel vlookup ?

EDIT : What if the df_2 looked like below ? Can we still get the same results as above ?

|  J        |  K       |  L        | M         |
+-----------+----------+-----------+-----------+
|  0.50     |   4.1    |   70.0    |  55.0     |
|  0.75     |   6.7    |   80.0    |  66.0     |
|  1.00     |   8.2    |   90.0    |  95.0     |
|  1.50     |   9.6    |   120.0   |  186.0    |
|  2.50     |   10.7   |   240.0   |  188.0    |
|  3.50     |   2.2    |   NaN     |  NaN      |
+-----------+----------+-----------+-----------+
3
  • 1
    In your final_dataframe - Why the values in column B is different from the original? Commented Jun 28, 2021 at 17:22
  • I have corrected it. It's a typo Commented Jun 28, 2021 at 17:24
  • It's corrected now Commented Jun 28, 2021 at 17:29

1 Answer 1

1

IIUC, you can use:

df1['c'] = df1.B.map(dict(df2.values.reshape(-1,2)))
OUTPUT:
            A      B      c
1  A           120.0  186.0
2  D             2.5   10.7
3  N             1.0    8.2
4  N             0.5    4.1
5  D             1.5    9.6
6  A           240.0  188.0

Complete example:

df1 = pd.DataFrame({'A': {1: 'A', 2: 'D', 3: 'N', 4: 'N', 5: 'D', 6: 'A'},
                    'B': {1: 120.0, 2: 2.5, 3: 1.0, 4: 0.5, 5: 1.5, 6: 240.0}})

df2 = pd.DataFrame({'J': {0: 0.5, 1: 0.75, 2: 1.0, 3: 1.5, 4: 2.5},
                    'K': {0: 4.1, 1: 6.7, 2: 8.2, 3: 9.6, 4: 10.7},
                    'L': {0: 70.0, 1: 80.0, 2: 90.0, 3: 120.0, 4: 240.0},
                    'M': {0: 55.0, 1: 66.0, 2: 95.0, 3: 186.0, 4: 188.0}})

df1['c'] = df1.B.map(dict(df2.values.reshape(-1, 2)))
Sign up to request clarification or add additional context in comments.

1 Comment

Hey, it worked - what if the df2 looked like below and if we were to apply the same operation ? i.e if we had NaN values in last row of L and M and we had values in J and K | J | K | L | M | +--------+-------+--------+--------+ | 0.50 | 4.1 | 70.0 | 55.0 | | 0.75 | 6.7 | 80.0 | 66.0 | | 1.00 | 8.2 | 90.0 | 95.0 | | 1.50 | 9.6 | 120.0 | 186.0 | | 2.50 | 10.7 | 240.0 | 188.0 | | 3.50 | 9.9 | NaN | NaN | +--------+-------+--------+--------+

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.