I have a dataframe like this:

        id1     name    id2   val 
0       1        'A'     1     4
1       1        'B'     1     1
2       2        'C'     3     1
. 
.
.

I have another dataframe that is as follows:

              new_val 
  1              2 
  3              4 

I want to make the first dataframe as follows:

        id1     name    id2   val 
0       1        'A'     1     2.0
1       1        'B'     1     0.5
2       2        'C'     3     0.25
. 
.
.

What I want to do is divide the val column in the first dataframe by the value in the second dataframe whose index matches id2. For example, where id2 = 1 we divide val = 4 by 2, since 2 is the new_val at index 1; where id2 = 3 we divide val = 1 by 4 to get 0.25.

I know I could collect these into lists of tuples, perform the computation, and reassign the column, but is this possible with pandas functions? Looping over a really large dataset would be computationally expensive.

3 Answers

3

Hmm, this way might be less space efficient, but it should be faster than looping:

>>> df1
   id1 name  id2  val
0    1  'A'    1    4
1    1  'B'    1    1
2    2  'C'    3    1
>>> df2 = pd.DataFrame([2,4], index=[1,3])
>>> df2
   0
1  2
3  4

So, start by setting an index:

>>> df1.set_index('id2', inplace=True)

Then, using df2 which I assume is indexed properly:

>>> df1['divisor'] = df2
>>> df1
     id1 name  val  divisor
id2
1      1  'A'    4        2
1      1  'B'    1        2
3      2  'C'    1        4
>>> df1.val / df1.divisor
id2
1    2.00
1    0.50
3    0.25
dtype: float64

And finally, just to be complete:

>>> df1['val'] = df1.val / df1.divisor
>>> df1
     id1 name   val  divisor
id2
1      1  'A'  2.00        2
1      1  'B'  0.50        2
3      2  'C'  0.25        4
>>> df1.drop('divisor',inplace=True, axis=1)
>>> df1
     id1 name   val
id2
1      1  'A'  2.00
1      1  'B'  0.50
3      2  'C'  0.25
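
Note that the steps above leave id2 as the index, whereas the question shows it as a regular column. Here is a minimal sketch of the same idea that puts id2 back afterwards. The names `indexed` and `result` are mine, and I am assuming df2 has a single unnamed column `0` as constructed above:

```python
import pandas as pd

# Frames rebuilt from the session above.
df1 = pd.DataFrame({
    "id1": [1, 1, 2],
    "name": ["A", "B", "C"],
    "id2": [1, 1, 3],
    "val": [4, 1, 1],
})
df2 = pd.DataFrame([2, 4], index=[1, 3])  # single unnamed column 0

# Work on an indexed copy so df1 itself is left untouched.
indexed = df1.set_index("id2")

# Select column 0 as a Series so it aligns on the index labels.
# df2's index is unique, so the repeated id2 labels in df1 are fine.
indexed["divisor"] = df2[0]
indexed["val"] = indexed["val"] / indexed["divisor"]

# Drop the helper column, turn id2 back into a column, restore column order.
result = indexed.drop(columns="divisor").reset_index()[["id1", "name", "id2", "val"]]
print(result)
```

Working on a copy instead of using `inplace=True` costs some memory, but it keeps the original frame available if you need to compare afterwards.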

1 Comment

Thanks this works much better than what I originally did
3

Using map and /=

df1.val /= df1.id2.map(df2.new_val)
print(df1)

   id1 name  id2   val
0    1  'A'    1  2.00
1    1  'B'    1  0.50
2    2  'C'    3  0.25
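
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the second frame has a single column named `new_val` and is indexed by the id2 values, as in the question. The intermediate name `divisor` is mine:

```python
import pandas as pd

# Frames rebuilt from the question; df2 is indexed by the id2 values.
df1 = pd.DataFrame({
    "id1": [1, 1, 2],
    "name": ["A", "B", "C"],
    "id2": [1, 1, 3],
    "val": [4, 1, 1],
})
df2 = pd.DataFrame({"new_val": [2, 4]}, index=[1, 3])

# map() looks up each id2 value in df2's index and returns the matching
# new_val, row by row. An id2 that is missing from df2 would give NaN.
divisor = df1["id2"].map(df2["new_val"])

# Assign the quotient back; the column becomes float.
df1["val"] = df1["val"] / divisor
print(df1)
```

Because `map` returns a Series with the same index as df1, no index manipulation or helper column is needed.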


2

There are a number of ways you can do this. You can first tack on the 'new_val' column from the second DataFrame to the first and then manipulate the columns from there.

df_final = df.join(df2, on='id2')

Which produces:

   id1 name  id2  val  new_val
0    1  'A'    1    4        2
1    1  'B'    1    1        2
2    2  'C'    3    1        4

And then operate on the columns

df_final['val'] = df_final['val'] / df_final['new_val']
df_final.drop('new_val', axis=1, inplace=True)

   id1 name  id2   val
0    1  'A'    1  2.00
1    1  'B'    1  0.50
2    2  'C'    3  0.25
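
One thing worth knowing about the join approach is that it is a left join, so an id2 with no entry in df2 ends up as NaN rather than raising. Here is a rough sketch of that. The fourth row (id2 = 5) is made up for illustration and is not in the question's data, and `unmatched` is my own name:

```python
import pandas as pd

df = pd.DataFrame({
    "id1": [1, 1, 2, 2],
    "name": ["A", "B", "C", "D"],
    "id2": [1, 1, 3, 5],   # id2 = 5 is a hypothetical row with no divisor in df2
    "val": [4, 1, 1, 6],
})
df2 = pd.DataFrame({"new_val": [2, 4]}, index=[1, 3])

# Left join: df's id2 column is matched against df2's index.
df_final = df.join(df2, on="id2")

# Rows whose id2 has no match get NaN in new_val, and so NaN after dividing.
unmatched = df_final["new_val"].isna()

df_final["val"] = df_final["val"] / df_final["new_val"]
df_final = df_final.drop(columns="new_val")
print(df_final)
print("rows without a divisor:", int(unmatched.sum()))
```

Checking `unmatched` before dividing is a cheap way to catch ids that are missing from the lookup frame.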

And some one-liners:

df.assign(val=lambda x: (x.set_index('id2')['val'] / df2['new_val']).values)

df.set_index('id2', drop=False).assign(val=lambda x: x['val'] / df2['new_val']).reset_index(drop=True)
