4

I have a dataframe A that looks like this

bucket  value   
1       0.001855    
1       0.000120    
2       0.000042    
2       0.001888    

and a dataframe B that looks like this

bucket  num 
1       .5  
2       .3

I want to create a column in A that holds each value divided by the num in B with the matching bucket. How do I do this?

3 Answers

3

UPDATE: answers the following question from the comment:

What if A is a multiindex? With ['bucket1','bucket2'] as index but we only care for bucket1?

In [140]: A
Out[140]:
                    value
bucket1 bucket2
1       10       0.001855
        11       0.000120
2       12       0.000042
        13       0.001888

In [141]: B
Out[141]:
   bucket  num
0       1  0.5
1       2  0.3

In [142]: A['new'] = A.value / A.reset_index().iloc[:, 0].map(B.set_index('bucket').num).values

In [143]: A
Out[143]:
                    value       new
bucket1 bucket2
1       10       0.001855  0.003710
        11       0.000120  0.000240
2       12       0.000042  0.000140
        13       0.001888  0.006293
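As a variation on the line above, the index level can be pulled out by name instead of by position after reset_index(). This is only a sketch: the frame construction is my reconstruction of the sample shown, and `lookup` and `divisor` are names I made up for illustration.

```python
import pandas as pd

# Reconstruction of the MultiIndex sample shown above (assumed, not the asker's real data).
idx = pd.MultiIndex.from_tuples(
    [(1, 10), (1, 11), (2, 12), (2, 13)], names=["bucket1", "bucket2"]
)
A = pd.DataFrame({"value": [0.001855, 0.000120, 0.000042, 0.001888]}, index=idx)
B = pd.DataFrame({"bucket": [1, 2], "num": [0.5, 0.3]})

# Series indexed by bucket, used as a lookup table bucket -> num.
lookup = B.set_index("bucket")["num"]

# Take the bucket1 level by name and map it through the lookup.
# The result is an Index with one entry per row of A, in row order.
divisor = A.index.get_level_values("bucket1").map(lookup)

# Divide positionally so no index alignment is attempted.
A["new"] = A["value"].to_numpy() / divisor.to_numpy()
```

Selecting the level by name keeps working if the order of the index levels changes, which the positional iloc[:, 0] version does not.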

OLD answer:

You can use the Series.map() method:

In [61]: A['new'] = A.value.div(A.bucket.map(B.set_index('bucket').num))

In [62]: A
Out[62]:
   bucket     value       new
0       1  0.001855  0.003710
1       1  0.000120  0.000240
2       2  0.000042  0.000140
3       2  0.001888  0.006293

or, without modifying A, as a column on the new DataFrame that assign() returns:

In [60]: A.assign(new=A.value/A.bucket.map(B.set_index('bucket').num))
Out[60]:
   bucket     value       new
0       1  0.001855  0.003710
1       1  0.000120  0.000240
2       2  0.000042  0.000140
3       2  0.001888  0.006293

Explanation:

In [65]: B.set_index('bucket')
Out[65]:
        num
bucket
1       0.5
2       0.3

In [66]: A.bucket.map(B.set_index('bucket').num)
Out[66]:
0    0.5
1    0.5
2    0.3
3    0.3
Name: bucket, dtype: float64
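
For anyone who wants to run the explanation end to end, here is a minimal self-contained sketch. The frames are rebuilt from the sample data in the question, and `lookup` and `divisor` are just illustrative names. It assumes each bucket appears only once in B.

```python
import pandas as pd

# Sample frames copied from the question.
A = pd.DataFrame({
    "bucket": [1, 1, 2, 2],
    "value": [0.001855, 0.000120, 0.000042, 0.001888],
})
B = pd.DataFrame({"bucket": [1, 2], "num": [0.5, 0.3]})

# Series indexed by bucket: works as a lookup table bucket -> num.
lookup = B.set_index("bucket")["num"]

# map() swaps each bucket in A for its num and keeps A's index.
# Buckets that are missing from B would come back as NaN.
divisor = A["bucket"].map(lookup)

# Both Series share A's index, so the division lines up row by row.
A["new"] = A["value"] / divisor
```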

3 Comments

What if A is a multiindex? With ['bucket1','bucket2'] as index but we only care for bucket1? I am getting No axis named 2 for object type <class 'pandas.core.frame.DataFrame'> as an error
@SuperString, can you post reproducible sample data sets and the expected data set? Please read how to make good reproducible pandas examples
@SuperString in all fairness, that is the sort of thing you bring up in the question.
1

I'm really just admiring @MaxU's answer and wanted to contribute something.
Here is a NumPy answer:

A.value /= B.num.values.dot(B.bucket.values[:, None] == A.bucket.values)

A

   bucket     value
0       1  0.003710
1       1  0.000240
2       2  0.000140
3       2  0.006293
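
To unpack what the one-liner does, here it is split into steps. This is a sketch on the question's sample data; `mask`, `divisor` and `result` are names I introduced, and it leaves A untouched instead of dividing in place.

```python
import numpy as np
import pandas as pd

# Sample frames copied from the question.
A = pd.DataFrame({
    "bucket": [1, 1, 2, 2],
    "value": [0.001855, 0.000120, 0.000042, 0.001888],
})
B = pd.DataFrame({"bucket": [1, 2], "num": [0.5, 0.3]})

# Broadcasting a column vector against a row vector gives a boolean matrix
# of shape (len(B), len(A)): mask[i, j] is True when B's i-th bucket
# equals A's j-th bucket.
mask = B["bucket"].to_numpy()[:, None] == A["bucket"].to_numpy()

# Dotting num with the mask sums, for each row of A, the num values of the
# matching rows of B. With unique buckets in B that is exactly one num.
divisor = B["num"].to_numpy().dot(mask)

result = A["value"].to_numpy() / divisor
```

The mask holds len(A) * len(B) entries, so memory grows with the product of the two lengths. That is fine for a handful of buckets but worth keeping in mind for large B.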

0

Probably not as efficient, but you could also use merge to attach the matching num from dfB to each row of dfA and then divide element-wise. The division aligns on index labels and the merged frame gets a fresh 0..n-1 index, so this assumes dfA has the default integer index.

dfA['new'] = dfA['value'] / pd.merge(dfA, dfB, on='bucket')['num']

dfA
   bucket     value       new
0       1  0.001855  0.003710
1       1  0.000120  0.000240
2       2  0.000042  0.000140
3       2  0.001888  0.006293
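
If dfA might not have the default index, or some buckets might be missing from dfB, a left merge with a positional assignment is a bit safer. A minimal sketch on the question's sample data, where `merged` is just an illustrative name:

```python
import pandas as pd

# Sample frames copied from the question.
dfA = pd.DataFrame({
    "bucket": [1, 1, 2, 2],
    "value": [0.001855, 0.000120, 0.000042, 0.001888],
})
dfB = pd.DataFrame({"bucket": [1, 2], "num": [0.5, 0.3]})

# how="left" keeps every row of dfA in its original order.
# validate="many_to_one" raises if a bucket appears more than once in dfB,
# which would otherwise duplicate rows of dfA.
merged = dfA.merge(dfB, on="bucket", how="left", validate="many_to_one")

# merged is re-indexed from 0, so assign by position rather than by label.
dfA["new"] = (merged["value"] / merged["num"]).to_numpy()
```

Rows of dfA whose bucket has no match in dfB end up with NaN in new instead of being dropped.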
