I have a table in a pandas DataFrame, df:

id   product_1 count
1        100     10
2        200     20
3        100     30
4        400     40
5        500     50
6        200     60
7        100     70

I also have another table in a DataFrame, df2:

product    score
100         5
200         10
300         15
400         20
500         25
600         30
700         35

I have to create a new column, score, in my first DataFrame (df), taking the values of score from df2 by matching df2's product against product_1.

My final output should be:

id   product_1 count  score
1        100     10     5
2        200     20     10
3        100     30     5
4        400     40     20
5        500     50     25
6        200     60     10
7        100     70     5

Any ideas how to achieve it?

1 Answer

Use map:

df['score'] = df['product_1'].map(df2.set_index('product')['score'].to_dict())
print (df)
   id  product_1  count  score
0   1        100     10      5
1   2        200     20     10
2   3        100     30      5
3   4        400     40     20
4   5        500     50     25
5   6        200     60     10
6   7        100     70      5
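
As a minimal, self-contained sketch of the map approach: the sample frames below are rebuilt from the question, and the names lookup and unmatched (plus the product value 999) are only illustrative. It assumes the product values in df2 are unique. map also accepts a Series directly, using its index as the lookup key, so the to_dict() step is optional. Values with no match in df2 come back as NaN.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "product_1": [100, 200, 100, 400, 500, 200, 100],
    "count": [10, 20, 30, 40, 50, 60, 70],
})
df2 = pd.DataFrame({
    "product": [100, 200, 300, 400, 500, 600, 700],
    "score": [5, 10, 15, 20, 25, 30, 35],
})

# Series indexed by product; map looks each product_1 value up in this index
lookup = df2.set_index("product")["score"]
df["score"] = df["product_1"].map(lookup)

# Hypothetical product 999 does not exist in df2, so it maps to NaN
unmatched = pd.Series([100, 999]).map(lookup)

print(df)
print(unmatched)
```

If some products might be missing from df2, the resulting score column becomes float because of the NaN values, so it may need a fillna before any integer conversion.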

Or merge (note that this also brings in the product column from df2):

df = pd.merge(df,df2, left_on='product_1', right_on='product', how='left')
print (df)
   id  product_1  count  product  score
0   1        100     10      100      5
1   2        200     20      200     10
2   3        100     30      100      5
3   4        400     40      400     20
4   5        500     50      500     25
5   6        200     60      200     10
6   7        100     70      100      5
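
The merge result above keeps the product column from df2. A possible way to get exactly the layout asked for in the question is to drop that column after merging. This is only a sketch on the sample data from the question; the name merged is illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "product_1": [100, 200, 100, 400, 500, 200, 100],
    "count": [10, 20, 30, 40, 50, 60, 70],
})
df2 = pd.DataFrame({
    "product": [100, 200, 300, 400, 500, 600, 700],
    "score": [5, 10, 15, 20, 25, 30, 35],
})

# Left join keeps every row of df; the redundant key column is dropped afterwards
merged = (
    df.merge(df2, left_on="product_1", right_on="product", how="left")
      .drop(columns="product")
)

print(merged)
```

With how='left', rows of df whose product_1 has no match in df2 are kept and get NaN in score.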

EDIT, in response to a comment asking for final_score = 0.6*count/id + 0.4*score:

df['score'] = df['product_1'].map(df2.set_index('product')['score'].to_dict())
df['final_score'] = (df['count'].mul(0.6).div(df.id)).add(df.score.mul(0.4))
print (df)
   id  product_1  count  score  final_score
0   1        100     10      5          8.0
1   2        200     20     10         10.0
2   3        100     30      5          8.0
3   4        400     40     20         14.0
4   5        500     50     25         16.0
5   6        200     60     10         10.0
6   7        100     70      5          8.0
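
The same formula can also be written with plain arithmetic operators, which follow the usual precedence rules. The sketch below uses the sample data from the question with the score column already filled in; the column name wrong_score is hypothetical and only shows what happens when a scalar is placed in front of a method chain: the chain is evaluated first, so the 0.6 ends up scaling the whole sum.

```python
import pandas as pd

# Sample data from the question, with the score column already mapped in
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "product_1": [100, 200, 100, 400, 500, 200, 100],
    "count": [10, 20, 30, 40, 50, 60, 70],
    "score": [5, 10, 5, 20, 25, 10, 5],
})

# 0.6*count/id + 0.4*score, written with operators
df["final_score"] = 0.6 * df["count"] / df["id"] + 0.4 * df["score"]

# Pitfall: .div(...).add(...) runs before the leading multiplication,
# so this computes 0.6 * (count/id + 0.4*score) instead
df["wrong_score"] = 0.6 * df["count"].div(df["id"]).add(0.4 * df["score"])

print(df)
```

For the first row (id = 1, count = 10, score = 5) the operator version gives 0.6*10/1 + 0.4*5 = 8, while the chained version gives 0.6*(10 + 2) = 7.2.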

9 Comments

It's working. When handling a large dataset, which of the two methods, map or merge, would take less time?
Also, if I want to create one more column, 'final_score', i.e. (0.6*count/id + 0.4*score), how do I do that?
Try df['final_score'] = 0.6*df['count'].div(df.id).add(df.score.mul(0.4))
For id = 1, count = 10, score = 5, it should be (0.6*10/1 + 0.4*5) = 8, but in your final score it is 7.2.
Thank you for accepting! And for verifying solution.