1

I have code as follows:

GU_ES3['Qnd'] = GU_ES3['Qnd,hw_m2'] * GU_ES3['UPOR']

I found that some values in GU_ES3['UPOR'] are blank. In that case I'd like to use another column instead, e.g. GU_ES3['NPOR'], but only for rows where the value in GU_ES3['UPOR'] is 0 or NaN.

Can you help me?

  • Is this pandas? Or just a dictionary? Commented May 14, 2018 at 19:07
  • Pandas. I'll edit the original question. Commented May 14, 2018 at 19:10

2 Answers

3

You want two things:

  1. replace values in one column with values from another column under a certain condition
  2. treat 0 and NaN the same way

For (2), you can replace 0s with NaNs, and for (1) you can use pd.Series.fillna to fill the NaNs in UPOR with the corresponding values from NPOR. Only the positions that are NaN are changed.

import numpy as np

i = GU_ES3['Qnd,hw_m2']
# Treat 0 like NaN, then fill every NaN in UPOR from NPOR
j = GU_ES3['UPOR'].replace(0, np.nan).fillna(GU_ES3['NPOR'])
GU_ES3['Qnd'] = i * j
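To see the replace-then-fillna chain end to end, here is a minimal sketch on a made-up frame. Only the column names come from the question; the numbers are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
GU_ES3 = pd.DataFrame({
    'Qnd,hw_m2': [10.0, 20.0, 30.0, 40.0],
    'UPOR': [2.0, 0.0, np.nan, 5.0],   # row 1 is 0, row 2 is NaN
    'NPOR': [7.0, 3.0, 4.0, 9.0],
})

# Step 1: turn 0 into NaN so both "missing" cases look the same.
# Step 2: fill those NaNs from NPOR (aligned on the index).
factor = GU_ES3['UPOR'].replace(0, np.nan).fillna(GU_ES3['NPOR'])

GU_ES3['Qnd'] = GU_ES3['Qnd,hw_m2'] * factor
print(GU_ES3)
```

Rows 0 and 3 keep their UPOR value, while rows 1 and 2 fall back to NPOR.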

Alternatively, you may use np.where to perform replacement:

GU_ES3['Qnd'] = GU_ES3['Qnd,hw_m2'] * np.where(
    GU_ES3['UPOR'].replace({0 : np.nan}).isna(), GU_ES3['NPOR'], GU_ES3['UPOR']
)
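For comparison, the same np.where idea can be written with an explicit two-condition mask instead of replace. This is only a sketch on hypothetical data, not a claim about which form is faster.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
GU_ES3 = pd.DataFrame({
    'Qnd,hw_m2': [1.5, 2.0, 4.0],
    'UPOR': [0.0, np.nan, 3.0],
    'NPOR': [2.0, 5.0, 8.0],
})

# True where UPOR is unusable, i.e. NaN or exactly 0.
mask = GU_ES3['UPOR'].isna() | GU_ES3['UPOR'].eq(0)

# np.where picks NPOR where the mask is True and UPOR elsewhere.
# It returns a plain NumPy array, so the selection is positional.
factor = np.where(mask, GU_ES3['NPOR'], GU_ES3['UPOR'])

GU_ES3['Qnd'] = GU_ES3['Qnd,hw_m2'] * factor
print(GU_ES3)
```

Because np.where works on positions rather than index labels, this form assumes both columns come from the same DataFrame, as they do here.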

Note that with replace, if, for example, you want to also replace 1, 2, or 3, you would simply need to use .replace(dict.fromkeys([1, 2, 3], np.nan)) in your code.
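A small sketch of that generalisation, using hypothetical Series named upor and npor and treating 1, 2 and 3 as the values to discard. An isin-based mask is shown next to it as an assumed alternative that gives the same result here.

```python
import numpy as np
import pandas as pd

# Hypothetical values; 1, 2 and 3 stand in for codes that should count as missing.
upor = pd.Series([1.0, 2.0, 3.0, 4.0, np.nan])
npor = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

sentinels = [1, 2, 3]

# dict.fromkeys builds {1: nan, 2: nan, 3: nan}, so one replace call covers them all.
via_replace = upor.replace(dict.fromkeys(sentinels, np.nan)).fillna(npor)

# Same outcome with a boolean mask: swap in npor wherever the value is a sentinel or NaN.
via_isin = upor.mask(upor.isin(sentinels) | upor.isna(), npor)

print(via_replace.tolist())
```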


5 Comments

This is a clean-looking solution, but wouldn't it be easier to use a where statement (either df.where or np.where) and check for 0 or np.nan, and then just set the matching values to the values of another column?
@GrantWilliams They are just two ways of doing the same thing. If you don't replace, you will need two conditions. If you want to treat more than two things the same way, you will need more. It's best to replace first and forget later.
I definitely think your solution is the most interesting and looks the most elegant, but I'd be curious whether there is any big performance difference between them for large datasets. I'd assume pandas is able to use logical indexing with a where-type statement.
@GrantWilliams Sure, there might be. Neither replace nor fillna is known for blazing performance. The best course of action is for OP to test these solutions on their data and figure out which works best. Btw, I've added an np.where alternative myself.
I definitely like your solution, and I don't want it to seem like I'm saying it's not as good or anything. I just use any chance I can on these types of questions to learn some of the trade-offs between techniques. Figured it's a great chance to learn.
3

Try using pd.Series.where:

GU_ES3['Qnd'] = GU_ES3['Qnd,hw_m2'] * GU_ES3['UPOR'].where(
    GU_ES3['UPOR'].notna() & (GU_ES3['UPOR'] != 0),
    other=GU_ES3['NPOR'],
)
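As a quick illustration, pd.Series.where keeps values where the condition is True and takes `other` elsewhere, while pd.Series.mask does the opposite. The sketch below uses hypothetical data to show that the two give the same result once the condition is inverted.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
GU_ES3 = pd.DataFrame({
    'Qnd,hw_m2': [2.0, 3.0, 4.0],
    'UPOR': [np.nan, 6.0, 0.0],
    'NPOR': [1.0, 9.0, 5.0],
})

# True where UPOR is usable: not NaN and not 0.
valid = GU_ES3['UPOR'].notna() & GU_ES3['UPOR'].ne(0)

# where: keep UPOR where valid, otherwise take NPOR.
kept = GU_ES3['UPOR'].where(valid, other=GU_ES3['NPOR'])

# mask: replace UPOR where NOT valid, which is the same selection.
swapped = GU_ES3['UPOR'].mask(~valid, other=GU_ES3['NPOR'])

GU_ES3['Qnd'] = GU_ES3['Qnd,hw_m2'] * kept
print(GU_ES3)
```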
