I want to write a for loop that applies conditions to the columns of a pandas DataFrame:

import numpy as np  
import pandas as pd


df = pd.read_csv("data.csv")  # read_csv already returns a DataFrame
print(df)

DWWC1980      DWWC1985      DWWC1990
16.7140310    16.35661439   15.89201716
20.9414479    18.00822799   15.73516051
33.95022337   51.87065104   73.76376497
144.7000805   136.1462017   130.9143924
54.9506033    75.03339188   93.22994974

For loop condition statement:

for i in range(1980, 2015, 5):
    if any(df["DWWC"+str(i)] <= 18.25):
        df['MWTP'+str(i)] = (((10-33)/(5))*(df["DWWC"+str(i)]-5))+10
    elif any((df["DWWC"+str(i)] > 18.25) & (df["DWWC"+str(i)] <= 36.5)):
        df['MWTP'+str(i)] = ((10/(df.two-df.three))*(df["DWWC"+str(i)]-df.three))+df.Three
    else:
        df['MWTP'+str(i)] = (((df.Three_value-6)/(df.three-5))*(df["DWWC"+str(i)]-6

df.to_csv('MWTP1.csv',index='ISO3')

But when I run this code and compare the result with a manual calculation, only the values from the first condition's formula are correct, and the other conditions never seem to be applied. (df.one, df.two, and df.three are other columns.)

MWTP1980       MWTP1985       MWTP1990
25.87096095    30.72758886    37.04060109
-77.06996017   20.00112954    95.22533503
-290.1012655   -640.6304196   -1068.866556
-1845.172654   -1718.865351   -1641.61201
-1397.638671   -2171.737373   -2873.130596
  • So what is the output you are getting vs. the one you would want to get? Please provide complete MCVE. Commented Jan 19, 2019 at 7:42
  • If you want to check against all conditions, don't use elif and else; just use multiple if-statements and they will all be checked independently Commented Jan 19, 2019 at 7:44

2 Answers

You can use numpy.select, with str.format to build the column names:

for i in range(1980, 2015, 5):
    col = "DWWC{}".format(i)
    # m1: rows at or below 18.25
    m1 = df[col] <= 18.25
    # m2: m1 inverted by ~ (rows above 18.25) that are also at or below 36.5
    m2 = ~m1 & (df[col] <= 36.5)
    a = ((10 - 33) / 5) * (df[col] - 5) + 10
    b = (10 / (df.two - df.three)) * (df[col] - df.three) + df.Three
    # this expression is cut off in the question; closing the parentheses here is an assumption
    c = ((df.Three_value - 6) / (df.three - 5)) * (df[col] - 6)

    df["MWTP{}".format(i)] = np.select([m1, m2], [a, b], default=c)
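For a version that runs without the original CSV, here is a minimal sketch of the same numpy.select pattern on the DWWC1980 values from the question. The formulas a, b and c are simplified placeholders (the question's real ones depend on columns that are not shown), so only the masking logic should be taken from it.

```python
import numpy as np
import pandas as pd

# DWWC1980 values copied from the question; everything else is illustrative.
df = pd.DataFrame({"DWWC1980": [16.7140310, 20.9414479, 33.95022337,
                                144.7000805, 54.9506033]})

col = "DWWC1980"
m1 = df[col] <= 18.25            # low band
m2 = ~m1 & (df[col] <= 36.5)     # middle band: above 18.25 and at most 36.5

# Placeholder formulas (assumptions, not the question's real ones).
a = df[col] * 1.0                # used where m1 is True
b = df[col] * 10.0               # used where m2 is True
c = df[col] * 100.0              # used for all remaining rows

# Each row gets the value from the first formula whose mask is True.
df["MWTP1980"] = np.select([m1, m2], [a, b], default=c)
print(df)
```

Each row is computed by exactly one formula, so nothing is overwritten by a later branch.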

2 Comments

Thank you, but why is only "c" passed separately in the last line?
@water77 It is the default value, used for rows that match neither m1 nor m2. I passed it by keyword for better readability.
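To illustrate what the default argument of numpy.select does, here is a small sketch with made-up numbers: conditions are checked in order, the first match wins, and rows that match nothing receive the default (0 if none is given).

```python
import numpy as np

x = np.array([10.0, 25.0, 50.0])

# Conditions are evaluated in order; the first True one decides the value.
conditions = [x <= 18.25, x <= 36.5]
choices = [np.full_like(x, 1.0), np.full_like(x, 2.0)]

out = np.select(conditions, choices, default=3)
print(out)            # [1. 2. 3.]

# Without an explicit default, unmatched rows are filled with 0.
out_no_default = np.select(conditions, choices)
print(out_no_default) # [1. 2. 0.]
```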
I believe your problem is the usage of if elif else as follows:

if any(df["DWWC"+str(i)] <= 18.25):
    ...  # runs if the first condition is true
elif any((df["DWWC"+str(i)] > 18.25) & (df["DWWC"+str(i)] <= 36.5)):
    ...  # runs if the first condition is false and the second condition is true
else:
    ...  # runs if both conditions are false

So when your first condition is met, the other ones are never checked. Try changing it to something like this:

if any(df["DWWC"+str(i)] <= 18.25):
    ...  # runs if the first condition is true
if any((df["DWWC"+str(i)] > 18.25) & (df["DWWC"+str(i)] <= 36.5)):
    ...  # runs if the second condition is true, regardless of the first
else:
    ...  # runs if the second condition is false
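Separate if statements still assign the whole column in each branch, so a later branch overwrites an earlier one. A row-wise alternative, sketched here with rounded sample values and placeholder formulas (both are assumptions for illustration), is to assign through df.loc with a boolean mask so that each formula only touches its own rows.

```python
import pandas as pd

# Rounded sample values; the real data comes from the question's CSV.
df = pd.DataFrame({"DWWC1980": [16.7, 20.9, 34.0, 144.7, 54.9]})
col = "DWWC1980"

# One boolean mask per band, one value per row.
low = df[col] <= 18.25
mid = (df[col] > 18.25) & (df[col] <= 36.5)
high = df[col] > 36.5

# Placeholder formulas; only the masked assignment pattern matters.
df.loc[low, "MWTP1980"] = df.loc[low, col] + 1
df.loc[mid, "MWTP1980"] = df.loc[mid, col] + 2
df.loc[high, "MWTP1980"] = df.loc[high, col] + 3
print(df)
```

Because the three masks do not overlap, the order of the assignments does not matter.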

4 Comments

Yes, that is one problem, but this solution still does not work, because each branch assigns the whole column and overwrites all rows.
@Dor Shinar Thank you. Do you think a.any() causes a problem here?
Yes, you are right @jezrael, now the first condition's calculation is wrong because it gets overwritten.
any() tests whether at least one element of an iterable is truthy. Here df["DWWC"+str(i)] <= 18.25 is a boolean Series with one value per row, so any() is true as soon as a single row matches, and the branch then applies its formula to the whole column. One more thing about the & symbol: in plain Python it is the bitwise operator rather than the logical and, but on pandas Series it is the element-wise AND, so it is the right operator here. Writing condition1 and condition2 with two Series would raise a ValueError.
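To make the behaviour of any() and & on a pandas Series concrete, here is a small sketch with made-up values. It only demonstrates standard pandas and Python behaviour and is not taken from the original code.

```python
import pandas as pd

s = pd.Series([16.7, 20.9, 144.7])

# A comparison on a Series yields one boolean per row.
mask = s <= 18.25
print(mask.tolist())
# any() collapses those booleans into a single value for the whole column.
print(any(mask))

# Element-wise AND of two boolean Series uses & (with parentheses).
both = (s > 18.25) & (s <= 36.5)
print(both.tolist())

# Python's `and` needs a single truth value and fails on a Series.
try:
    (s > 18.25) and (s <= 36.5)
except ValueError as exc:
    print("ValueError:", exc)
```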
