1

Input df

ID      Date    TAVG  TMAX  TMIN
1   01-01-2020         26    21
2   01-01-2020   15    16    
3   01-01-2020   25    29    18
1   02-01-2020   16          16
2   02-01-2020         26    20
.....

The code I am using

for index, row in df.iterrows():

    if [(row["TMIN"].isnull()) & (row["TAVG"].notnull()) & (row["TMAX"].notnull())]:
        row["TMIN"] = (2 * row["TAVG"]) - row["TMAX"]

    if [(row["TMAX"].isnull()) & (row["TMIN"].notnull()) & (row["TAVG"].notnull())]:
        row["TMAX"] = (2 * row["TAVG"]) - row["TMIN"]

    if [(row["TAVG"].isnull()) & (row["TMIN"].notnull()) & (row["TMAX"].notnull())]:
        row["TAVG"] = (row["TMIN"] + row["TMAX"]) / 2

When I run this, I get the below error:

    if [(row["TMIN"].isnull()) & (row["TAVG"].notnull()) & (row["TMAX"].notnull())]:                                                                                                                                                                    
AttributeError: 'float' object has no attribute 'isnull'  

How to fix this? Any alternate way to achieve the same result?

10
  • For the second dupe, your solution changes a bit: df['TMIN'] = df['TMIN'].fillna(df['TAVG'] * 2 - df['TMAX']); df['TMAX'] = df['TMAX'].fillna(df['TAVG'] * 2 - df['TMIN']); df['TAVG'] = df['TAVG'].fillna(df[['TMAX', 'TMIN']].mean(axis=1)) Commented Oct 20, 2021 at 10:23
  • @jezrael I think it would be better if you could provide solutions as answers; comments don't really instill that much confidence in the solution provided. It would also help other beginners. Commented Oct 20, 2021 at 10:29
  • @9769953 df['TMIN'].fillna(df['TAVG'] * 2 - df['TMAX'], inplace=True); Will this also handle nulls in TMIN/TMAX columns? I'm a little doubtful if this would work... Commented Oct 20, 2021 at 10:35
  • @RoshADM True, it wouldn't. But consider the case where there is a null value in TMIN and also a null value in one of the two other columns. You'd be replacing a null value with a null value. The result would still be a null value, which is also what you would have had in your original case (if it worked). Commented Oct 20, 2021 at 10:36
  • @9769953 Also I think inplace is not good practice, check this and this Commented Oct 20, 2021 at 10:43

2 Answers

2

.isnull() and .notnull() are methods of series/columns (or even dataframes). You're accessing an element of a row, that is, a single value (which happens to be a float), and a float has no such methods. That causes the error.
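To see this for yourself, here is a minimal sketch. The two-row dataframe is made up for illustration (it is not the asker's data); it only checks what kind of object a single cell of a row is.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame, loosely modelled on the question's columns.
df = pd.DataFrame({
    "Date": ["01-01-2020", "01-01-2020"],
    "TAVG": [np.nan, 15.0],
    "TMAX": [26.0, 16.0],
    "TMIN": [21.0, np.nan],
})

# Grab the first row the same way the for loop over iterrows() would.
index, row = next(df.iterrows())
value = row["TAVG"]

is_plain_float = isinstance(value, float)          # the cell is a scalar
has_isnull = hasattr(value, "isnull")              # scalars have no .isnull()
column_has_isnull = hasattr(df["TAVG"], "isnull")  # the column (a Series) does
```

So the method exists on df["TAVG"], but not on the individual values you get out of a row.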

For a lot of cases in Pandas, you shouldn't iterate over the rows individually: work column-wise instead, and skip the loop.

Column-wise, your particular case translates to:

sel = df['TMIN'].isnull() & df['TAVG'].notnull() & df['TMAX'].notnull()
df.loc[sel, 'TMIN'] = df.loc[sel, 'TAVG'] * 2 - df.loc[sel, 'TMAX']

and similar for the other two columns. All without any iterrows() or other loop.
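Spelled out for all three columns, that could look like the sketch below. The sample rows are typed in from the five rows shown in the question, with blanks taken to be NaN (an assumption), and the mask names need_tmin, need_tmax and need_tavg are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample rows modelled on the question's input; blank cells assumed to be NaN.
df = pd.DataFrame({
    "ID": [1, 2, 3, 1, 2],
    "Date": ["01-01-2020"] * 3 + ["02-01-2020"] * 2,
    "TAVG": [np.nan, 15.0, 25.0, 16.0, np.nan],
    "TMAX": [26.0, 16.0, 29.0, np.nan, 26.0],
    "TMIN": [21.0, np.nan, 18.0, 16.0, 20.0],
})

# Build all three masks up front, from the untouched data.
need_tmin = df["TMIN"].isnull() & df["TAVG"].notnull() & df["TMAX"].notnull()
need_tmax = df["TMAX"].isnull() & df["TAVG"].notnull() & df["TMIN"].notnull()
need_tavg = df["TAVG"].isnull() & df["TMIN"].notnull() & df["TMAX"].notnull()

# Fill only the selected rows of each column.
df.loc[need_tmin, "TMIN"] = 2 * df.loc[need_tmin, "TAVG"] - df.loc[need_tmin, "TMAX"]
df.loc[need_tmax, "TMAX"] = 2 * df.loc[need_tmax, "TAVG"] - df.loc[need_tmax, "TMIN"]
df.loc[need_tavg, "TAVG"] = (df.loc[need_tavg, "TMIN"] + df.loc[need_tavg, "TMAX"]) / 2
```

Each mask requires the other two columns to be present, so a row is only ever filled in one column and the order of the three assignments does not matter.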

However, since you are apparently trying to replace NaNs/null values with values from other columns, you can use .fillna() here:

df['TMIN'].fillna(df['TAVG'] * 2 - df['TMAX'], inplace=True)

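or, if you'd rather avoid inplace (because you don't want to change the original dataframe, or want to use the result directly in a chained computation; note also that newer pandas versions warn that an inplace call on a selected column may not update the dataframe), assign the result instead: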

df['tmin2'] = df['TMIN'].fillna(df['TAVG'] * 2 - df['TMAX'])

and for the other two columns:

df['tmax2'] = df['TMAX'].fillna(2 * df['TAVG'] - df['TMIN'])
df['tavg2'] = df['TAVG'].fillna((df['TMAX'] + df['TMIN']) / 2)

You may ask what happens if a TMIN cell is null and the TAVG or TMAX value, or both, is null as well. In that case, you'd be replacing the null value with null, so nothing happens. Given your original if statement, that would also be the case in your original code.
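A small check of that claim, on a made-up two-row frame: the first row has only TMIN missing, the second has both TMIN and TAVG missing.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 can be filled, row 1 cannot.
df = pd.DataFrame({
    "TAVG": [15.0, np.nan],
    "TMAX": [16.0, 26.0],
    "TMIN": [np.nan, np.nan],
})

filled = df["TMIN"].fillna(2 * df["TAVG"] - df["TMAX"])

first_filled = filled.iloc[0]                  # 2 * 15 - 16
second_still_null = pd.isna(filled.iloc[1])    # NaN replaced by NaN
```

The arithmetic on the second row produces NaN, so fillna has nothing usable to put there and the cell stays null.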

2

You can also do the null check at row level, on a single value, with either

import pandas as pd

pd.isna(row["TMIN"])

or

pd.isnull(row["TMIN"])

Your code would then look like this. Note the plain if conditions (a list like [condition] is always truthy, so the if would always run) and the use of df.at to write the value back (assigning to row does not reliably change df):

for index, row in df.iterrows():
    if pd.isnull(row["TMIN"]) and pd.notnull(row["TAVG"]) and pd.notnull(row["TMAX"]):
        df.at[index, "TMIN"] = (2 * row["TAVG"]) - row["TMAX"]

    if pd.isnull(row["TMAX"]) and pd.notnull(row["TMIN"]) and pd.notnull(row["TAVG"]):
        df.at[index, "TMAX"] = (2 * row["TAVG"]) - row["TMIN"]

    if pd.isnull(row["TAVG"]) and pd.notnull(row["TMIN"]) and pd.notnull(row["TMAX"]):
        df.at[index, "TAVG"] = (row["TMIN"] + row["TMAX"]) / 2
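Two things are worth checking in any row-wise loop of this shape: whether the if condition can ever be false, and whether the assignment actually reaches the dataframe. The sketch below uses a made-up two-row frame (with a string Date column, so the rows have mixed types as in the question) to compare writing to row with writing through df.at.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; only the first row has a missing TMIN.
df = pd.DataFrame({
    "Date": ["01-01-2020", "01-01-2020"],
    "TAVG": [15.0, 25.0],
    "TMAX": [16.0, 29.0],
    "TMIN": [np.nan, 18.0],
})

# A non-empty list is truthy regardless of what it contains.
list_is_truthy = bool([False])

# Attempt 1: assign to the row object handed out by iterrows().
for index, row in df.iterrows():
    if pd.isnull(row["TMIN"]) and pd.notnull(row["TAVG"]) and pd.notnull(row["TMAX"]):
        row["TMIN"] = 2 * row["TAVG"] - row["TMAX"]  # changes only the temporary row

unchanged_after_row_write = pd.isna(df.at[0, "TMIN"])

# Attempt 2: write through df.at using the row's index label.
for index, row in df.iterrows():
    if pd.isnull(row["TMIN"]) and pd.notnull(row["TAVG"]) and pd.notnull(row["TMAX"]):
        df.at[index, "TMIN"] = 2 * row["TAVG"] - row["TMAX"]  # writes into df

tmin_after_at_write = df.at[0, "TMIN"]
```

With mixed column types, iterrows() hands out a copy of each row, so the first loop leaves df untouched and only the second one fills the value.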
