TypeError: numpy.float64 object is not callable when iterating over rows of a pandas DataFrame

Question

I'm currently working on a dataset using pandas and don't have much experience with this sort of thing. The dataset was shared as a screenshot.

The table shows ratings associated with different segments, grouped by year. I am attempting to parse the table, pull the most recent rating from its associated year column (excluding NaNs), and write it to the matching row of the Curr_Rate column, along with the year the rating was collected in the Curr_RatingYr column.

The second task is to pull the second most recent rating (with its respective year) and populate these values into the Prev_Rate and PrevRatingYr fields. Finally, I need to generate averages from all the ratings available for 2000-2017. I have the average part down, but when I try to parse the table to generate values for the current and previous ratings, I am met with:

TypeError: numpy.float64 object is not callable (at index 0)

Any help would be greatly appreciated.

import numpy as np
import pandas as pd

df = pd.read_excel('CurrPrevRate1.xlsx')

df.head()

dftest = df[:100]

# Rating columns y2000 through y2017
year_cols = ['y' + str(year) for year in range(2000, 2018)]

# Replace zeros with NaN
dftest[year_cols] = dftest[year_cols].replace(0, np.nan)

# Change all values in these columns to floats
# dftest[year_cols] = dftest[year_cols].apply(pd.to_numeric)

# Get average of rows
dftest['AvgRating'] = dftest[year_cols].mean(axis=1)

def getCurrRate():
    for x in dftest['y2017']:
        if 0 <= x <= 10:
            return x
        else:
            for y in dftest['y2016']:
                if 0 <= y <= 10:
                    return y
                else:
                    for z in dftest['y2015']:
                        if 0 <= z <= 10:
                            return z
                        else:
                            return 'N/A'

dftest['Curr_Rate'] = dftest[year_cols].apply(getCurrRate(), axis=1)

dftest

Can you provide (a) actual, inline data instead of a screenshot, and (b) expected input and output? More generally, you'll find you get better help, faster, when you post a Minimal, Complete, and Verifiable Example.
– andrew_reece, Commented Aug 25, 2017 at 16:40

andrew_reece · Accepted Answer · 2017-08-25 16:39:00Z

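The error comes from your apply() syntax.

Call apply() with the function itself, with no () on the end, e.g. apply(getCurrRate, axis=1). Writing apply(getCurrRate(), axis=1) calls getCurrRate first and passes whatever it returns, here a single numpy.float64 rating, to apply(), which then tries to call that number as if it were a function. That is the TypeError you see.
The function you apply usually takes an argument, e.g. getCurrRate(yr). Here, yr is the object passed implicitly by apply(). With axis=1 that object is one row at a time, as a Series indexed by column name, so you'd effectively be executing:
```
getCurrRate(dftest.iloc[0])   # first row
getCurrRate(dftest.iloc[1])   # second row
#...
getCurrRate(dftest.iloc[-1])  # last row
```
Without a parameter in your getCurrRate definition, the function has nowhere to receive the row that apply() passes in.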

At least for the case of currRate, it seems like you really just want the most recent, non-NaN value from the y<year> columns. In that case, consider a simpler approach:

def getCurrRate(yr):
    # yr is one row; drop the missing years and take the last remaining value
    return yr.dropna().iloc[-1]

ratings_cols = df.columns[df.columns.str.startswith('y')]
df['currRate'] = df[ratings_cols].apply(getCurrRate, axis=1)

Here's some toy data to demonstrate:

import numpy as np
import pandas as pd

data = {'segmentId':['foo','bar','baz'],
        'y2015':[5, 6, 7],
        'y2016':[2, np.nan, 4],
        'y2017':[np.nan, np.nan, 9]}
df = pd.DataFrame(data)

df
  segmentId  y2015  y2016  y2017
0       foo      5    2.0    NaN
1       bar      6    NaN    NaN
2       baz      7    4.0    9.0

We'd expect the following values for currRate:

index 0: 2
index 1: 6
index 2: 9

And that's what we get with the new getCurrRate:

ratings_cols = df.columns[df.columns.str.startswith('y')]
df['currRate'] = df[ratings_cols].apply(getCurrRate, axis=1)

df
  segmentId  y2015  y2016  y2017  currRate
0       foo      5    2.0    NaN       2.0
1       bar      6    NaN    NaN       6.0
2       baz      7    4.0    9.0       9.0
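The question also asks for the year of the current rating and for the previous rating with its year. The following is only a sketch of how the same row-wise idea might be extended, not a tested solution for the real spreadsheet. It assumes the y<year> columns appear in chronological order from left to right, and the helper name rating_summary is made up for illustration. The output column names are taken from the question.

```python
import numpy as np
import pandas as pd

# Same toy data as above.
df = pd.DataFrame({
    "segmentId": ["foo", "bar", "baz"],
    "y2015": [5, 6, 7],
    "y2016": [2, np.nan, 4],
    "y2017": [np.nan, np.nan, 9],
})
ratings_cols = df.columns[df.columns.str.startswith("y")]

def rating_summary(row):
    # Hypothetical helper: keep only the years that have a rating.
    # Assumes columns are ordered oldest to newest.
    valid = row.dropna()
    out = {"Curr_Rate": np.nan, "Curr_RatingYr": np.nan,
           "Prev_Rate": np.nan, "PrevRatingYr": np.nan}
    if len(valid) >= 1:
        out["Curr_Rate"] = valid.iloc[-1]
        out["Curr_RatingYr"] = int(valid.index[-1][1:])  # 'y2016' -> 2016
    if len(valid) >= 2:
        out["Prev_Rate"] = valid.iloc[-2]
        out["PrevRatingYr"] = int(valid.index[-2][1:])
    return pd.Series(out)

# Returning a Series per row makes apply() build one column per key.
summary = df[ratings_cols].apply(rating_summary, axis=1)
result = df.join(summary)
print(result)
```

Rows with fewer than two ratings keep NaN in the previous-rating fields, and rows with no ratings at all keep NaN everywhere instead of raising an IndexError. Because NaN is a float, the year columns come out as floats (e.g. 2016.0) and may need converting if integer years are required.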
