I'm currently working on a dataset using pandas and don't have much experience with this sort of thing. The dataset is shown below:
The table shows ratings associated with different segments, with one column per year (y2000 through y2017). I am trying to parse the table, pull the most recent rating for each row from its year column (excluding NaNs), and write it to the Curr_Rate column, along with the year the rating was collected, which goes in Curr_RatingYr.
The second task is to pull the second most recent rating (with its year) and populate the Prev_Rate and PrevRatingYr fields. Finally, I need to generate an average of all the ratings available for 2000-2017. I have the average part working, but when I try to parse the table to generate values for the current and previous ratings, I am met with:
TypeError stating numpy.float64 object is not callable at index 0
Any help would be greatly appreciated.
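import numpy as np
import pandas as pd

df = pd.read_excel('CurrPrevRate1.xlsx')
df.head()
# Work on a copy of the first 100 rows so later column assignments do not target a slice of df
dftest = df[:100].copy()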
# Year columns y2000 ... y2017
year_cols = ['y' + str(year) for year in range(2000, 2018)]
# Replace zeros with NaN
dftest[year_cols] = dftest[year_cols].replace(0, np.nan)
# Change all values in these columns to floats
#dftest[year_cols] = dftest[year_cols].apply(pd.to_numeric)
# Get average of rows (mean skips NaN values by default)
dftest['AvgRating'] = dftest[year_cols].mean(axis=1)
def getCurrRate():
    for x in dftest['y2017']:
        if 0 <= x <= 10:
            return x
        else:
            for y in dftest['y2016']:
                if 0 <= y <= 10:
                    return y
                else:
                    for z in dftest['y2015']:
                        if 0 <= z <= 10:
                            return z
                        else:
                            return 'N/A'
dftest['Curr_Rate'] = dftest[['y2000', 'y2001', 'y2002', 'y2003', 'y2004', 'y2005', 'y2006','y2007', 'y2008', 'y2009', 'y2010', 'y2011', 'y2012', 'y2013', 'y2014', 'y2015', 'y2016', 'y2017']].apply(getCurrRate(), axis=1)
dftest
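For reference, here is a rough sketch of the result I am after, written as a row-wise function. It assumes the year columns are ordered oldest to newest and that zeros have already been replaced with NaN; the helper name latest_two and the three sample rows are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample rows; the real data would come from CurrPrevRate1.xlsx
sample = pd.DataFrame({
    'y2015': [5.0, 6.0, np.nan],
    'y2016': [np.nan, 7.0, np.nan],
    'y2017': [8.0, np.nan, np.nan],
})
year_cols = ['y2015', 'y2016', 'y2017']  # in the real table: y2000 ... y2017

def latest_two(row):
    # Hypothetical helper: newest and second-newest non-NaN ratings with their years
    valid = row.dropna()
    years = [int(name[1:]) for name in valid.index]  # 'y2017' -> 2017
    curr = (valid.iloc[-1], years[-1]) if len(valid) >= 1 else (np.nan, np.nan)
    prev = (valid.iloc[-2], years[-2]) if len(valid) >= 2 else (np.nan, np.nan)
    return pd.Series({
        'Curr_Rate': curr[0], 'Curr_RatingYr': curr[1],
        'Prev_Rate': prev[0], 'PrevRatingYr': prev[1],
    })

# Pass the function itself (no parentheses); apply calls it once per row
out = sample[year_cols].apply(latest_two, axis=1)
sample = sample.join(out)
```

Rows with fewer than two ratings get NaN in the missing fields rather than the string 'N/A', which keeps the new columns numeric.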
