I want to calculate the maximum temperature for each year using NumPy in Python. I can calculate the maximum over all the data with np.max(array).
But how do I calculate it per year?
I would suggest doing it in pandas (let's assume your array is named x):
import pandas as pd
df = pd.DataFrame(x, columns=['year', 'month', 'max_temp'])
max_temps_per_year = df.groupby('year')['max_temp'].max()
print(max_temps_per_year)
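As a self-contained sketch of the pandas approach, assuming x is a 2-D array whose columns are year, month and max temperature (the sample values below are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: columns are year, month, max_temp
x = np.array([
    [1990, 1, 20.5],
    [1990, 2, 25.0],
    [1991, 1, 21.0],
    [1991, 2, 18.3],
])

df = pd.DataFrame(x, columns=['year', 'month', 'max_temp'])
# The float array makes the year column float; cast it back for a cleaner index
df['year'] = df['year'].astype(int)

# One maximum per distinct year, indexed by year
max_temps_per_year = df.groupby('year')['max_temp'].max()
print(max_temps_per_year)
```

The result is a Series indexed by year, so a single year can be looked up with max_temps_per_year.loc[1990].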
If you want a pure numpy implementation, and assuming your array is named x, you could do the following for a single year:
cond = x[:, 0] == 1990 # cond is True/False based on if the year is 1990
sub = x[cond] # subset of x that has only the rows satisfying the condition
max_temp = sub[:, 2].max() # maximum temperature of year
If you do these steps iteratively, then you should get the max temperature for each year:
years = np.unique(x[:, 0]) # find unique years
max_temps = [x[x[:, 0] == year][:, 2].max() for year in years] # same as above written as list comp
# print the results:
print('Year | Max temperature')
for year, temp in zip(years, max_temps):
    print('{:^4} | {:^15}'.format(year, temp))
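If the list comprehension becomes slow for many years, one possible alternative (not part of the answer above, just a sketch) is to let np.unique label each row with its year index and then use np.maximum.at to reduce per label. The sample array is hypothetical and uses the same year, month, max temperature column layout:

```python
import numpy as np

# Hypothetical sample data: columns are year, month, max_temp
x = np.array([
    [1990, 1, 20.5],
    [1990, 2, 25.0],
    [1991, 1, 21.0],
    [1991, 2, 18.3],
])

# inverse[i] is the position of row i's year within `years`
years, inverse = np.unique(x[:, 0], return_inverse=True)

# Start from -inf so any real temperature replaces it
max_temps = np.full(len(years), -np.inf)
# Unbuffered in-place maximum: each row updates the slot for its year
np.maximum.at(max_temps, inverse, x[:, 2])

print('Year | Max temperature')
for year, temp in zip(years, max_temps):
    print('{:^4.0f} | {:^15}'.format(year, temp))
```

This makes a single pass over the rows instead of building one boolean mask per year.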
This should work:
#col index of the year field
y=0
#col index of the temperature field
t=2
#Taking unique Years
n=np.unique(arr[:,y])
#Taking max temp for each Year
x=[np.max(arr[arr[:,y]==i,t].astype(float)) for i in n]
#Row 0 holds the years, row 1 the max temps
maxarr=np.append(n,x).reshape(2,len(n))
print(maxarr)
Output:
[['1990' '1991']
 ['25.0' '21.0']]
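The astype(float) conversion matters if arr holds strings, as the quoted output suggests: strings compare lexicographically, so '9.5' would rank above '25.0'. A small sketch with made-up values:

```python
import numpy as np

# Hypothetical string-typed temperatures, e.g. as read from a text file
temps = np.array(['25.0', '9.5', '21.0'])

# Python's max() on strings compares character by character
string_max = max(temps.tolist())
# Converting to float first gives the numeric maximum
numeric_max = temps.astype(float).max()

print(string_max, numeric_max)
```

Here string_max is '9.5' because '9' sorts after '2', while numeric_max is 25.0.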
Instead of numpy, use pandas. It even has a read_excel input method (it's not great, but it's more functional than numpy).