I have a rather large DataFrame (about 80000 rows) that I feed into my calculation method (SSA). I'd like to average the data over blocks of consecutive rows (20 rows per block, for example). How can I do this?

I have a dataframe, for example:

    00h         03h         06h         09h         12h
10  0.003546    0.000885    0.006852    0.00171     0.001708
11  0.00667     0.012603    0.012933    0.05603     0.025855
12  0.089116    0.054549    0.022177    0.090342    0.070226
13  0.28974     0.246415    0.297231    0.399953    0.287122

And in the end, I'd like something like this:

[image of the desired result]

1 Answer

Use integer division of a positional range, created with numpy.arange from the length of the DataFrame, as the grouping key, and aggregate with mean:

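import numpy as np
# Positions 0, 1, 2, 3 become group ids 0, 0, 1, 1; each pair of rows is averaged
df = df.groupby(np.arange(len(df)) // 2).mean()
print(df)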

        00h       03h       06h       09h       12h
0  0.005108  0.006744  0.009893  0.028870  0.013782
1  0.189428  0.150482  0.159704  0.245147  0.178674
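
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. It rebuilds the sample data from the question by hand; the names `block`, `group_ids` and `averaged` are only illustrative and do not appear in the original post.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; the index labels 10..13 are as shown there.
df = pd.DataFrame(
    {
        "00h": [0.003546, 0.00667, 0.089116, 0.28974],
        "03h": [0.000885, 0.012603, 0.054549, 0.246415],
        "06h": [0.006852, 0.012933, 0.022177, 0.297231],
        "09h": [0.00171, 0.05603, 0.090342, 0.399953],
        "12h": [0.001708, 0.025855, 0.070226, 0.287122],
    },
    index=[10, 11, 12, 13],
)

block = 2  # number of consecutive rows to average (illustrative name)

# Row positions 0, 1, 2, 3 are mapped to group ids 0, 0, 1, 1.
# The grouping is by position, so the original index labels do not matter.
group_ids = np.arange(len(df)) // block

# One output row per group, holding the column-wise mean of that group.
averaged = df.groupby(group_ids).mean()
print(averaged)
```

Note that the result is indexed by group id (0, 1, ...), not by the original row labels.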
1 Comment

And in the case of 80000 rows and 20-row averaging, should I use df = df.groupby(np.arange(len(df))//20).mean()?
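
A quick way to check this is to try the 20-row version on a stand-in frame of the same size. The sketch below assumes random data in place of the real SSA input, and the names `block`, `averaged` and `first_block_ok` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in data: 80000 rows of random values (an assumption, not the real dataset).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((80000, 5)),
    columns=["00h", "03h", "06h", "09h", "12h"],
)

block = 20  # rows per averaged block

# Positions 0..19 get group id 0, positions 20..39 get group id 1, and so on.
averaged = df.groupby(np.arange(len(df)) // block).mean()

# 80000 rows in blocks of 20 give 4000 averaged rows.
print(averaged.shape)

# Sanity check: the first output row equals the mean of the first 20 input rows.
first_block_ok = np.allclose(averaged.iloc[0], df.iloc[:block].mean())
print(first_block_ok)
```

If the number of rows is not a multiple of the block size, the last group simply averages whatever rows remain.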
