
I am trying to compute a simple moving average for each row of a 2D array. The data in each row is a separate data set, so I can't just compute the SMA over the whole array; I need to do it separately for each row. I have tried a for loop, but it is taking the window as rows rather than as individual values.

The equation I am using to compute the SMA is (a1 + a2 + ... + an) / n; for example, the first window of the first row is (0 + 1 + 2) / 3 = 1. This is the code I have so far:

import numpy as np  


#make amplitude array
amplitude=[0,1,2,3, 5.5, 6,5,2,2, 4, 2,3,1,6.5,5,7,1,2,2,3,8,4,9,2,3,4,8,4,9,3]


#split array up into a line for each sample
traceno=5                  #number of traces in file
samplesno=6                #number of samples in each trace. This wont change.

amplitude_split=np.array(amplitude, dtype=int).reshape((traceno,samplesno))

#define window to average over:
window_size=3

#doesn't work for values that come before the window size. i.e. index 2 would not have enough values to divide by 3
#define limits:
lowerlimit=(window_size-1)
upperlimit=samplesno

i=window_size

for row in range(traceno):
  for n in range(samplesno):
    while lowerlimit<i<upperlimit:
      this_window=amplitude_split[(i-window_size):i] 

      window_average=sum(this_window)/window_size

      i+=1
      print(window_average)

My expected output for this data set is:

[[1,    2,    3.33, 4.66]
 [3,    2.66, 2.66, 3.  ]
 [4,    6,    4.33, 3.33]
 [4.33, 5,    7,    5.  ]
 [5,    5.33, 7,    5.33]]

But I am getting:

[2.         3.         3.         4.66666667 2.66666667 3.66666667]
[2.66666667 3.66666667 5.         5.         4.         2.33333333]
[2.         4.33333333 7.         5.         6.33333333 2.33333333]
  • Are you required to use only numpy? Commented Apr 30, 2020 at 10:01

2 Answers


You can convolve each row with a kernel of [1, 1, ..., 1] of length window_size and then divide by window_size to get the average (no need for a loop):

import numpy as np
from scipy.signal import convolve2d

window_average = convolve2d(amplitude_split, np.ones((1, window_size)), 'valid') / window_size

Convolving with ones simply adds up the elements in each window.
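
As a minimal 1D sketch of the same idea, using the first row of amplitude_split (which after the integer cast is [0, 1, 2, 3, 5, 6]): convolving with a ones kernel in 'valid' mode gives the window sums, and dividing by window_size gives the first row of the output below.

import numpy as np

row = np.array([0, 1, 2, 3, 5, 6])              # first row of amplitude_split after the int cast
kernel = np.ones(3)                             # window_size ones

sums = np.convolve(row, kernel, mode='valid')   # window sums: [ 3.  6. 10. 14.]
print(sums / 3)                                 # [1.         2.         3.33333333 4.66666667]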

output:

[[1.         2.         3.33333333 4.66666667]
 [3.         2.66666667 2.66666667 3.        ]
 [4.         6.         4.33333333 3.33333333]
 [4.33333333 5.         7.         5.        ]
 [5.         5.33333333 7.         5.33333333]]

4 Comments

Hi, thanks for this! I just wanted to check that I understand it correctly: np.ones creates an array [1, 1, 1] (this is the convolution kernel?). This then goes through the array, multiplies every window by [1, 1, 1] and adds up. In this way we get (0+1+2) for the first window of the first row, and we then divide by 3 to get the average. This slides through the array and repeats. Is this correct?
@okvoyce exactly as you explained. it is 2D convolution so goes over each row as well.
Thank you. Just a quick question: my "amplitude" values correspond to another array which has time values in it. I need to rescale the time array due to this convolution. Are the "invalid" columns the last two columns?
@okvoyce It depends on which time you would like to assign to each average. Say the window times are (t1, t2, t3). If you assign t1 to that average, then you are correct and the last two columns are invalid. If you prefer to say t2 is a better representative of the average of that window, then drop the first and last columns (see the sketch below).
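
As a minimal sketch of those two slicing choices, assuming a hypothetical time array with the same shape as amplitude_split:

import numpy as np

# hypothetical time array, same shape as amplitude_split (5 traces x 6 samples)
time = np.linspace(0.0, 1.0, 30).reshape((5, 6))
window_size = 3

# option 1: label each average with the first time in its window
# -> drop the last (window_size - 1) columns
time_first = time[:, :-(window_size - 1)]         # shape (5, 4)

# option 2: label each average with the centre time of its window
# -> drop (window_size - 1) // 2 columns from each end
half = (window_size - 1) // 2
time_centre = time[:, half:time.shape[1] - half]  # shape (5, 4)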

That should be easy to compute with np.correlate, using a vector np.ones(window_size) / window_size, but unfortunately that function does not seem to be able to broadcast the correlation operation. So here is another simple way to compute that with np.cumsum:

import numpy as np

amplitude = [  0,   1,   2,   3, 5.5, 6,
               5,   2,   2,   4,   2, 3,
               1, 6.5,   5,   7,   1, 2,
               2,   3,   8,   4,   9, 2,
               3,   4,   8,   4,   9, 3]
traceno = 5
samplesno = 6
amplitude_split = np.array(amplitude, dtype=int).reshape((traceno, samplesno))
window_size = 3
# Scale down by window size
a = amplitude_split * (1.0 / window_size)
# Cumsum across columns
b = np.cumsum(a, axis=1)
# Add an initial column of zeros
c = np.pad(b, [(0, 0), (1, 0)])
# Take difference to get means
result = c[:, window_size:] - c[:, :-window_size]
print(result)
# [[1.         2.         3.33333333 4.66666667]
#  [3.         2.66666667 2.66666667 3.        ]
#  [4.         6.         4.33333333 3.33333333]
#  [4.33333333 5.         7.         5.        ]
#  [5.         5.33333333 7.         5.33333333]]

