How to calculate rolling / moving average using NumPy / SciPy?

136

There seems to be no function in NumPy/SciPy that simply calculates a moving average, which leads to convoluted solutions.

My question is two-fold:

• What's the easiest way to (correctly) implement a moving average with numpy?
• Since this seems non-trivial and error-prone, is there a good reason not to have the batteries included in this case?

This question is tagged with `python` `numpy` `time-series` `moving-average` `rolling-computation`

~ Asked on 2013-01-14 04:59:12

The Best Answer is

111

A simple way to achieve this is to use `np.convolve`. The idea is to exploit the way the discrete convolution is computed and use it to return a rolling mean. This is done by convolving the input with an array of ones (`np.ones`) whose length equals the sliding window length we want, then dividing by that length.

In order to do so we could define the following function:

``````
import numpy as np

def moving_average(x, w):
    return np.convolve(x, np.ones(w), 'valid') / w
``````

This function takes the convolution of the sequence `x` with a sequence of ones of length `w`, then divides by `w`. Note that the chosen `mode` is `valid`, so the convolution product is only given for points where the two sequences overlap completely.
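To see what `'valid'` changes, here is a minimal sketch comparing the output lengths of the three `mode` values accepted by `np.convolve`, using the example array from this answer and a window of 4. The name `lengths` is only for illustration.

```python
import numpy as np

x = np.array([5, 3, 8, 10, 2, 1, 5, 1, 0, 2])
w = 4

# Output length for each mode accepted by np.convolve
lengths = {mode: len(np.convolve(x, np.ones(w), mode))
           for mode in ('full', 'same', 'valid')}
print(lengths)
# {'full': 13, 'same': 10, 'valid': 7}
```

With `'full'` and `'same'`, the values near the edges come from windows that only partially overlap `x`, so dividing them by `w` would not give a true mean. `'valid'` keeps only the `len(x) - w + 1` complete windows.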

Some examples:

``````x = np.array([5,3,8,10,2,1,5,1,0,2])
``````

For a moving average with a window of length `2` we would have:

``````moving_average(x, 2)
# array([4. , 5.5, 9. , 6. , 1.5, 3. , 3. , 0.5, 1. ])
``````

And for a window of length `4`:

``````moving_average(x, 4)
# array([6.5 , 5.75, 5.25, 4.5 , 2.25, 1.75, 2.  ])
``````

How does `convolve` work?

Let's take a more in-depth look at how the discrete convolution is computed. The following function aims to replicate the way `np.convolve` computes the output values:

``````
def mov_avg(x, w):
    # Slide a window of length w over x, one position at a time
    for m in range(len(x)-(w-1)):
        yield sum(np.ones(w) * x[m:m+w]) / w
``````

For the same example as above, this also yields:

``````list(mov_avg(x, 2))
# [4.0, 5.5, 9.0, 6.0, 1.5, 3.0, 3.0, 0.5, 1.0]
``````

So what is being done at each step is to take the inner product between the array of ones and the current window. In this case the multiplication by `np.ones(w)` is superfluous given that we are directly taking the `sum` of the sequence.
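As a quick check of that last point, here is a sketch of the same generator without the multiplication by ones, compared against `np.convolve`. The name `mov_avg_simple` is made up for this illustration.

```python
import numpy as np

def mov_avg_simple(x, w):
    # Same loop as mov_avg, but summing each window directly
    for m in range(len(x) - (w - 1)):
        yield sum(x[m:m + w]) / w

x = np.array([5, 3, 8, 10, 2, 1, 5, 1, 0, 2])
w = 4

result = np.array(list(mov_avg_simple(x, w)))
expected = np.convolve(x, np.ones(w), 'valid') / w
print(np.allclose(result, expected))  # True
```

Strictly speaking, a convolution reverses the kernel before sliding it over the input. That makes no difference here because a window of ones is symmetric.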

Below is an example of how the first outputs are computed, to make this a little clearer. Let's suppose we want a window of `w=4`:

``````[1,1,1,1]
[5,3,8,10,2,1,5,1,0,2]
= (1*5 + 1*3 + 1*8 + 1*10) / w = 6.5
``````

And the following output would be computed as:

``````  [1,1,1,1]
[5,3,8,10,2,1,5,1,0,2]
= (1*3 + 1*8 + 1*10 + 1*2) / w = 5.75
``````

And so on, returning a moving average of the sequence once all overlaps have been performed.
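To make the overlaps concrete, the sketch below lays out every complete window as a row and averages each row. It uses `np.lib.stride_tricks.sliding_window_view`, which assumes NumPy 1.20 or later. This is a different technique from `np.convolve`, shown only to illustrate which windows are being averaged.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.array([5, 3, 8, 10, 2, 1, 5, 1, 0, 2])

# One row per position where a window of length 4 fits entirely inside x
windows = sliding_window_view(x, 4)
print(windows[:2].tolist())
# [[5, 3, 8, 10], [3, 8, 10, 2]]

averages = windows.mean(axis=1)
print(averages.tolist())
# [6.5, 5.75, 5.25, 4.5, 2.25, 1.75, 2.0]
```

The first two rows are the two windows drawn above, and the averages match the output of `moving_average(x, 4)`.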

~ Answered on 2019-02-11 10:11:13

183

If you just want a straightforward non-weighted moving average, you can easily implement it with `np.cumsum`, which may be faster than FFT-based methods:

EDIT: Corrected an off-by-one indexing error in the code, spotted by Bean.

``````
import numpy as np

def moving_average(a, n=3):
    # Running totals of a
    ret = np.cumsum(a, dtype=float)
    # For i >= n - 1, ret[i] becomes the sum of the n values ending at index i
    ret[n:] = ret[n:] - ret[:-n]
    return ret[n - 1:] / n

>>> a = np.arange(20)
>>> moving_average(a)
array([  1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,  11.,
        12.,  13.,  14.,  15.,  16.,  17.,  18.])
>>> moving_average(a, n=4)
array([  1.5,   2.5,   3.5,   4.5,   5.5,   6.5,   7.5,   8.5,   9.5,
        10.5,  11.5,  12.5,  13.5,  14.5,  15.5,  16.5,  17.5])
``````
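The reason this works: if `c` is the cumulative sum of `a`, then the sum of any `n` consecutive elements is the difference between two entries of `c` that are `n` positions apart. The sketch below spells that out by prepending a zero to the cumulative sum (a presentational choice made here, not part of the code above) and checks the result against `np.convolve`. The name `moving_average_prefix` is hypothetical.

```python
import numpy as np

def moving_average_prefix(a, n=3):
    # c[k] holds the sum of the first k elements of a, so c[0] == 0
    c = np.concatenate(([0.0], np.cumsum(a, dtype=float)))
    # The sum of a[k:k+n] is c[k+n] - c[k]
    return (c[n:] - c[:-n]) / n

a = np.arange(20)
result = moving_average_prefix(a, n=4)
expected = np.convolve(a, np.ones(4), 'valid') / 4
print(np.allclose(result, expected))  # True
```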

So I guess the answer is: it is really easy to implement, and maybe numpy is already a little bloated with specialized functionality.

~ Answered on 2013-01-14 06:15:57