[moving-average] How to calculate moving average without keeping the count and data-total?

I am trying to find a way to calculate a moving cumulative average without storing the count and total data that is received so far.

I came up with two algorithms but both need to store the count:

  • new average = ((old count * old data) + next data) / next count
  • new average = old average + (next data - old average) / next count

The problem with these methods is that the count gets bigger and bigger resulting in losing precision in the resulting average.

The first method uses the old count and next count which are obviously 1 apart. This got me thinking that perhaps there is a way to remove the count but unfortunately I haven't found it yet. It did get me a bit further though, resulting in the second method but still count is present.

Is it possible, or am I just searching for the impossible?

This question is related to moving-average

The answer is


An example using javascript, for comparison:

https://jsfiddle.net/drzaus/Lxsa4rpz/

function calcNormalAvg(list) {
    // sum(list) / len(list)
    return list.reduce(function(a, b) { return a + b; }) / list.length;
}
function calcRunningAvg(previousAverage, currentNumber, index) {
    // [ avg' * (n-1) + x ] / n
    return ( previousAverage * (index - 1) + currentNumber ) / index;
}

_x000D_
_x000D_
(function(){_x000D_
  // populate base list_x000D_
var list = [];_x000D_
function getSeedNumber() { return Math.random()*100; }_x000D_
for(var i = 0; i < 50; i++) list.push( getSeedNumber() );_x000D_
_x000D_
  // our calculation functions, for comparison_x000D_
function calcNormalAvg(list) {_x000D_
   // sum(list) / len(list)_x000D_
 return list.reduce(function(a, b) { return a + b; }) / list.length;_x000D_
}_x000D_
function calcRunningAvg(previousAverage, currentNumber, index) {_x000D_
   // [ avg' * (n-1) + x ] / n_x000D_
 return ( previousAverage * (index - 1) + currentNumber ) / index;_x000D_
}_x000D_
  function calcMovingAvg(accumulator, new_value, alpha) {_x000D_
   return (alpha * new_value) + (1.0 - alpha) * accumulator;_x000D_
}_x000D_
_x000D_
  // start our baseline_x000D_
var baseAvg = calcNormalAvg(list);_x000D_
var runningAvg = baseAvg, movingAvg = baseAvg;_x000D_
console.log('base avg: %d', baseAvg);_x000D_
  _x000D_
  var okay = true;_x000D_
  _x000D_
  // table of output, cleaner console view_x000D_
  var results = [];_x000D_
_x000D_
  // add 10 more numbers to the list and compare calculations_x000D_
for(var n = list.length, i = 0; i < 10; i++, n++) {_x000D_
 var newNumber = getSeedNumber();_x000D_
_x000D_
 runningAvg = calcRunningAvg(runningAvg, newNumber, n+1);_x000D_
 movingAvg = calcMovingAvg(movingAvg, newNumber, 1/(n+1));_x000D_
_x000D_
 list.push(newNumber);_x000D_
 baseAvg = calcNormalAvg(list);_x000D_
_x000D_
 // assert and inspect_x000D_
 console.log('added [%d] to list at pos %d, running avg = %d vs. regular avg = %d (%s), vs. moving avg = %d (%s)'_x000D_
  , newNumber, list.length, runningAvg, baseAvg, runningAvg == baseAvg, movingAvg, movingAvg == baseAvg_x000D_
 )_x000D_
results.push( {x: newNumber, n:list.length, regular: baseAvg, running: runningAvg, moving: movingAvg, eqRun: baseAvg == runningAvg, eqMov: baseAvg == movingAvg } );_x000D_
_x000D_
if(runningAvg != baseAvg) console.warn('Fail!');_x000D_
okay = okay && (runningAvg == baseAvg);    _x000D_
}_x000D_
  _x000D_
  console.log('Everything matched for running avg? %s', okay);_x000D_
  if(console.table) console.table(results);_x000D_
})();
_x000D_
_x000D_
_x000D_


New average = old average * (n-1)/n + new value /n

This is assuming the count only changed by one value. In case it is changed by M values then:

new average = old average * (n-len(M))/n + (sum of values in M)/n).

This is the mathematical formula (I believe the most efficient one), believe you can do further code by yourselves


A neat Python solution based on the above answers:

class RunningAverage():
    def __init__(self):
        self.average = 0
        self.n = 0
        
    def __call__(self, new_value):
        self.n += 1
        self.average = (self.average * (self.n-1) + new_value) / self.n 
        
    def __float__(self):
        return self.average
    
    def __repr__(self):
        return "average: " + str(self.average)

usage:

x = RunningAverage()
x(0)
x(2)
x(4)
print(x)

Here's yet another answer offering commentary on how Muis, Abdullah Al-Ageel and Flip's answer are all mathematically the same thing except written differently.

Sure, we have José Manuel Ramos's analysis explaining how rounding errors affect each slightly differently, but that's implementation dependent and would change based on how each answer were applied to code.

There is however a rather big difference

It's in Muis's N, Flip's k, and Abdullah Al-Ageel's n. Abdullah Al-Ageel doesn't quite explain what n should be, but N and k differ in that N is "the number of samples where you want to average over" while k is the count of values sampled. (Although I have doubts to whether calling N the number of samples is accurate.)

And here we come to the answer below. It's essentially the same old exponential weighted moving average as the others, so if you were looking for an alternative, stop right here.

Exponential weighted moving average

Initially:

average = 0
counter = 0

For each value:

counter += 1
average = average + (value - average) / min(counter, FACTOR)

The difference is the min(counter, FACTOR) part. This is the same as saying min(Flip's k, Muis's N).

FACTOR is a constant that affects how quickly the average "catches up" to the latest trend. Smaller the number the faster. (At 1 it's no longer an average and just becomes the latest value.)

This answer requires the running counter counter. If problematic, the min(counter, FACTOR) can be replaced with just FACTOR, turning it into Muis's answer. The problem with doing this is the moving average is affected by whatever average is initiallized to. If it was initialized to 0, that zero can take a long time to work its way out of the average.

How it ends up looking

Exponential moving average


You can simply do:

double approxRollingAverage (double avg, double new_sample) {

    avg -= avg / N;
    avg += new_sample / N;

    return avg;
}

Where N is the number of samples where you want to average over. Note that this approximation is equivalent to an exponential moving average. See: Calculate rolling / moving average in C++


The answer of Flip is computationally more consistent than the Muis one.

Using double number format, you could see the roundoff problem in the Muis approach:

The Muis approach

When you divide and subtract, a roundoff appears in the previous stored value, changing it.

However, the Flip approach preserves the stored value and reduces the number of divisions, hence, reducing the roundoff, and minimizing the error propagated to the stored value. Adding only will bring up roundoffs if there is something to add (when N is big, there is nothing to add)

The Flip approach

Those changes are remarkable when you make a mean of big values tend their mean to zero.

I show you the results using a spreadsheet program:

Firstly, the results obtained: Results

The A and B columns are the n and X_n values, respectively.

The C column is the Flip approach, and the D one is the Muis approach, the result stored in the mean. The E column corresponds with the medium value used in the computation.

A graph showing the mean of even values is the next one:

Graph

As you can see, there is big differences between both approachs.


From a blog on running sample variance calculations, where the mean is also calculated using Welford's method:

enter image description here

Too bad we can't upload SVG images.


In Java8:

LongSummaryStatistics movingAverage = new LongSummaryStatistics();
movingAverage.accept(new data);
...
average = movingAverage.getAverage();

you have also IntSummaryStatistics, DoubleSummaryStatistics ...