Calculate mean and standard deviation from a vector of samples in C++ using Boost

Question

Is there a way to calculate the mean and standard deviation for a vector containing samples using Boost? Or do I have to create an accumulator and feed the vector into it?

User · Answer

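About 2x faster than the versions mentioned before, mostly because the transform() and inner_product() loops are joined. Sorry about my shortcuts, typedefs and macro: Flo = float, CR = const ref, VFlo = vector of float. Tested in VS2010.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    typedef float Flo;
    typedef std::vector<Flo> VFlo;
    typedef std::size_t SZ;   // assumed: SZ is not spelled out in the post
    #define CR const&
    #define fe(EL, CONTAINER) for each (auto EL in CONTAINER)   // VS2010 (MSVC-only syntax)

    Flo stdDev(VFlo CR crVec) {
        SZ n = crVec.size();
        if (n < 2) return 0.0f;
        Flo fSqSum = 0.0f, fSum = 0.0f;
        fe(f, crVec) fSqSum += f * f;   // EDIT: was Cit(VFlo, crVec)
        fe(f, crVec) fSum   += f;
        Flo fSumSq     = fSum * fSum;
        Flo fSumSqDivN = fSumSq / n;
        Flo fSubSqSum  = fSqSum - fSumSqDivN;
        Flo fPreSqrt   = fSubSqSum / (n - 1);
        return sqrt(fPreSqrt);
    }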

User · Answer

Mean deviation in C++.

A deviation that is the difference between an observed value and the true value of a quantity of interest (such as a population mean) is an error, and a deviation that is the difference between the observed value and an estimate of the true value (such an estimate may be a sample mean) is a residual. These concepts are applicable for data at the interval and ratio levels of measurement.

    #include <cmath>
    #include <iostream>
    #include <vector>
    using namespace std;

    int main()
    {
        int cnt;
        cout << "Please enter count:\t";
        cin >> cnt;
        if (!cin || cnt <= 0) return 1;   // avoid dividing by zero below

        vector<float> num(cnt);
        float sum = 0;
        for (int i = 0; i < cnt; i++) {
            cin >> num[i];
            sum += num[i];
        }
        float ave = sum / cnt;

        float M = 0;   // running sum of absolute deviations
        for (int i = 0; i < cnt; i++) {
            float s = fabs(ave - num[i]);
            cout << "\n|ave - number| = " << s;
            M += s;
        }
        float M_D = M / cnt;

        cout << "\n\n Average:              " << ave;
        cout << "\n M.D (Mean Deviation): " << M_D << endl;
        return 0;
    }

User · Answer

It seems the following elegant recursive solution has not been mentioned, although it has been around for a long time. Referring to Knuth's Art of Computer Programming:

    mean_1 = x_1, variance_1 = 0;    // initial conditions; edge case

    // for k >= 2:
    mean_k     = mean_{k-1} + (x_k - mean_{k-1}) / k;
    variance_k = variance_{k-1} + (x_k - mean_{k-1}) * (x_k - mean_k);

Note that variance_k here is the running sum of squared deviations from the mean; the division happens only at the end. For a list of n >= 2 values, the estimate of the standard deviation is then:

    stddev = std::sqrt(variance_n / (n-1));

Hope this helps!
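The recurrence above could be sketched in C++ roughly as follows. This is a minimal illustration rather than the answer author's code: the `RunningStats` type, its member names, and the `summarize` helper are hypothetical names introduced here, and the choice to return 0 for fewer than two samples is an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type: keeps the running mean and the running sum of
// squared deviations (called variance_k in the recurrence above).
struct RunningStats {
    std::size_t n = 0;   // number of samples seen so far
    double mean = 0.0;   // mean_k
    double m2 = 0.0;     // variance_k: sum of squared deviations, not yet divided

    void push(double x) {
        ++n;
        const double delta = x - mean;   // x_k - mean_{k-1}
        mean += delta / n;               // mean_k
        m2 += delta * (x - mean);        // (x_k - mean_{k-1}) * (x_k - mean_k)
    }

    // Sample standard deviation (divides by n - 1).
    // Assumption: report 0 when fewer than two samples have been seen.
    double stddev() const {
        return n < 2 ? 0.0 : std::sqrt(m2 / (n - 1));
    }
};

// Hypothetical convenience wrapper: feed a whole vector through the recurrence.
RunningStats summarize(const std::vector<double>& v) {
    RunningStats s;
    for (double x : v) s.push(x);
    return s;
}
```

Because each sample is consumed once and no intermediate vector is needed, the same structure also works for streaming data where the samples never sit in a container at all.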

User · Answer

Improving on the answer by musiphil, you can write a standard deviation function without the temporary vector diff, just using a single inner_product call with the C++11 lambda capabilities:

    #include <cmath>
    #include <numeric>
    #include <vector>

    double stddev(std::vector<double> const & func)
    {
        double mean = std::accumulate(func.begin(), func.end(), 0.0) / func.size();
        double sq_sum = std::inner_product(func.begin(), func.end(), func.begin(), 0.0,
            [](double const & x, double const & y) { return x + y; },
            [mean](double const & x, double const & y) { return (x - mean) * (y - mean); });
        return std::sqrt(sq_sum / func.size());
    }

I suspect doing the subtraction multiple times is cheaper than using up additional intermediate storage, and I think it is more readable, but I haven't tested the performance yet.

User · Answer

I don't know if Boost has more specific functions, but you can do it with the standard library.

Given std::vector<double> v, this is the naive way:

    #include <cmath>
    #include <numeric>

    double sum = std::accumulate(v.begin(), v.end(), 0.0);
    double mean = sum / v.size();

    double sq_sum = std::inner_product(v.begin(), v.end(), v.begin(), 0.0);
    double stdev = std::sqrt(sq_sum / v.size() - mean * mean);

This is susceptible to overflow or underflow for huge or tiny values. A slightly better way to calculate the standard deviation is:

    #include <algorithm>
    #include <functional>
    #include <vector>

    double sum = std::accumulate(v.begin(), v.end(), 0.0);
    double mean = sum / v.size();

    std::vector<double> diff(v.size());
    std::transform(v.begin(), v.end(), diff.begin(),
                   std::bind2nd(std::minus<double>(), mean));
    double sq_sum = std::inner_product(diff.begin(), diff.end(), diff.begin(), 0.0);
    double stdev = std::sqrt(sq_sum / v.size());

UPDATE for C++11:

The call to std::transform can be written using a lambda function instead of std::minus and std::bind2nd (now deprecated, and removed in C++17):

    std::transform(v.begin(), v.end(), diff.begin(), [mean](double x) { return x - mean; });
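To make the difference between the two approaches concrete, here is a small sketch that wraps each formula in a function so they can be compared on the same data. The function names `stdev_naive` and `stdev_two_pass` are hypothetical, and the two-pass version uses a plain loop instead of the temporary diff vector; both divide by the number of samples, as in the answer.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Naive form: sqrt(mean of squares - square of mean).
// Hypothetical name; mirrors the first snippet in the answer.
double stdev_naive(const std::vector<double>& v) {
    double sum = std::accumulate(v.begin(), v.end(), 0.0);
    double mean = sum / v.size();
    double sq_sum = std::inner_product(v.begin(), v.end(), v.begin(), 0.0);
    return std::sqrt(sq_sum / v.size() - mean * mean);
}

// Two-pass form: subtract the mean first, then sum the squared differences.
// Hypothetical name; same arithmetic as the second snippet, without the diff vector.
double stdev_two_pass(const std::vector<double>& v) {
    double mean = std::accumulate(v.begin(), v.end(), 0.0) / v.size();
    double sq_sum = 0.0;
    for (double x : v) sq_sum += (x - mean) * (x - mean);
    return std::sqrt(sq_sum / v.size());
}
```

For small, well-scaled inputs the two functions agree. For inputs with a large offset and a small spread, such as {1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16}, the two-pass form returns sqrt(22.5), about 4.743, while the naive form subtracts two nearly equal numbers of order 1e18 and can lose most of its significant digits, or even produce NaN if the difference comes out negative.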

User · Answer

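Using accumulators is the way to compute means and standard deviations in Boost.

    accumulator_set<double, stats<tag::variance> > acc;
    for_each(a_vec.begin(), a_vec.end(), bind<void>(ref(acc), _1));

    cout << mean(acc) << endl;
    cout << sqrt(variance(acc)) << endl;

The snippet assumes the Boost.Accumulators and Boost.Bind headers are included and that the boost, boost::accumulators and std names are in scope.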

User · Answer

If performance is important to you, and your compiler supports lambdas, the stdev calculation can be made faster and simpler. In tests with VS 2012 I've found that the following code is over 10x quicker than the Boost code given in the chosen answer; it's also 5x quicker than the safer version of the answer using standard libraries given by musiphil.

Note that I'm using the sample standard deviation, so the code below gives slightly different results (see "Why there is a Minus One in Standard Deviations").

    double sum = std::accumulate(std::begin(v), std::end(v), 0.0);
    double m = sum / v.size();

    double accum = 0.0;
    std::for_each(std::begin(v), std::end(v), [&](const double d) {
        accum += (d - m) * (d - m);
    });

    double stdev = sqrt(accum / (v.size() - 1));

User · Answer

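Create your own container:

    #include <cmath>
    #include <list>
    #include <numeric>

    template <class T>
    class statList : public std::list<T>
    {
    public:
        statList() : std::list<T>() {}
        ~statList() {}

        T mean() const {
            return std::accumulate(this->begin(), this->end(), 0.0) / this->size();
        }

        // Population standard deviation (divides by size()).
        T stddev() const {
            T diff_sum = 0;
            T m = mean();
            for (typename std::list<T>::const_iterator it = this->begin(); it != this->end(); ++it)
                diff_sum += (*it - m) * (*it - m);
            return std::sqrt(diff_sum / this->size());
        }
    };

It does have some limitations (for example, T should be a floating-point type, the list must not be empty, and std::list is not designed to be used as a polymorphic base class), but it works beautifully when you know what you are doing.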

User · Answer

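My answer is similar to Josh Greifer's but generalised to sample covariance. Sample variance is just sample covariance with the two inputs identical. This includes Bessel's correction.

    #include <cstddef>
    #include <iterator>
    #include <numeric>

    // Container must provide size() and at(), e.g. std::vector<double>;
    // x and y must have the same length, and at least two elements.
    template <class Container>
    typename Container::value_type cov(const Container &x, const Container &y)
    {
        double sum_x = std::accumulate(std::begin(x), std::end(x), 0.0);
        double sum_y = std::accumulate(std::begin(y), std::end(y), 0.0);

        double mx = sum_x / x.size();
        double my = sum_y / y.size();

        double accum = 0.0;

        for (std::size_t i = 0; i < x.size(); i++)
        {
            accum += (x.at(i) - mx) * (y.at(i) - my);
        }

        return accum / (x.size() - 1);
    }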
