About 2x faster than the versions mentioned earlier, mostly because the transform() and inner_product() loops are joined. Sorry about my shortcuts/typedefs/macro: Flo = float, CR = const ref, VFlo = vector of Flo. Tested in VS2010.
#define fe(EL, CONTAINER) for each (auto EL in CONTAINER) // VS2010; "for each ... in" is a Visual C++ extension, not standard C++
Flo stdDev(VFlo CR crVec) {
    SZ n = crVec.size();
    if (n < 2) return 0.0f; // sample std dev needs at least two values
    Flo fSqSum = 0.0f, fSum = 0.0f;
    fe(f, crVec) { fSqSum += f * f; fSum += f; } // both sums in one pass; EDIT: was Cit(VFlo, crVec) {
    Flo fSumSq = fSum * fSum;
    Flo fSumSqDivN = fSumSq / n;
    Flo fSubSqSum = fSqSum - fSumSqDivN; // sum of squared deviations from the mean
    Flo fPreSqrt = fSubSqSum / (n - 1); // sample variance (n - 1 denominator)
    return sqrt(fPreSqrt);
}
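
For anyone not on VS2010, here is a rough sketch of the same idea in standard C++11, without my typedefs and macro. The name stdDevSinglePass is made up for this example. Accumulating in double and clamping a slightly negative intermediate to zero are my own additions, not part of the code above. They are there because the sum-of-squares formula can lose precision when the values are large relative to their spread.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sample standard deviation (n - 1 denominator) computed in a single pass.
// Hypothetical helper name; assumes the caller wants 0 for fewer than 2 values,
// as in the original stdDev().
float stdDevSinglePass(const std::vector<float>& values) {
    const std::size_t n = values.size();
    if (n < 2) return 0.0f;

    // Assumption: accumulate in double to reduce rounding error.
    double sum = 0.0;
    double sqSum = 0.0;
    for (float v : values) {  // one traversal updates both sums
        sum += v;
        sqSum += static_cast<double>(v) * v;
    }

    // Sum of squared deviations: sum(x^2) - (sum(x))^2 / n
    double sumSqDev = sqSum - (sum * sum) / static_cast<double>(n);
    if (sumSqDev < 0.0) sumSqDev = 0.0;  // guard against rounding below zero

    return static_cast<float>(std::sqrt(sumSqDev / static_cast<double>(n - 1)));
}
```

Calling it is just stdDevSinglePass(myVector). If the data has a large mean and a small spread, a two-pass version (mean first, then squared deviations) or Welford's method is numerically safer, at the cost of the speed gain from the joined loop.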