I was a bit puzzled that norm.fit
apparently only worked with the expanded list of sampled values. I tried giving it two lists of numbers, or lists of tuples, but it only appeared to flatten everything and threat the input as individual samples. Since I already have a histogram based on millions of samples, I didn't want to expand this if I didn't have to. Thankfully, the normal distribution is trivial to calculate, so...
# histogram is [(val,count)]
from math import sqrt
def normfit(hist):
n,s,ss = univar(hist)
mu = s/n
var = ss/n-mu*mu
return (mu, sqrt(var))
def univar(hist):
n = 0
s = 0
ss = 0
for v,c in hist:
n += c
s += c*v
ss += c*v*v
return n, s, ss
I'm sure this must be provided by the libraries, but as I couldn't find it anywhere, I'm posting this here instead. Feel free to point to the correct way to do it and downvote me :-)