Compute a confidence interval from sample data

Question

I have sample data which I would like to compute a confidence interval for, assuming a normal distribution.

I have found and installed the numpy and scipy packages and have gotten numpy to return a mean and standard deviation (numpy.mean(data), with data being a list). Any advice on getting a sample confidence interval would be much appreciated.

User · Answer

Here is a shortened version of shasan's code, calculating the 95% confidence interval of the mean of array a:

    import numpy as np, scipy.stats as st

    st.t.interval(0.95, len(a)-1, loc=np.mean(a), scale=st.sem(a))

But using StatsModels' tconfint_mean is arguably even nicer:

    import statsmodels.stats.api as sms

    sms.DescrStatsW(a).tconfint_mean()

The underlying assumptions for both are that the sample (array a) was drawn independently from a normal distribution with unknown standard deviation (see MathWorld or Wikipedia).

For large sample size n, the sample mean is normally distributed, and one can calculate its confidence interval using st.norm.interval() (as suggested in Jaime's comment). But the above solutions are also correct for small n, where st.norm.interval() gives confidence intervals that are too narrow (i.e., "fake confidence"). See my answer to a similar question for more details (and one of Russ's comments here).

Here is an example where the correct options give (essentially) identical confidence intervals:

    In [9]: a = range(10,14)

    In [10]: mean_confidence_interval(a)
    Out[10]: (11.5, 9.4457397432391215, 13.554260256760879)

    In [11]: st.t.interval(0.95, len(a)-1, loc=np.mean(a), scale=st.sem(a))
    Out[11]: (9.4457397432391215, 13.554260256760879)

    In [12]: sms.DescrStatsW(a).tconfint_mean()
    Out[12]: (9.4457397432391197, 13.55426025676088)

And finally, the incorrect result using st.norm.interval():

    In [13]: st.norm.interval(0.95, loc=np.mean(a), scale=st.sem(a))
    Out[13]: (10.23484868811834, 12.76515131188166)
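To get a feel for why the normal-based interval is too narrow for small n, a small simulation can help. The sketch below is not part of the original answer: it uses only the standard library, the function name coverage is hypothetical, and the Student's t critical value for 3 degrees of freedom (about 3.182) is hard-coded as an assumption rather than computed with scipy. It repeatedly draws samples of size 4 from a normal distribution and counts how often each kind of nominal 95% interval contains the true mean.

```python
import random
from statistics import NormalDist, fmean, stdev


def coverage(critical_value, n=4, trials=20_000, mu=0.0, sigma=1.0, seed=0):
    """Fraction of simulated samples whose interval contains the true mean.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        m = fmean(sample)
        # Half-width: critical value times the standard error s / sqrt(n).
        h = critical_value * stdev(sample) / n ** 0.5
        if m - h <= mu <= m + h:
            hits += 1
    return hits / trials


z_crit = NormalDist().inv_cdf(0.975)  # about 1.96
t_crit = 3.182446                     # assumed: two-sided 95% t value, 3 df

print("normal-based coverage:", coverage(z_crit))
print("t-based coverage:     ", coverage(t_crit))
```

With n = 4, the normal-based interval should cover the true mean noticeably less often than 95% of the time, while the t-based interval should come out close to 95%.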

User · Answer

Start by looking up the z-value for your desired confidence level in a look-up table. The confidence interval is then mean +/- z*sigma, where sigma is the estimated standard deviation of your sample mean, given by sigma = s / sqrt(n), where s is the standard deviation computed from your sample data and n is your sample size.
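As a rough illustration of this recipe, the sketch below hard-codes a tiny look-up table of common two-sided z-values and applies mean +/- z * s / sqrt(n). The names Z_TABLE and z_interval are hypothetical and only three confidence levels are included; like any z-based interval, it is only appropriate when the sample is reasonably large.

```python
from math import sqrt
from statistics import fmean, stdev

# Two-sided z-values for common confidence levels (a minimal look-up table).
Z_TABLE = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}


def z_interval(data, confidence=0.95):
    """Return (lower, upper) for mean +/- z * s / sqrt(n)."""
    z = Z_TABLE[confidence]          # KeyError for levels not in the table
    n = len(data)
    mean = fmean(data)
    sigma = stdev(data) / sqrt(n)    # estimated std. deviation of the sample mean
    return mean - z * sigma, mean + z * sigma


print(z_interval([10, 11, 12, 13]))
```

For the four-point sample used here the result is close to (10.235, 12.765), which is narrower than the corresponding t-based interval for such a small sample.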

User · Answer

Starting with Python 3.8, the standard library provides the NormalDist object as part of the statistics module:

    from statistics import NormalDist

    def confidence_interval(data, confidence=0.95):
        dist = NormalDist.from_samples(data)
        z = NormalDist().inv_cdf((1 + confidence) / 2.)
        # Standard error of the mean is s / sqrt(n); from_samples already
        # estimates s with the sample (n - 1) standard deviation.
        h = dist.stdev * z / (len(data) ** .5)
        return dist.mean - h, dist.mean + h

This:

- Creates a NormalDist object from the data sample (NormalDist.from_samples(data)), which gives us access to the sample's mean and standard deviation via NormalDist.mean and NormalDist.stdev.
- Computes the z-score based on the standard normal distribution (represented by NormalDist()) for the given confidence, using the inverse of the cumulative distribution function (inv_cdf).
- Produces the confidence interval based on the sample's standard deviation and mean.

This assumes the sample size is big enough (let's say more than ~100 points) to use the standard normal distribution rather than the Student's t distribution to compute the z value.

User · Answer

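You can calculate it this way:

    import numpy as np
    import scipy.stats

    def mean_confidence_interval(data, confidence=0.95):
        a = 1.0 * np.array(data)  # convert to a float array
        n = len(a)
        m, se = np.mean(a), scipy.stats.sem(a)  # mean and standard error
        # Half-width from the Student's t distribution with n-1 degrees of freedom
        h = se * scipy.stats.t.ppf((1 + confidence) / 2., n-1)
        return m, m-h, m+h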
