[algorithm] Converting a Uniform Distribution to a Normal Distribution

How can I convert a uniform distribution (as most random number generators produce, e.g. between 0.0 and 1.0) into a normal distribution? What if I want a mean and standard deviation of my choosing?

The answer is


I thing you should try this in EXCEL: =norminv(rand();0;1). This will product the random numbers which should be normally distributed with the zero mean and unite variance. "0" can be supplied with any value, so that the numbers will be of desired mean, and by changing "1", you will get the variance equal to the square of your input.

For example: =norminv(rand();50;3) will yield to the normally distributed numbers with MEAN = 50 VARIANCE = 9.


Use the central limit theorem wikipedia entry mathworld entry to your advantage.

Generate n of the uniformly distributed numbers, sum them, subtract n*0.5 and you have the output of an approximately normal distribution with mean equal to 0 and variance equal to (1/12) * (1/sqrt(N)) (see wikipedia on uniform distributions for that last one)

n=10 gives you something half decent fast. If you want something more than half decent go for tylers solution (as noted in the wikipedia entry on normal distributions)


The standard Python library module random has what you want:

normalvariate(mu, sigma)
Normal distribution. mu is the mean, and sigma is the standard deviation.

For the algorithm itself, take a look at the function in random.py in the Python library.

The manual entry is here


Changing the distribution of any function to another involves using the inverse of the function you want.

In other words, if you aim for a specific probability function p(x) you get the distribution by integrating over it -> d(x) = integral(p(x)) and use its inverse: Inv(d(x)). Now use the random probability function (which have uniform distribution) and cast the result value through the function Inv(d(x)). You should get random values cast with distribution according to the function you chose.

This is the generic math approach - by using it you can now choose any probability or distribution function you have as long as it have inverse or good inverse approximation.

Hope this helped and thanks for the small remark about using the distribution and not the probability itself.


I have the following code which maybe could help:

set.seed(123)
n <- 1000
u <- runif(n) #creates U
x <- -log(u)
y <- runif(n, max=u*sqrt((2*exp(1))/pi)) #create Y
z <- ifelse (y < dnorm(x)/2, -x, NA)
z <- ifelse ((y > dnorm(x)/2) & (y < dnorm(x)), x, z)
z <- z[!is.na(z)]

I would use Box-Muller. Two things about this:

  1. You end up with two values per iteration
    Typically, you cache one value and return the other. On the next call for a sample, you return the cached value.
  2. Box-Muller gives a Z-score
    You have to then scale the Z-score by the standard deviation and add the mean to get the full value in the normal distribution.

Q How can I convert a uniform distribution (as most random number generators produce, e.g. between 0.0 and 1.0) into a normal distribution?

  1. For software implementation I know couple random generator names which give you a pseudo uniform random sequence in [0,1] (Mersenne Twister, Linear Congruate Generator). Let's call it U(x)

  2. It is exist mathematical area which called probibility theory. First thing: If you want to model r.v. with integral distribution F then you can try just to evaluate F^-1(U(x)). In pr.theory it was proved that such r.v. will have integral distribution F.

  3. Step 2 can be appliable to generate r.v.~F without usage of any counting methods when F^-1 can be derived analytically without problems. (e.g. exp.distribution)

  4. To model normal distribution you can cacculate y1*cos(y2), where y1~is uniform in[0,2pi]. and y2 is the relei distribution.

Q: What if I want a mean and standard deviation of my choosing?

You can calculate sigma*N(0,1)+m.

It can be shown that such shifting and scaling lead to N(m,sigma)


It seems incredible that I could add something to this after eight years, but for the case of Java I would like to point readers to the Random.nextGaussian() method, which generates a Gaussian distribution with mean 0.0 and standard deviation 1.0 for you.

A simple addition and/or multiplication will change the mean and standard deviation to your needs.


There are plenty of methods:

  • Do not use Box Muller. Especially if you draw many gaussian numbers. Box Muller yields a result which is clamped between -6 and 6 (assuming double precision. Things worsen with floats.). And it is really less efficient than other available methods.
  • Ziggurat is fine, but needs a table lookup (and some platform-specific tweaking due to cache size issues)
  • Ratio-of-uniforms is my favorite, only a few addition/multiplications and a log 1/50th of the time (eg. look there).
  • Inverting the CDF is efficient (and overlooked, why ?), you have fast implementations of it available if you search google. It is mandatory for Quasi-Random numbers.

function distRandom(){
  do{
    x=random(DISTRIBUTION_DOMAIN);
  }while(random(DISTRIBUTION_RANGE)>=distributionFunction(x));
  return x;
}

The standard Python library module random has what you want:

normalvariate(mu, sigma)
Normal distribution. mu is the mean, and sigma is the standard deviation.

For the algorithm itself, take a look at the function in random.py in the Python library.

The manual entry is here


Q How can I convert a uniform distribution (as most random number generators produce, e.g. between 0.0 and 1.0) into a normal distribution?

  1. For software implementation I know couple random generator names which give you a pseudo uniform random sequence in [0,1] (Mersenne Twister, Linear Congruate Generator). Let's call it U(x)

  2. It is exist mathematical area which called probibility theory. First thing: If you want to model r.v. with integral distribution F then you can try just to evaluate F^-1(U(x)). In pr.theory it was proved that such r.v. will have integral distribution F.

  3. Step 2 can be appliable to generate r.v.~F without usage of any counting methods when F^-1 can be derived analytically without problems. (e.g. exp.distribution)

  4. To model normal distribution you can cacculate y1*cos(y2), where y1~is uniform in[0,2pi]. and y2 is the relei distribution.

Q: What if I want a mean and standard deviation of my choosing?

You can calculate sigma*N(0,1)+m.

It can be shown that such shifting and scaling lead to N(m,sigma)


Here is a javascript implementation using the polar form of the Box-Muller transformation.

/*
 * Returns member of set with a given mean and standard deviation
 * mean: mean
 * standard deviation: std_dev 
 */
function createMemberInNormalDistribution(mean,std_dev){
    return mean + (gaussRandom()*std_dev);
}

/*
 * Returns random number in normal distribution centering on 0.
 * ~95% of numbers returned should fall between -2 and 2
 * ie within two standard deviations
 */
function gaussRandom() {
    var u = 2*Math.random()-1;
    var v = 2*Math.random()-1;
    var r = u*u + v*v;
    /*if outside interval [0,1] start over*/
    if(r == 0 || r >= 1) return gaussRandom();

    var c = Math.sqrt(-2*Math.log(r)/r);
    return u*c;

    /* todo: optimize this algorithm by caching (v*c) 
     * and returning next time gaussRandom() is called.
     * left out for simplicity */
}

The standard Python library module random has what you want:

normalvariate(mu, sigma)
Normal distribution. mu is the mean, and sigma is the standard deviation.

For the algorithm itself, take a look at the function in random.py in the Python library.

The manual entry is here


I have the following code which maybe could help:

set.seed(123)
n <- 1000
u <- runif(n) #creates U
x <- -log(u)
y <- runif(n, max=u*sqrt((2*exp(1))/pi)) #create Y
z <- ifelse (y < dnorm(x)/2, -x, NA)
z <- ifelse ((y > dnorm(x)/2) & (y < dnorm(x)), x, z)
z <- z[!is.na(z)]

Here is a javascript implementation using the polar form of the Box-Muller transformation.

/*
 * Returns member of set with a given mean and standard deviation
 * mean: mean
 * standard deviation: std_dev 
 */
function createMemberInNormalDistribution(mean,std_dev){
    return mean + (gaussRandom()*std_dev);
}

/*
 * Returns random number in normal distribution centering on 0.
 * ~95% of numbers returned should fall between -2 and 2
 * ie within two standard deviations
 */
function gaussRandom() {
    var u = 2*Math.random()-1;
    var v = 2*Math.random()-1;
    var r = u*u + v*v;
    /*if outside interval [0,1] start over*/
    if(r == 0 || r >= 1) return gaussRandom();

    var c = Math.sqrt(-2*Math.log(r)/r);
    return u*c;

    /* todo: optimize this algorithm by caching (v*c) 
     * and returning next time gaussRandom() is called.
     * left out for simplicity */
}

This is my JavaScript implementation of Algorithm P (Polar method for normal deviates) from Section 3.4.1 of Donald Knuth's book The Art of Computer Programming:

function normal_random(mean,stddev)
{
    var V1
    var V2
    var S
    do{
        var U1 = Math.random() // return uniform distributed in [0,1[
        var U2 = Math.random()
        V1 = 2*U1-1
        V2 = 2*U2-1
        S = V1*V1+V2*V2
    }while(S >= 1)
    if(S===0) return 0
    return mean+stddev*(V1*Math.sqrt(-2*Math.log(S)/S))
}

function distRandom(){
  do{
    x=random(DISTRIBUTION_DOMAIN);
  }while(random(DISTRIBUTION_RANGE)>=distributionFunction(x));
  return x;
}

Use the central limit theorem wikipedia entry mathworld entry to your advantage.

Generate n of the uniformly distributed numbers, sum them, subtract n*0.5 and you have the output of an approximately normal distribution with mean equal to 0 and variance equal to (1/12) * (1/sqrt(N)) (see wikipedia on uniform distributions for that last one)

n=10 gives you something half decent fast. If you want something more than half decent go for tylers solution (as noted in the wikipedia entry on normal distributions)


I would use Box-Muller. Two things about this:

  1. You end up with two values per iteration
    Typically, you cache one value and return the other. On the next call for a sample, you return the cached value.
  2. Box-Muller gives a Z-score
    You have to then scale the Z-score by the standard deviation and add the mean to get the full value in the normal distribution.

The Ziggurat algorithm is pretty efficient for this, although the Box-Muller transform is easier to implement from scratch (and not crazy slow).


I thing you should try this in EXCEL: =norminv(rand();0;1). This will product the random numbers which should be normally distributed with the zero mean and unite variance. "0" can be supplied with any value, so that the numbers will be of desired mean, and by changing "1", you will get the variance equal to the square of your input.

For example: =norminv(rand();50;3) will yield to the normally distributed numbers with MEAN = 50 VARIANCE = 9.


Use the central limit theorem wikipedia entry mathworld entry to your advantage.

Generate n of the uniformly distributed numbers, sum them, subtract n*0.5 and you have the output of an approximately normal distribution with mean equal to 0 and variance equal to (1/12) * (1/sqrt(N)) (see wikipedia on uniform distributions for that last one)

n=10 gives you something half decent fast. If you want something more than half decent go for tylers solution (as noted in the wikipedia entry on normal distributions)


It seems incredible that I could add something to this after eight years, but for the case of Java I would like to point readers to the Random.nextGaussian() method, which generates a Gaussian distribution with mean 0.0 and standard deviation 1.0 for you.

A simple addition and/or multiplication will change the mean and standard deviation to your needs.


Changing the distribution of any function to another involves using the inverse of the function you want.

In other words, if you aim for a specific probability function p(x) you get the distribution by integrating over it -> d(x) = integral(p(x)) and use its inverse: Inv(d(x)). Now use the random probability function (which have uniform distribution) and cast the result value through the function Inv(d(x)). You should get random values cast with distribution according to the function you chose.

This is the generic math approach - by using it you can now choose any probability or distribution function you have as long as it have inverse or good inverse approximation.

Hope this helped and thanks for the small remark about using the distribution and not the probability itself.


Use the central limit theorem wikipedia entry mathworld entry to your advantage.

Generate n of the uniformly distributed numbers, sum them, subtract n*0.5 and you have the output of an approximately normal distribution with mean equal to 0 and variance equal to (1/12) * (1/sqrt(N)) (see wikipedia on uniform distributions for that last one)

n=10 gives you something half decent fast. If you want something more than half decent go for tylers solution (as noted in the wikipedia entry on normal distributions)


It is also easier to use the implemented function rnorm() since it is faster than writing a random number generator for the normal distribution. See the following code as prove

n <- length(z)
t0 <- Sys.time()
z <- rnorm(n)
t1 <- Sys.time()
t1-t0

Changing the distribution of any function to another involves using the inverse of the function you want.

In other words, if you aim for a specific probability function p(x) you get the distribution by integrating over it -> d(x) = integral(p(x)) and use its inverse: Inv(d(x)). Now use the random probability function (which have uniform distribution) and cast the result value through the function Inv(d(x)). You should get random values cast with distribution according to the function you chose.

This is the generic math approach - by using it you can now choose any probability or distribution function you have as long as it have inverse or good inverse approximation.

Hope this helped and thanks for the small remark about using the distribution and not the probability itself.


Where R1, R2 are random uniform numbers:

NORMAL DISTRIBUTION, with SD of 1:

sqrt(-2*log(R1))*cos(2*pi*R2)

This is exact... no need to do all those slow loops!

Reference: dspguide.com/ch2/6.htm


function distRandom(){
  do{
    x=random(DISTRIBUTION_DOMAIN);
  }while(random(DISTRIBUTION_RANGE)>=distributionFunction(x));
  return x;
}

This is my JavaScript implementation of Algorithm P (Polar method for normal deviates) from Section 3.4.1 of Donald Knuth's book The Art of Computer Programming:

function normal_random(mean,stddev)
{
    var V1
    var V2
    var S
    do{
        var U1 = Math.random() // return uniform distributed in [0,1[
        var U2 = Math.random()
        V1 = 2*U1-1
        V2 = 2*U2-1
        S = V1*V1+V2*V2
    }while(S >= 1)
    if(S===0) return 0
    return mean+stddev*(V1*Math.sqrt(-2*Math.log(S)/S))
}

Changing the distribution of any function to another involves using the inverse of the function you want.

In other words, if you aim for a specific probability function p(x) you get the distribution by integrating over it -> d(x) = integral(p(x)) and use its inverse: Inv(d(x)). Now use the random probability function (which have uniform distribution) and cast the result value through the function Inv(d(x)). You should get random values cast with distribution according to the function you chose.

This is the generic math approach - by using it you can now choose any probability or distribution function you have as long as it have inverse or good inverse approximation.

Hope this helped and thanks for the small remark about using the distribution and not the probability itself.


It is also easier to use the implemented function rnorm() since it is faster than writing a random number generator for the normal distribution. See the following code as prove

n <- length(z)
t0 <- Sys.time()
z <- rnorm(n)
t1 <- Sys.time()
t1-t0

Here is a javascript implementation using the polar form of the Box-Muller transformation.

/*
 * Returns member of set with a given mean and standard deviation
 * mean: mean
 * standard deviation: std_dev 
 */
function createMemberInNormalDistribution(mean,std_dev){
    return mean + (gaussRandom()*std_dev);
}

/*
 * Returns random number in normal distribution centering on 0.
 * ~95% of numbers returned should fall between -2 and 2
 * ie within two standard deviations
 */
function gaussRandom() {
    var u = 2*Math.random()-1;
    var v = 2*Math.random()-1;
    var r = u*u + v*v;
    /*if outside interval [0,1] start over*/
    if(r == 0 || r >= 1) return gaussRandom();

    var c = Math.sqrt(-2*Math.log(r)/r);
    return u*c;

    /* todo: optimize this algorithm by caching (v*c) 
     * and returning next time gaussRandom() is called.
     * left out for simplicity */
}

The Ziggurat algorithm is pretty efficient for this, although the Box-Muller transform is easier to implement from scratch (and not crazy slow).


This is a Matlab implementation using the polar form of the Box-Muller transformation:

Function randn_box_muller.m:

function [values] = randn_box_muller(n, mean, std_dev)
    if nargin == 1
       mean = 0;
       std_dev = 1;
    end

    r = gaussRandomN(n);
    values = r.*std_dev - mean;
end

function [values] = gaussRandomN(n)
    [u, v, r] = gaussRandomNValid(n);

    c = sqrt(-2*log(r)./r);
    values = u.*c;
end

function [u, v, r] = gaussRandomNValid(n)
    r = zeros(n, 1);
    u = zeros(n, 1);
    v = zeros(n, 1);

    filter = r==0 | r>=1;

    % if outside interval [0,1] start over
    while n ~= 0
        u(filter) = 2*rand(n, 1)-1;
        v(filter) = 2*rand(n, 1)-1;
        r(filter) = u(filter).*u(filter) + v(filter).*v(filter);

        filter = r==0 | r>=1;
        n = size(r(filter),1);
    end
end

And invoking histfit(randn_box_muller(10000000),100); this is the result: Box-Muller Matlab Histfit

Obviously it is really inefficient compared with the Matlab built-in randn.


The Ziggurat algorithm is pretty efficient for this, although the Box-Muller transform is easier to implement from scratch (and not crazy slow).


Where R1, R2 are random uniform numbers:

NORMAL DISTRIBUTION, with SD of 1:

sqrt(-2*log(R1))*cos(2*pi*R2)

This is exact... no need to do all those slow loops!

Reference: dspguide.com/ch2/6.htm


There are plenty of methods:

  • Do not use Box Muller. Especially if you draw many gaussian numbers. Box Muller yields a result which is clamped between -6 and 6 (assuming double precision. Things worsen with floats.). And it is really less efficient than other available methods.
  • Ziggurat is fine, but needs a table lookup (and some platform-specific tweaking due to cache size issues)
  • Ratio-of-uniforms is my favorite, only a few addition/multiplications and a log 1/50th of the time (eg. look there).
  • Inverting the CDF is efficient (and overlooked, why ?), you have fast implementations of it available if you search google. It is mandatory for Quasi-Random numbers.

Here is a javascript implementation using the polar form of the Box-Muller transformation.

/*
 * Returns member of set with a given mean and standard deviation
 * mean: mean
 * standard deviation: std_dev 
 */
function createMemberInNormalDistribution(mean,std_dev){
    return mean + (gaussRandom()*std_dev);
}

/*
 * Returns random number in normal distribution centering on 0.
 * ~95% of numbers returned should fall between -2 and 2
 * ie within two standard deviations
 */
function gaussRandom() {
    var u = 2*Math.random()-1;
    var v = 2*Math.random()-1;
    var r = u*u + v*v;
    /*if outside interval [0,1] start over*/
    if(r == 0 || r >= 1) return gaussRandom();

    var c = Math.sqrt(-2*Math.log(r)/r);
    return u*c;

    /* todo: optimize this algorithm by caching (v*c) 
     * and returning next time gaussRandom() is called.
     * left out for simplicity */
}

function distRandom(){
  do{
    x=random(DISTRIBUTION_DOMAIN);
  }while(random(DISTRIBUTION_RANGE)>=distributionFunction(x));
  return x;
}

This is a Matlab implementation using the polar form of the Box-Muller transformation:

Function randn_box_muller.m:

function [values] = randn_box_muller(n, mean, std_dev)
    if nargin == 1
       mean = 0;
       std_dev = 1;
    end

    r = gaussRandomN(n);
    values = r.*std_dev - mean;
end

function [values] = gaussRandomN(n)
    [u, v, r] = gaussRandomNValid(n);

    c = sqrt(-2*log(r)./r);
    values = u.*c;
end

function [u, v, r] = gaussRandomNValid(n)
    r = zeros(n, 1);
    u = zeros(n, 1);
    v = zeros(n, 1);

    filter = r==0 | r>=1;

    % if outside interval [0,1] start over
    while n ~= 0
        u(filter) = 2*rand(n, 1)-1;
        v(filter) = 2*rand(n, 1)-1;
        r(filter) = u(filter).*u(filter) + v(filter).*v(filter);

        filter = r==0 | r>=1;
        n = size(r(filter),1);
    end
end

And invoking histfit(randn_box_muller(10000000),100); this is the result: Box-Muller Matlab Histfit

Obviously it is really inefficient compared with the Matlab built-in randn.


The Ziggurat algorithm is pretty efficient for this, although the Box-Muller transform is easier to implement from scratch (and not crazy slow).


Examples related to algorithm

How can I tell if an algorithm is efficient? Find the smallest positive integer that does not occur in a given sequence Efficiently getting all divisors of a given number Peak signal detection in realtime timeseries data What is the optimal algorithm for the game 2048? How can I sort a std::map first by value, then by key? Finding square root without using sqrt function? Fastest way to flatten / un-flatten nested JSON objects Mergesort with Python Find common substring between two strings

Examples related to language-agnostic

IOException: The process cannot access the file 'file path' because it is being used by another process Peak signal detection in realtime timeseries data Match linebreaks - \n or \r\n? Simple way to understand Encapsulation and Abstraction How can I pair socks from a pile efficiently? How do I determine whether my calculation of pi is accurate? What is ADT? (Abstract Data Type) How to explain callbacks in plain english? How are they different from calling one function from another function? Ukkonen's suffix tree algorithm in plain English Private vs Protected - Visibility Good-Practice Concern

Examples related to random

How can I get a random number in Kotlin? scikit-learn random state in splitting dataset Random number between 0 and 1 in python In python, what is the difference between random.uniform() and random.random()? Generate random colors (RGB) Random state (Pseudo-random number) in Scikit learn How does one generate a random number in Apple's Swift language? How to generate a random string of a fixed length in Go? Generate 'n' unique random numbers within a range What does random.sample() method in python do?

Examples related to normal-distribution

How to calculate the inverse of the normal cumulative distribution function in python? Perform a Shapiro-Wilk Normality Test Seeing if data is normally distributed in R Generating random numbers with normal distribution in Excel Generate random numbers following a normal distribution in C/C++ Converting a Uniform Distribution to a Normal Distribution