[algorithm] Expand a random range from 1–5 to 1–7

Given a function which produces a random integer in the range 1 to 5, write a function which produces a random integer in the range 1 to 7.

  1. What is a simple solution?
  2. What is an effective solution to reduce memory usage or run on a slower CPU?

In PHP:

function rand1to7() {
    do {
        $output_value = 0;
        for ($i = 0; $i < 28; $i++) {
            $output_value += rand1to5();
        }
    } while ($output_value == 140); // reject 140 so the total stays in 28..139
    $output_value -= 12;
    return floor($output_value / 16);
}

This loops to produce a random number between 16 and 127, divides by sixteen to create a float between 1 and 7.9375, then rounds down to get an int between 1 and 7. If I am not mistaken, there is a 16/112 chance of getting any one of the 7 outcomes.


  1. What is a simple solution? (rand5() + rand5()) % 7 + 1
  2. What is an effective solution to reduce memory usage or run on a slower CPU? Yes, this is effective, as it calls rand5() only twice and has O(1) space complexity.

Assume rand5() gives random numbers from 1 to 5 (inclusive).
(1 + 1) % 7 + 1 = 3
(1 + 2) % 7 + 1 = 4
(1 + 3) % 7 + 1 = 5
(1 + 4) % 7 + 1 = 6
(1 + 5) % 7 + 1 = 7

(2 + 1) % 7 + 1 = 4
(2 + 2) % 7 + 1 = 5
(2 + 3) % 7 + 1 = 6
(2 + 4) % 7 + 1 = 7
(2 + 5) % 7 + 1 = 1
...

(5 + 1) % 7 + 1 = 7
(5 + 2) % 7 + 1 = 1
(5 + 3) % 7 + 1 = 2
(5 + 4) % 7 + 1 = 3
(5 + 5) % 7 + 1 = 4
...

and so on
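
The table above can be completed exhaustively. Here is a minimal sketch in Python (the name `pair_sum_counts` is made up for this illustration), assuming rand5() is uniform on 1..5 so that all 25 ordered pairs are equally likely. It tallies how often each result of (a + b) % 7 + 1 occurs:

```python
from collections import Counter
from itertools import product

def pair_sum_counts():
    """Tally (a + b) % 7 + 1 over all 25 equally likely (a, b) pairs."""
    return Counter((a + b) % 7 + 1 for a, b in product(range(1, 6), repeat=2))

counts = pair_sum_counts()
for value in range(1, 8):
    print(value, counts[value])
```

A uniform generator would need every value to appear equally often, which 25 outcomes spread over 7 values cannot do. In this tally, 7 appears 5 times while 2 through 5 appear 3 times each.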


Are homework problems allowed here?

This function does crude "base 5" math to generate a number between 0 and 6.

function rnd7() {
    do {
        r1 = rnd5() - 1;
        do {
            r2=rnd5() - 1;
        } while (r2 > 1);
        result = r2 * 5 + r1;
    } while (result > 6);
    return result + 1;
}

It would be cool if someone could give me feedback on this one. I used the JUnit-without-assert pattern because it's easy and fast to get running in Eclipse; I could also have just defined a main method. By the way, I am assuming rand5 gives values 0-4 (adding 1 would make it 1-5), and the same with rand7. So the discussion should be on the solution and its distribution, not on whether it goes from 0-4 or 1-5.

package random;

import java.util.Random;

import org.junit.Test;

public class RandomTest {


    @Test
    public void testName() throws Exception {
        long times = 100000000;
        int indexes[] = new int[7];
        for(int i = 0; i < times; i++) {
            int rand7 = rand7();
            indexes[rand7]++;
        }

        for(int i = 0; i < 7; i++)
            System.out.println("Value " + i + ": " + indexes[i]);
    }


    public int rand7() {
        return (rand5() + rand5() + rand5() + rand5() + rand5() + rand5() + rand5()) % 7;
    }


    public int rand5() {
        return new Random().nextInt(5);
    }


}

When I run it, I get this result:

Value 0: 14308087
Value 1: 14298303
Value 2: 14279731
Value 3: 14262533
Value 4: 14269749
Value 5: 14277560
Value 6: 14304037

This seems like a very fair distribution, doesn't it?

If I add rand5() fewer or more times (where the number of times is not divisible by 7), the distribution clearly shows offsets. For instance, adding rand5() 3 times:

Value 0: 15199685
Value 1: 14402429
Value 2: 12795649
Value 3: 12796957
Value 4: 14402252
Value 5: 15202778
Value 6: 15200250

So, this would lead to the following:

public int rand(int range) {
    int randomValue = 0;
    for(int i = 0; i < range; i++) {
        randomValue += rand5();
    }
    return randomValue % range;

}

And then, I could go further:

public static final int ORIGN_RANGE = 5;
public static final int DEST_RANGE  = 7;

@Test
public void testName() throws Exception {
    long times = 100000000;
    int indexes[] = new int[DEST_RANGE];
    for(int i = 0; i < times; i++) {
        int rand7 = convertRand(DEST_RANGE, ORIGN_RANGE);
        indexes[rand7]++;
    }

    for(int i = 0; i < DEST_RANGE; i++)
        System.out.println("Value " + i + ": " + indexes[i]);
}


public int convertRand(int destRange, int originRange) {
    int randomValue = 0;
    for(int i = 0; i < destRange; i++) {
        randomValue += rand(originRange);
    }
    return randomValue % destRange;

}


public int rand(int range) {
    return new Random().nextInt(range);
}

I tried this replacing the destRange and originRange with various values (even 7 for ORIGIN and 13 for DEST), and I get this distribution:

Value 0: 7713763
Value 1: 7706552
Value 2: 7694697
Value 3: 7695319
Value 4: 7688617
Value 5: 7681691
Value 6: 7674798
Value 7: 7680348
Value 8: 7685286
Value 9: 7683943
Value 10: 7690283
Value 11: 7699142
Value 12: 7705561

What I can conclude from this is that you can convert any random generator to any other by summing the origin random "destination" times. This gives a kind of Gaussian distribution (the middle values being more likely, and the edge values less common). However, the modulus of the destination seems to distribute itself evenly across this Gaussian distribution. It would be great to have feedback from a mathematician.

What is cool is that the cost is 100% predictable and constant, whereas other solutions cause a small probability of infinite loop...
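
The question about the exact distribution can be settled by enumeration rather than sampling. Below is a sketch under the same assumption as the test above (rand5 uniform on 0..4); `exact_counts` is a name chosen for this example. Because 5^7 = 78125 is not a multiple of 7, the seven counts cannot all be equal, although they come out close:

```python
from collections import Counter
from itertools import product

def exact_counts(dest_range=7, origin_range=5):
    """Count (sum of dest_range rolls) % dest_range over every possible roll sequence."""
    counts = Counter()
    for rolls in product(range(origin_range), repeat=dest_range):
        counts[sum(rolls) % dest_range] += 1
    return [counts[v] for v in range(dest_range)]

counts = exact_counts()
print(counts, "spread:", max(counts) - min(counts))
```

The spread is a fraction of a percent of each count, which appears consistent with the small but systematic differences between the sampled counts above. The method is therefore close to uniform, not exactly uniform.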


This algorithm reduces the number of calls to rand5 to the theoretical minimum of 7/5: calling it 7 times produces the next 5 rand7 numbers.

No random bit is rejected, and there is NO possibility of waiting forever for the result.

#!/usr/bin/env ruby

# random integer from 1 to 5
def rand5
    STDERR.putc '.'
    1 + rand( 5 )
end

@bucket = 0
@bucket_size = 0

# random integer from 1 to 7
def rand7
    if @bucket_size == 0
        @bucket = 7.times.collect{ |d| rand5 * 5**d }.reduce( &:+ )
        @bucket_size = 5
    end

    next_rand7 = @bucket%7 + 1

    @bucket      /= 7
    @bucket_size -= 1

    return next_rand7
end

35.times.each{ putc rand7.to_s }

Here's my answer:

static struct rand_buffer {
  unsigned v, count;
} buf2, buf3;

void push (struct rand_buffer *buf, unsigned n, unsigned v)
{
  buf->v = buf->v * n + v;
  ++buf->count;
}

#define PUSH(n, v)  push (&buf##n, n, v)

int rand16 (void)
{
  int v = buf2.v & 0xf;
  buf2.v >>= 4;
  buf2.count -= 4;
  return v;
}

int rand9 (void)
{
  int v = buf3.v % 9;
  buf3.v /= 9;
  buf3.count -= 2;
  return v;
}

int rand7 (void)
{
  if (buf3.count >= 2) {
    int v = rand9 ();

    if (v < 7)
      return v % 7 + 1;

    PUSH (2, v - 7);
  }

  for (;;) {
    if (buf2.count >= 4) {
      int v = rand16 ();

      if (v < 14) {
        PUSH (2, v / 7);
        return v % 7 + 1;
      }

      PUSH (2, v - 14);
    }

    // Get a number between 0 and 24
    int v = 5 * (rand5 () - 1) + rand5 () - 1;

    if (v < 21) {
      PUSH (3, v / 7);
      return v % 7 + 1;
    }

    v -= 21;
    PUSH (2, v & 1);
    PUSH (2, v >> 1);
  }
}

It's a little more complicated than others, but I believe it minimises the calls to rand5. As with other solutions, there's a small probability that it could loop for a long time.


The following produces a uniform distribution on {1, 2, 3, 4, 5, 6, 7} using a random number generator producing a uniform distribution on {1, 2, 3, 4, 5}. The code is messy, but the logic is clear.

public static int random_7(Random rg) {
    int returnValue = 0;
    while (returnValue == 0) {
        for (int i = 1; i <= 3; i++) {
            returnValue = (returnValue << 1) + SimulateFairCoin(rg);
        }
    }
    return returnValue;
}

private static int SimulateFairCoin(Random rg) {
    while (true) {
        int flipOne = random_5_mod_2(rg);
        int flipTwo = random_5_mod_2(rg);

        if (flipOne == 0 && flipTwo == 1) {
            return 0;
        }
        else if (flipOne == 1 && flipTwo == 0) {
            return 1;
        }
    }
}

private static int random_5_mod_2(Random rg) {
    return random_5(rg) % 2;
}

private static int random_5(Random rg) {
    return rg.Next(5) + 1;
}    

First, I shift random5() by one point six times, to get 7 random numbers. Second, I add the 7 numbers to obtain their sum. Third, I take the remainder of dividing the sum by 7. Last, I add 1 to get results from 1 to 7. This method gives an almost equal probability of getting each number in the range from 1 to 7, with the exception of 1, which has a slightly higher probability.

public int random7(){
    Random random = new Random();
    //function (1 + random.nextInt(5)) is given
    int random1_5 = 1 + random.nextInt(5); // 1,2,3,4,5
    int random2_6 = 2 + random.nextInt(5); // 2,3,4,5,6
    int random3_7 = 3 + random.nextInt(5); // 3,4,5,6,7
    int random4_8 = 4 + random.nextInt(5); // 4,5,6,7,8
    int random5_9 = 5 + random.nextInt(5); // 5,6,7,8,9
    int random6_10 = 6 + random.nextInt(5); //6,7,8,9,10
    int random7_11 = 7 + random.nextInt(5); //7,8,9,10,11

    //sumOfRandoms is between 28 and 56
    int sumOfRandoms = random1_5 + random2_6 + random3_7 + 
                       random4_8 + random5_9 + random6_10 + random7_11;
    //result is number between 0 and 6, and
    //equals 0 if sumOfRandoms = 28 or 35 or 42 or 49 or 56 , 5 options
    //equals 1 if sumOfRandoms = 29 or 36 or 43 or 50, 4 options
    //equals 2 if sumOfRandoms = 30 or 37 or 44 or 51, 4 options
    //equals 3 if sumOfRandoms = 31 or 38 or 45 or 52, 4 options
    //equals 4 if sumOfRandoms = 32 or 39 or 46 or 53, 4 options
    //equals 5 if sumOfRandoms = 33 or 40 or 47 or 54, 4 options
    //equals 6 if sumOfRandoms = 34 or 41 or 48 or 55, 4 options
    //It means that the probabilities of getting numbers between 0 and 6 are almost equal.
    int result = sumOfRandoms % 7;
    //we should add 1 to move the interval [0,6] to the interval [1,7]
    return 1 + result;
}

Solution in PHP:

<?php
function random_5(){
    return rand(1,5);
}


function random_7(){
 $total = 0;

    for($i=0;$i<7;$i++){
        $total += random_5();
    }

    return ($total%7)+1; 
}

echo random_7();
?>

This expression is sufficient to get random integers between 1 - 7

int j = ( rand5()*2 + 4 ) % 7 + 1;

For the range [1, 5] to [1, 7], this is equivalent to rolling a 7-sided die with a 5-sided one.

However, this can't be done without "wasting" randomness (or running forever in the worst case), since all the prime factors of 7 (namely 7) don't divide 5. Thus, the best that can be done is to use rejection sampling to get arbitrarily close to no "waste" of randomness (such as by batching multiple rolls of the 5-sided die until 5^n is "close enough" to a power of 7). Solutions to this problem were already given in other answers.

More generally, an algorithm to roll a k-sided die with a p-sided die will inevitably "waste" randomness (and run forever in the worst case) unless "every prime number dividing k also divides p", according to Lemma 3 in "Simulating a dice with a dice" by B. Kloeckner. For example, take the much more practical case that p is a power of 2 and k is arbitrary. In this case, this "waste" and indefinite running time are inevitable unless k is also a power of 2.
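
To make the prime-factor argument concrete for this case: 5^n is never a multiple of 7, so some outcomes of n rolls always have to be rejected when producing a single value. Here is a small sketch (the helper name `rejected_fraction` is illustrative and not taken from the paper), assuming the simplest scheme that keeps the largest multiple of 7 not exceeding 5^n:

```python
from fractions import Fraction

def rejected_fraction(n):
    """Fraction of the 5**n outcomes of n rolls that cannot be split evenly over 7 values."""
    return Fraction(5**n % 7, 5**n)

for n in range(1, 7):
    # The middle column is 5**n mod 7; it cycles through nonzero values and never hits 0.
    print(n, 5**n % 7, rejected_fraction(n))
```

For n = 2 this gives 4/25 (the familiar "reject 22 to 25" rule), and for n = 3 it gives 6/125.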


I have played around and written a "testing environment" for this Rand(7) algorithm. For example, if you want to see what distribution your algorithm gives, or how many iterations it takes to generate all distinct random values (1-7 for Rand(7)), you can use it.

My core algorithm is this:

return (Rand5() + Rand5()) % 7 + 1;

It is no less uniformly distributed than Adam Rosenfield's one (which I included in my code snippet).

private static int Rand7WithRand5()
{
    //PUT YOUR FAVOURITE ALGORITHM HERE//

    //1. Stackoverflow winner
    int i;
    do
    {
        i = 5 * (Rand5() - 1) + Rand5(); // i is now uniformly random between 1 and 25
    } while (i > 21);
    // i is now uniformly random between 1 and 21
    return i % 7 + 1;

    //My 2 cents
    //return (Rand5() + Rand5()) % 7 + 1;
}

This "testing environment" can take any Rand(n) algorithm and test and evaluate it (distribution and speed). Just put your code into the "Rand7WithRand5" method and run the snippet.

Few observations:

  • Adam Rosenfield's algorithm is no better distributed than, for example, mine. Anyway, both algorithms' distributions are horrible.
  • Native Rand7 (random.Next(1, 8)) generated all members of the given interval in around 200+ iterations; the Rand7WithRand5 algorithms take on the order of 10k (around 30-70k).
  • The real challenge is not to write a method that generates Rand(7) from Rand(5), but to make it generate values that are more or less uniformly distributed.

This is the simplest answer I could create after reviewing others' answers:

def r5tor7():
    while True:
        cand = (5 * r5()) + r5()
        if cand < 27:
            # 21 equally likely candidates (6..26) fold evenly onto 1..7
            return cand % 7 + 1

After the filter, cand is in the range [6, 26], and the possible outcomes are evenly distributed if the possible outcomes from r5() are evenly distributed. You can test my answer with this code:

from collections import defaultdict

def r5_outcome(n):
    if not n:
        yield []
    else:
        for i in range(1, 6):
            for j in r5_outcome(n-1):
                yield [i] + j

def test_r7():
    d = defaultdict(int)
    for x in r5_outcome(2):
        s = sum([x[i] * 5**i for i in range(len(x))])
        if s < 27:
            d[s] += 1
    print(len(d), d)

r5_outcome(2) generates all possible combinations of r5() results. I use the same filter to test as in my solution code. You can see that all of the outcomes are equally probable because they all have the same count.


Another answer which appears to have not been covered here:

int rand7() {
  int r = 7 / 2;
  for (int i = 0; i < 28; i++)
    r = ((rand5() - 1) * 7 + r) / 5;
  return r + 1;
}

On every iteration r is a random value between 0 and 6 inclusive. This is appended (base 7) to a random value between 0 and 4 inclusive, and the result is divided by 5, giving a new random value in the range of 0 to 6 inclusive. r starts with a substantial bias (r = 3 is very biased!) but each iteration divides that bias by 5.

This method is not perfectly uniform; however, the bias is vanishingly small. Something in the order of 1/(2**64). What's important about this approach is that it has constant execution time (assuming rand5() also has constant execution time). No theoretical concerns that an unlucky call could iterate forever picking bad values.
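
The size of the bias can be computed exactly instead of estimated. The sketch below (the function name `exact_distribution` is introduced here for illustration) tracks the probability of each value of r through the 28 iterations using exact fractions, assuming rand5() is uniform on 1..5:

```python
from fractions import Fraction

def exact_distribution(iterations=28, start=3):
    """Exact probability of each r in 0..6 after the given number of loop iterations."""
    dist = [Fraction(0)] * 7
    dist[start] = Fraction(1)            # r starts at 7 / 2 == 3
    for _ in range(iterations):
        nxt = [Fraction(0)] * 7
        for r, p in enumerate(dist):
            for d in range(5):           # d stands for rand5() - 1
                nxt[(d * 7 + r) // 5] += p / 5
        dist = nxt
    return dist

dist = exact_distribution()
worst_bias = max(abs(p - Fraction(1, 7)) for p in dist)
print(float(worst_bias))
```

Under these assumptions the worst deviation from 1/7 stays below 5^-28 (about 2.7e-20), which is the same order of magnitude as the 1/(2**64) quoted above.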


Also, a sarcastic answer for good measure (deliberately or not, it has been covered):

1-5 is already within the range 1-7, therefore the following is a valid implementation:

int rand7() {
  return rand5();
}

Question did not ask for uniform distribution.


This solution was inspired by Rob McAfee.
However it doesn't need a loop and the result is a uniform distribution:

// Returns 1-5
var rnd5 = function(){
   return parseInt(Math.random() * 5, 10) + 1;
}
// Helper
var lastEdge = 0;
// Returns 1-7
var rnd7 = function () {
  var map = [
     [ 1, 2, 3, 4, 5 ],
     [ 6, 7, 1, 2, 3 ],
     [ 4, 5, 6, 7, 1 ],
     [ 2, 3, 4, 5, 6 ],
     [ 7, 0, 0, 0, 0 ]
  ];
  var result = map[rnd5() - 1][rnd5() - 1];
  if (result > 0) {
    return result;
  }
  lastEdge++;
  if (lastEdge > 7 ) {
    lastEdge = 1;
  }
  return lastEdge;
};

// Test for a uniform distribution
results = {}; for(i=0; i < 700000;i++) { var rand = rnd7(); results[rand] = results[rand] ? results[rand] + 1 : 1;} 
console.log(results)

Result: [1: 99560, 2: 99932, 3: 100355, 4: 100262, 5: 99603, 6: 100062, 7: 100226]


int ans = 0, r;
while (ans == 0)
{
     for (int i = 0; i < 3; i++)
     {
          while ((r = rand5()) == 3) {}   // re-roll on 3 so that (r < 3) is a fair bit
          ans += (r < 3) << i;            // set bit i
     }
}

function Rand7
   put 200 into x
   repeat while x > 118
      put ((random(5)-1) * 25) + ((random(5)-1) * 5) + (random(5)-1) into x
   end repeat
   return (x mod 7) + 1
end Rand7

Three calls to Rand5, which only repeats 6 times out of 125, on average.

Think of it as a 3D array, 5x5x5, filled with 1 to 7 over and over, and 6 blanks. Re-roll on the blanks. The rand5 calls create a three digit base-5 index into that array.

There would be fewer repeats with a 4D, or higher N-dimensional arrays, but this means more calls to the rand5 function become standard. You'll start to get diminishing efficiency returns at higher dimensions. Three seems to me to be a good compromise, but I haven't tested them against each other to be sure. And it would be rand5-implementation specific.
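
The 5x5x5 picture can be checked by walking every cell of the array. This is a sketch in Python rather than the original language, with `enumerate_three_rolls` as a made-up name; it assumes each of the 125 roll triples is equally likely:

```python
from collections import Counter
from itertools import product

def enumerate_three_rolls():
    """Walk all 125 (a, b, c) rolls; return (tally of accepted results, rejected count)."""
    tally, rejected = Counter(), 0
    for a, b, c in product(range(1, 6), repeat=3):
        x = (a - 1) * 25 + (b - 1) * 5 + (c - 1)  # three-digit base-5 index, 0..124
        if x > 118:
            rejected += 1                          # one of the 6 "blanks": re-roll
        else:
            tally[x % 7 + 1] += 1
    return tally, rejected

tally, rejected = enumerate_three_rolls()
print(dict(tally), "rejected:", rejected)
```

Each of the values 1 to 7 is hit by exactly 17 cells, and 6 cells are blanks, matching the 6 out of 125 figure.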


I thought of an interesting solution to this problem and wanted to share it.

function rand7() {

    var returnVal = 4;

    for (var n=0; n<3; n++) {
        var rand = rand5();

        if (rand==1||rand==2){
            returnVal+=1;
        }
        else if (rand==3||rand==4) {
            returnVal-=1;
        }
    }

    return returnVal;
}

I built a test function that loops through rand7() 10,000 times, sums up all of the return values, and divides the sum by 10,000. If rand7() is working correctly, our calculated average should be 4 - for example, (1+2+3+4+5+6+7) / 7 = 4. After doing multiple tests, the average is indeed right at 4 :)
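
An average of 4 is necessary for a uniform rand7() but not sufficient, so it helps to look at the full distribution as well. The sketch below (in Python, with the hypothetical name `walk_counts`) replays the function above over all 125 equally likely triples of rand5() results, assuming rand5() is uniform on 1..5:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def walk_counts():
    """Tally the return value of the 3-step walk over all 125 equally likely roll triples."""
    tally = Counter()
    for rolls in product(range(1, 6), repeat=3):
        value = 4
        for r in rolls:
            if r in (1, 2):
                value += 1
            elif r in (3, 4):
                value -= 1
        tally[value] += 1
    return tally

tally = walk_counts()
mean = Fraction(sum(v * n for v, n in tally.items()), 125)
print(sorted(tally.items()), "mean:", mean)
```

The mean is exactly 4, yet 3 and 5 are each almost four times as likely as 1 and 7, so the average alone does not show that the output is uniform.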


rand25() = 5 * (rand5() - 1) + rand5()   // uniform on 1..25

rand7() {
   while (true) {
       int r = rand25();
       if (r <= 21) return r % 7 + 1;
   }
}

Why this works: probability that the loop will run forever is 0.


rand5() => [1-5]

rand5()%2 => [0-1]

rand5() + rand5()%2 => [1-6]

rand7() {
     return rand5()+rand5()%2;
}

Consider the additional constraint of giving the most efficient answer, i.e. one that, given an input stream I of m uniformly distributed integers from 1-5, outputs a stream O of uniformly distributed integers from 1-7 of the longest possible length relative to m, say L(m).

The simplest way to analyse this is to treat the streams I and O as 5-ary and 7-ary numbers respectively. This is achieved by the main answer's idea of taking the stream a1, a2, a3,... -> a1+5*a2+5^2*a3+.. and similarly for stream O.

Then, for a section of the input stream of length m, choose n such that 5^m-7^n=c, where c>0 is as small as possible. There is a uniform map from the input stream of length m to integers from 1 to 5^m, and another uniform map from integers from 1 to 7^n to the output stream of length n, where we may have to discard a few cases from the input stream when the mapped integer exceeds 7^n.

So this gives a value for L(m) of around m (log5/log7) which is approximately .82m.

The difficulty with the above analysis is the equation 5^m-7^n=c which is not easy to solve exactly and the case where the uniform value from 1 to 5^m exceeds 7^n and we lose efficiency.

The question is how close to the best possible value of m (log5/log7) we can get. For example, when this number is close to an integer, can we find a way to achieve exactly this integral number of output values?

If 5^m-7^n=c then from the input stream we effectively generate a uniform random number from 0 to (5^m)-1 and don't use any values higher than 7^n. However these values can be rescued and used again. They effectively generate a uniform sequence of numbers from 1 to 5^m-7^n. So we can then try to use these and convert them into 7-ary numbers so that we can create more output values.

If we let T7(X) be the average length of the output sequence of random(1-7) integers derived from a uniform input of size X, and assume that 5^m = 7^n0 + 7^n1 + 7^n2 + ... + 7^nr + s, with s < 7,

then T7(5^m) = n0*7^n0/5^m + ((5^m - 7^n0)/5^m) * T7(5^m - 7^n0), since we have a sequence of length n0 with probability 7^n0/5^m and a residual of size 5^m - 7^n0 with probability (5^m - 7^n0)/5^m.

If we just keep substituting we obtain:

T7(5^m) = n0*7^n0/5^m + n1*7^n1/5^m + ... + nr*7^nr/5^m = (n0*7^n0 + n1*7^n1 + ... + nr*7^nr)/5^m

Hence

L(m) = T7(5^m) = (n0*7^n0 + n1*7^n1 + ... + nr*7^nr)/(7^n0 + 7^n1 + 7^n2 + ... + 7^nr + s)

Another way of putting this is:

If 5^m has 7-ary representation a0 + a1*7 + a2*7^2 + a3*7^3 + ... + ar*7^r,
then L(m) = (a1*7 + 2*a2*7^2 + 3*a3*7^3 + ... + r*ar*7^r)/(a0 + a1*7 + a2*7^2 + a3*7^3 + ... + ar*7^r)
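
The digit formula for L(m) is easy to evaluate numerically. Here is a sketch (the name `expected_outputs` is an illustrative choice) that reads off the base-7 digits of 5^m and compares the result with m (log5/log7):

```python
from fractions import Fraction
from math import log

def expected_outputs(m):
    """L(m) = (sum of i * a_i * 7**i) / 5**m, where a_i are the base-7 digits of 5**m."""
    total, remaining, i = 0, 5**m, 0
    while remaining:
        remaining, digit = divmod(remaining, 7)
        total += i * digit * 7**i
        i += 1
    return Fraction(total, 5**m)

for m in (2, 3, 11, 23):
    # Columns: m, L(m) from the digit formula, and the bound m * log5/log7.
    print(m, float(expected_outputs(m)), m * log(5) / log(7))
```

For example, 25 = 3*7 + 4 gives L(2) = 21/25, and 125 = 2*49 + 3*7 + 6 gives L(3) = 217/125.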

The best possible case is my original one above, where 5^m = 7^n + s with s < 7.

Then T7(5^m) = n*7^n/(7^n + s) = n + o(1) = m (log5/log7) + o(1), as before.

The worst case is when we can only find k and s such that 5^m = k*7 + s.

Then T7(5^m) = 1*(k*7)/(k*7 + s) = 1 + o(1)

Other cases are somewhere in between. It would be interesting to see how well we can do for very large m, i.e. how small we can make the error term:

T7(5^m) = m (Log5/Log7)+e(m)

It seems impossible to achieve e(m) = o(1) in general but hopefully we can prove e(m)=o(m).

The whole thing then rests on the distribution of the 7-ary digits of 5^m for various values of m.

I'm sure there is a lot of theory out there that covers this; I may have a look and report back at some point.


Here's mine, this attempts to recreate Math.random() from multiple rand5() function calls, reconstructing a unit interval (the output range of Math.random()) by re-constructing it with "weighted fractions"(?). Then using this random unit interval to produce a random integer between 1 and 7:

function rand5(){
  return Math.floor(Math.random()*5)+1;
}
function rand7(){
  var uiRandom=0;
  var div=1;
  for(var i=0; i<7; i++){
    div*=5;
    var term=(rand5()-1)/div;
    uiRandom+=term;
  }
  //return uiRandom;
  return Math.floor(uiRandom*7)+1; 
}

To paraphrase: we take random integers between 0-4 (just rand5()-1), multiply each result by 1/5, 1/25, 1/125, ..., and then sum them together. It's similar to how binary weighted fractions work; I suppose we'll call it a quinary (base-5) weighted fraction instead: producing a number from 0 -- 0.999999 as a series of (1/5)^n terms.

Modifying the function to take any input/output random integer range should be trivial. And the code above can be optimized when rewritten as a closure.


Alternatively, we can also do this:

function rand5(){
  return Math.floor(Math.random()*5)+1;
}
function rand7(){
  var buffer=[];
  var div=1;
  for (var i=0; i<7; i++){
    buffer.push((rand5()-1).toString(5));
    div*=5;
  }
  var n=parseInt(buffer.join(""),5);
  var uiRandom=n/div;
  //return uiRandom;
  return Math.floor(uiRandom*7)+1; 
}

Instead of fiddling with constructing a quinary (base-5) weighted fraction, we'll actually make a quinary number and turn it into a fraction (0--0.9999... as before), then compute our random 1--7 digit from there.

Results for above (code snippet #2: 3 runs of 100,000 calls each):

1: 14263; 2: 14414; 3: 14249; 4: 14109; 5: 14217; 6: 14361; 7: 14387

1: 14205; 2: 14394; 3: 14238; 4: 14187; 5: 14384; 6: 14224; 7: 14368

1: 14425; 2: 14236; 3: 14334; 4: 14232; 5: 14160; 6: 14320; 7: 14293


A constant time solution that produces approximately uniform distribution. The trick is 625 happens to be cleanly divisible by 7 and you can get uniform distributions as you build up to that range.

Edit: My bad, I miscalculated, but instead of pulling it I'll leave it in case someone finds it useful/entertaining. It does actually work after all... :)

int rand5()
{
    return (rand() % 5) + 1;
}

int rand25()
{ 
    return (5 * (rand5() - 1) + rand5());
}

int rand625()
{
    return (25 * (rand25() - 1) + rand25());
}

int rand7()
{
    return ((625 * (rand625() - 1) + rand625()) - 1) % 7 + 1;
}

Here is a solution that tries to minimize the number of calls to rand5() while keeping the implementation simple and efficient; in particular, it does not require arbitrary large integers unlike Adam Rosenfield’s second answer. It exploits the fact that 23/19 = 1.21052... is a good rational approximation to log(7)/log(5) = 1.20906..., thus we can generate 19 random elements of {1,...,7} out of 23 random elements of {1,...,5} by rejection sampling with only a small rejection probability. On average, the algorithm below takes about 1.266 calls to rand5() for each call to rand7(). If the distribution of rand5() is uniform, so is rand7().

uint_fast64_t pool;

int capacity = 0;

void new_batch (void)
{
  uint_fast64_t r;
  int i;

  do {
    r = 0;
    for (i = 0; i < 23; i++)
      r = 5 * r + (rand5() - 1);
  } while (r >= 11398895185373143ULL);  /* 7**19, a bit less than 5**23 */

  pool = r;
  capacity = 19;
}

int rand7 (void)
{
  int r;

  if (capacity == 0)
    new_batch();

  r = pool % 7;
  pool /= 7;
  capacity--;

  return r + 1;
}
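
The constants in this answer can be checked with a few lines of arithmetic. This is a sketch in Python (used here only because it has arbitrary-precision integers); the variable names are chosen for this illustration:

```python
accept = 7**19 / 5**23                 # probability that a 23-roll batch is kept
calls_per_rand7 = 23 / (19 * accept)   # expected rand5() calls per rand7() output

print(7**19, 5**23)
print(round(accept, 4), round(calls_per_rand7, 3))
```

Each batch attempt costs 23 calls and a kept batch yields 19 outputs, so the expected cost per output is 23 / (19 * accept), which agrees with the figure of about 1.266 quoted above.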

I feel stupid in front of all these complicated answers.

Why can't it be :

int random1_to_7()
{
  return (random1_to_5() * 7) / 5;  
}

?


There are elegant algorithms cited above, but here's one way to approach it, although it might be roundabout. I am assuming values generated from 0.

R2 = random number generator giving values less than 2 (sample space = {0, 1})
R8 = random number generator giving values less than 8 (sample space = {0, 1, 2, 3, 4, 5, 6, 7})

In order to generate R8 from R2, you run R2 thrice and use the combined result of all 3 runs as a binary number with 3 digits. Here is the range of values when R2 is run thrice:

0 0 0 --> 0
.
.
1 1 1 --> 7

Now to generate R7 from R8, we simply run R8 again if it returns 7:

int R7() {
  int x;
  do {
    x = R8();
  } while (x > 6);
  return x;
}

The roundabout solution is to generate R2 from R5 (just like we generated R7 from R8), then R8 from R2 and then R7 from R8.
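
The whole chain R5 -> R2 -> R8 -> R7 can be sketched as follows. This is an illustration in Python, not the original language of this answer; `make_r7` and the inner helper names are made up here, and the source r5 is assumed to return 0..4, matching the zero-based convention above:

```python
import random

def make_r7(r5):
    """Build R7 (values 0..6) from a source r5() that returns 0..4."""
    def r2():
        while True:
            x = r5()
            if x < 2:          # keep 0 and 1, re-roll on 2, 3 and 4
                return x
    def r8():
        return r2() * 4 + r2() * 2 + r2()   # three bits -> 0..7
    def r7():
        while True:
            x = r8()
            if x < 7:          # re-roll on 7
                return x
    return r7

r7 = make_r7(lambda: random.randrange(5))
print([r7() for _ in range(10)])
```

Passing the source in as a parameter is only a convenience for testing with a fixed sequence of rolls.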


Here's what I've found:

  1. Random5 produces a range from 1~5, randomly distributed
  2. If we run it 3 times and add them together we'll get a range of 3~15, randomly distributed
  3. Perform arithmetic on the 3~15 range
    1. (3~15) - 1 = (2~14)
    2. (2~14)/2 = (1~7)

Then we get a range of 1~7, which is the Random7 we're looking for.


By using a rolling total, you can both

  • maintain an equal distribution; and
  • not have to sacrifice any element in the random sequence.

Both these problems are an issue with the simplistic rand(5)+rand(5)...-type solutions. The following Python code shows how to implement it (most of this is proving the distribution).

import random
x = []
for i in range (0,7):
    x.append (0)
t = 0
tt = 0
for i in range (0,700000):
    ########################################
    #####            qq.py             #####
    r = int (random.random () * 5)
    t = (t + r) % 7
    ########################################
    #####       qq_notsogood.py        #####
    #r = 20
    #while r > 6:
        #r =     int (random.random () * 5)
        #r = r + int (random.random () * 5)
    #t = r
    ########################################
    x[t] = x[t] + 1
    tt = tt + 1
high = x[0]
low = x[0]
for i in range (0,7):
    print("%d: %7d %.5f" % (i, x[i], 100.0 * x[i] / tt))
    if x[i] < low:
        low = x[i]
    if x[i] > high:
        high = x[i]
diff = high - low
print("Variation = %d (%.5f%%)" % (diff, 100.0 * diff / tt))

And this output shows the results:

pax$ python qq.py
0:   99908 14.27257
1:  100029 14.28986
2:  100327 14.33243
3:  100395 14.34214
4:   99104 14.15771
5:   99829 14.26129
6:  100408 14.34400
Variation = 1304 (0.18629%)

pax$ python qq.py
0:   99547 14.22100
1:  100229 14.31843
2:  100078 14.29686
3:   99451 14.20729
4:  100284 14.32629
5:  100038 14.29114
6:  100373 14.33900
Variation = 922 (0.13171%)

pax$ python qq.py
0:  100481 14.35443
1:   99188 14.16971
2:  100284 14.32629
3:  100222 14.31743
4:   99960 14.28000
5:   99426 14.20371
6:  100439 14.34843
Variation = 1293 (0.18471%)

A simplistic rand(5)+rand(5), ignoring those cases where this returns more than 6, has a typical variation of 18%, 100 times that of the method shown above:

pax$ python qq_notsogood.py
0:   31756 4.53657
1:   63304 9.04343
2:   95507 13.64386
3:  127825 18.26071
4:  158851 22.69300
5:  127567 18.22386
6:   95190 13.59857
Variation = 127095 (18.15643%)

pax$ python qq_notsogood.py
0:   31792 4.54171
1:   63637 9.09100
2:   95641 13.66300
3:  127627 18.23243
4:  158751 22.67871
5:  126782 18.11171
6:   95770 13.68143
Variation = 126959 (18.13700%)

pax$ python qq_notsogood.py
0:   31955 4.56500
1:   63485 9.06929
2:   94849 13.54986
3:  127737 18.24814
4:  159687 22.81243
5:  127391 18.19871
6:   94896 13.55657
Variation = 127732 (18.24743%)

And, on the advice of Nixuz, I've cleaned the script up so you can just extract and use the rand7... stuff:

import random

# rand5() returns 0 through 4 inclusive.

def rand5():
    return int (random.random () * 5)

# rand7() generator returns 0 through 6 inclusive (using rand5()).

def rand7():
    rand7ret = 0
    while True:
        rand7ret = (rand7ret + rand5()) % 7
        yield rand7ret

# Number of test runs.

count = 700000

# Work out distribution.

distrib = [0,0,0,0,0,0,0]
rgen =rand7()
for i in range (0,count):
    r = next(rgen)
    distrib[r] = distrib[r] + 1

# Print distributions and calculate variation.

high = distrib[0]
low = distrib[0]
for i in range (0,7):
    print("%d: %7d %.5f" % (i, distrib[i], 100.0 * distrib[i] / count))
    if distrib[i] < low:
        low = distrib[i]
    if distrib[i] > high:
        high = distrib[i]
diff = high - low
print("Variation = %d (%.5f%%)" % (diff, 100.0 * diff / count))
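
One property worth checking before relying on the rolling-total generator: each output is the previous output plus 0..4 modulo 7, so consecutive outputs are not independent even though the long-run counts are even. Here is a short sketch (`reachable_next` is a name introduced for this example) listing which values can follow which:

```python
def reachable_next(t):
    """Values the rolling total can take right after it was t (rand5 returns 0..4)."""
    return {(t + r) % 7 for r in range(5)}

for t in range(7):
    missing = sorted(set(range(7)) - reachable_next(t))
    print(t, "->", sorted(reachable_next(t)), "never:", missing)
```

Whether that matters depends on the use: a histogram of single values looks even, while a test on pairs of consecutive values would show two of the seven successors missing for every value.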

Algorithm:

7 can be represented in a sequence of 3 bits

Use rand(5) to randomly fill each bit with 0 or 1.
For example, call rand(5) and:

if the result is 1 or 2, fill the bit with 0
if the result is 4 or 5, fill the bit with 1
if the result is 3, then ignore it and do it again (rejection)

This way we can fill 3 bits randomly with 0/1 and get a number from 0-7; rejecting 0 leaves a number from 1-7.

EDIT: This seems like the simplest and most efficient answer, so here's some code for it:

public static int random_7() {
    int returnValue = 0;
    while (returnValue == 0) {
        for (int i = 1; i <= 3; i++) {
            returnValue = (returnValue << 1) + random_5_output_2();
        }
    }
    return returnValue;
}

private static int random_5_output_2() {
    while (true) {
        int flip = random_5();

        if (flip < 3) {
            return 0;
        }
        else if (flip > 3) {
            return 1;
        }
    }
}

// returns random number between 0-5 with equal probability
function rand5() {
  return Math.floor(Math.random() * 6);
}

// returns random number between 0-7 with equal probability
function rand7() {
  if(rand5() % 2 == 0 && rand5() % 2 == 0) {
    return 6 + rand5() % 2;
  } else {
    return rand5();
  }
}

console.log(rand7());


There are a lot of solutions here that do not produce a uniform distribution and many comments pointing that out, but the question does not state that as a requirement. The simplest solution is:

int rand_7() { return rand_5(); }

A random integer in the range 1 - 5 is clearly in the range 1 - 7. Well, technically, the simplest solution is to return a constant, but that's too trivial.

However, I think the existence of the rand_5 function is a red herring. Suppose the question was asked as "produce a uniformly distributed pseudo-random number generator with integer output in the range 1 - 7". That's a simple problem (not technically simple, but already solved, so you can look it up.)

On the other hand, if the question is interpreted to mean that you actually have a truly random number generator for integers in the range 1 - 5 (not pseudo random), then the solution is:

1) examine the rand_5 function
2) understand how it works
3) profit

just scale your output from your first function

0) you have a number in range 1-5
1) subtract 1 to make it in range 0-4
2) multiply by (7-1)/(5-1) to make it in range 0-6
3) add 1 to increment the range: Now your result is in between 1-7

Here's a solution that fits entirely within integers and is within about 4% of optimal (i.e. uses 1.26 random numbers in {0..4} for every one in {0..6}). The code's in Scala, but the math should be reasonably clear in any language: you take advantage of the fact that 7^9 + 7^8 is very close to 5^11. So you pick an 11 digit number in base 5, and then interpret it as a 9 digit number in base 7 if it's in range (giving 9 base 7 numbers), or as an 8 digit number if it's over the 9 digit number, etc.:

abstract class RNG {
  def apply(): Int
}

class Random5 extends RNG {
  val rng = new scala.util.Random
  var count = 0
  def apply() = { count += 1 ; rng.nextInt(5) }
}

class FiveSevener(five: RNG) {
  val sevens = new Array[Int](9)
  var nsevens = 0
  val to9 = 40353607;
  val to8 = 5764801;
  val to7 = 823543;
  def loadSevens(value: Int, count: Int) {
    nsevens = 0;
    var remaining = value;
    while (nsevens < count) {
      sevens(nsevens) = remaining % 7
      remaining /= 7
      nsevens += 1
    }
  }
  def loadSevens {
    var fivepow11 = 0;
    var i=0
    while (i<11) { i+=1 ; fivepow11 = five() + fivepow11*5 }
    if (fivepow11 < to9) { loadSevens(fivepow11 , 9) ; return }
    fivepow11 -= to9
    if (fivepow11 < to8) { loadSevens(fivepow11 , 8) ; return }
    fivepow11 -= to8
    if (fivepow11 < 3*to7) loadSevens(fivepow11 % to7 , 7)
    else loadSevens
  }
  def apply() = {
    if (nsevens==0) loadSevens
    nsevens -= 1
    sevens(nsevens)
  }
}
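
The "1.26 random numbers" figure can be cross-checked with exact arithmetic. The sketch below is only an illustration in Python, not part of the Scala code, and the helper name calls_per_output is made up for this example. It mirrors the three acceptance bands in loadSevens (9, 8 and 7 base-7 digits) and assumes every attempt costs 11 calls whether or not it is rejected.

```python
from fractions import Fraction

def calls_per_output():
    """Expected rand5 calls per base-7 digit for the 11-digit scheme (illustrative)."""
    total = 5 ** 11  # number of equally likely 11-digit base-5 values
    # (how many values fall in the band, base-7 digits produced by that band)
    bands = [(7 ** 9, 9), (7 ** 8, 8), (3 * 7 ** 7, 7)]
    assert sum(n for n, _ in bands) <= total
    digits_per_attempt = Fraction(sum(n * d for n, d in bands), total)
    # Long-run ratio: calls spent per attempt over digits gained per attempt.
    return Fraction(11) / digits_per_attempt

ratio = calls_per_output()
print(float(ratio))  # roughly 1.259
```

This is in line with the five.count value in the REPL session below (125902876 calls for 10^8 outputs).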

If you paste a test into the interpreter (REPL actually), you get:

scala> val five = new Random5
five: Random5 = Random5@e9c592

scala> val seven = new FiveSevener(five)
seven: FiveSevener = FiveSevener@143c423

scala> val counts = new Array[Int](7)
counts: Array[Int] = Array(0, 0, 0, 0, 0, 0, 0)

scala> var i=0 ; while (i < 100000000) { counts( seven() ) += 1 ; i += 1 }
i: Int = 100000000

scala> counts
res0: Array[Int] = Array(14280662, 14293012, 14281286, 14284836, 14287188,
14289332, 14283684)

scala> five.count
res1: Int = 125902876

The distribution is nice and flat (within about 10k of 1/7 of 10^8 in each bin, as expected from an approximately-Gaussian distribution).


This is equivalent to Adam Rosenfield's solution, but may be a bit more clear for some readers. It assumes rand5() is a function that returns a statistically random integer in the range 1 through 5 inclusive.

int rand7()
{
    int vals[5][5] = {
        { 1, 2, 3, 4, 5 },
        { 6, 7, 1, 2, 3 },
        { 4, 5, 6, 7, 1 },
        { 2, 3, 4, 5, 6 },
        { 7, 0, 0, 0, 0 }
    };

    int result = 0;
    while (result == 0)
    {
        int i = rand5();
        int j = rand5();
        result = vals[i-1][j-1];
    }
    return result;
}

How does it work? Think of it like this: imagine printing out this double-dimension array on paper, tacking it up to a dart board and randomly throwing darts at it. If you hit a non-zero value, it's a statistically random value between 1 and 7, since there are an equal number of non-zero values to choose from. If you hit a zero, just keep throwing the dart until you hit a non-zero. That's what this code is doing: the i and j indexes randomly select a location on the dart board, and if we don't get a good result, we keep throwing darts.

Like Adam said, this can run forever in the worst case, but statistically the worst case never happens. :)
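
To make the dart board argument concrete, here is a small sketch (mine, not part of the original answer) that counts how often each value appears in the table and derives the expected number of rand5() calls, assuming each throw costs two calls and is accepted with probability 21/25.

```python
from collections import Counter
from fractions import Fraction

# Same table as in the C code above.
VALS = [
    [1, 2, 3, 4, 5],
    [6, 7, 1, 2, 3],
    [4, 5, 6, 7, 1],
    [2, 3, 4, 5, 6],
    [7, 0, 0, 0, 0],
]

cell_counts = Counter(v for row in VALS for v in row)
hits = sum(n for v, n in cell_counts.items() if v != 0)  # non-zero cells
expected_calls = Fraction(2 * 25, hits)                  # 2 calls per throw

print(sorted(cell_counts.items()))
print(float(expected_calls))
```

Each of 1 through 7 occupies exactly three cells and four cells are zero, so the accepted values are equally likely and a result takes 50/21 (about 2.38) calls on average.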


For values 0-7 you have the following:

0 000
1 001
2 010
3 011
4 100
5 101
6 110
7 111

Reading the bits from left to right, Rand5() has p(1) = {2/5, 2/5, 3/5}. So if we complement those probability distributions (~Rand5()) we should be able to use that to produce our number. I'll try to report back later with a solution. Does anyone have any thoughts?

R


We are using the convention rand(n) -> [0, n - 1] here

Many of the answers I read provide either uniformity or a halting guarantee, but not both (Adam Rosenfield's second answer might).

It is, however, possible to do so. We basically have this distribution:

rand5_proba.png

This leaves us a hole in the distribution over [0-6]: 5 and 6 have no probability of occurrence. Imagine now that we try to fill the hole by shifting the probability distribution and summing.

Indeed, we can sum the initial distribution with itself shifted by one, and repeat by summing the obtained distribution with the initial one shifted by two, then three and so on, up to but not including 7 (at which point we have covered the whole range). This is shown on the following figure. The order of the colors, corresponding to the steps, is blue -> green -> cyan -> white -> magenta -> yellow -> red.

fig_moving_average_proba.png

Because each slot is covered by 5 of the 7 shifted distributions (shift varies from 0 to 6), and because we assume the random numbers are independent from one rand5() call to another, we obtain

p(x) = 5 / 35 = 1 / 7       for all x in [0, 6]

This means that, given 7 independent random numbers from rand5(), we can compute a random number with uniform probability in the [0-6] range. In fact, the rand5() probability distribution does not even need to be uniform, as long as the samples are independent (so the distribution stays the same from trial to trial). Also, this is valid for other numbers than 5 and 7.

This gives us the following python function:

def rand_range_transform(rands):
    """
    returns a uniform random number in [0, len(rands) - 1]
    if all r in rands are independent random numbers from the same uniform distribution
    """
    return sum((x + i) for i, x in enumerate(rands)) % len(rands) # a single modulo outside the sum is enough in modulo arithmetic

This can be used like this:

rand5 = lambda : random.randrange(5)

def rand7():
    return rand_range_transform([rand5() for _ in range(7)])

If we call rand7() 70000 times, we can get:

max: 6 min: 0 mean: 2.99711428571 std: 2.00194697049
0:  10019
1:  10016
2:  10071
3:  10044
4:  9775
5:  10042
6:  10033

This is good, although far from perfect. The fact is, one of our assumptions is most likely false in this implementation: we use a PRNG, and as such, the result of the next call depends on the last result.

That said, using a truly random source of numbers, the output should also be truly random. And this algorithm terminates in every case.

But this comes with a cost: we need 7 calls to rand5() for a single rand7() call.
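
For anyone who wants to check the distribution exactly rather than by sampling, here is a small sketch that enumerates all 5^7 = 78125 equally likely input tuples. It repeats rand_range_transform so that it runs on its own, and assumes rand5 returns 0..4 as above. Note that 78125 is not a multiple of 7, so the seven counts cannot all be identical; the enumeration shows how close to uniform the output really is.

```python
from collections import Counter
from itertools import product

def rand_range_transform(rands):
    # Same formula as above: add the index to each sample, then reduce mod 7.
    return sum((x + i) for i, x in enumerate(rands)) % len(rands)

# Every tuple of seven rand5 results (each in 0..4) is equally likely.
counts = Counter(rand_range_transform(t) for t in product(range(5), repeat=7))

for value in range(7):
    print(value, counts[value])
print("spread:", max(counts.values()) - min(counts.values()))
```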


As long as there aren't seven possibilities left to choose from, draw another random number, which multiplies the number of possibilities by five. In Perl:

$num = 0;
$possibilities = 1;

sub rand7
{
  while( $possibilities < 7 )
  {
    $num = $num * 5 + int(rand(5));
    $possibilities *= 5;
  }
  my $result = $num % 7;
  $num = int( $num / 7 );
  $possibilities /= 7;
  return $result;
}

Simple and efficient:

int rand7 ( void )
{
    return 4; // this number has been calculated using
              // rand5() and is in the range 1..7
}

(Inspired by What's your favorite "programmer" cartoon?).


extern int r5();

int r7() {
    return ((r5() & 0x01) << 2 ) | ((r5() & 0x01) << 1 ) | (r5() & 0x01);
}

I'd like to add another answer, in addition to my first answer. This answer attempts to minimize the number of calls to rand5() per call to rand7(), to maximize the usage of randomness. That is, if you consider randomness to be a precious resource, we want to use as much of it as possible, without throwing away any random bits. This answer also has some similarities with the logic presented in Ivan's answer.

The entropy of a random variable is a well-defined quantity. For a random variable which takes on N states with equal probabilities (a uniform distribution), the entropy is log2 N. Thus, rand5() has approximately 2.32193 bits of entropy, and rand7() has about 2.80735 bits of entropy. If we hope to maximize our use of randomness, we need to use all 2.32193 bits of entropy from each call to rand5(), and apply them to generating 2.80735 bits of entropy needed for each call to rand7(). The fundamental limit, then, is that we can do no better than log(7)/log(5) = 1.20906 calls to rand5() per call to rand7().
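
As a quick check of those figures, here is a short sketch using only the standard library (the variable names are just for illustration):

```python
import math

bits_rand5 = math.log2(5)            # entropy of one rand5() call, in bits
bits_rand7 = math.log2(7)            # entropy needed per rand7() output
min_calls = bits_rand7 / bits_rand5  # lower bound on rand5() calls per rand7()

print(round(bits_rand5, 5), round(bits_rand7, 5), round(min_calls, 5))
# 2.32193 2.80735 1.20906
```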

Side notes: all logarithms in this answer will be base 2 unless specified otherwise. rand5() will be assumed to return numbers in the range [0, 4], and rand7() will be assumed to return numbers in the range [0, 6]. Adjusting the ranges to [1, 5] and [1, 7] respectively is trivial.

So how do we do it? We generate an infinitely precise random real number between 0 and 1 (pretend for the moment that we could actually compute and store such an infinitely precise number -- we'll fix this later). We can generate such a number by generating its digits in base 5: we pick the random number 0.a_1a_2a_3..., where each digit a_i is chosen by a call to rand5(). For example, if our RNG chose a_i = 1 for all i, then ignoring the fact that that isn't very random, that would correspond to the real number 1/5 + 1/5^2 + 1/5^3 + ... = 1/4 (sum of a geometric series).

Ok, so we've picked a random real number between 0 and 1. I now claim that such a random number is uniformly distributed. Intuitively, this is easy to understand, since each digit was picked uniformly, and the number is infinitely precise. However, a formal proof of this is somewhat more involved, since now we're dealing with a continuous distribution instead of a discrete distribution, so we need to prove that the probability that our number lies in an interval [a, b] equals the length of that interval, b - a. The proof is left as an exercise for the reader =).

Now that we have a random real number selected uniformly from the range [0, 1], we need to convert it to a series of uniformly random numbers in the range [0, 6] to generate the output of rand7(). How do we do this? Just the reverse of what we just did -- we convert it to an infinitely precise decimal in base 7, and then each base 7 digit will correspond to one output of rand7().

Taking the example from earlier, if our rand5() produces an infinite stream of 1's, then our random real number will be 1/4. Converting 1/4 to base 7, we get the infinite decimal 0.15151515..., so we will produce as output 1, 5, 1, 5, 1, 5, etc.

Ok, so we have the main idea, but we have two problems left: we can't actually compute or store an infinitely precise real number, so how do we deal with only a finite portion of it? Secondly, how do we actually convert it to base 7?

One way we can convert a number between 0 and 1 to base 7 is as follows:

  1. Multiply by 7
  2. The integral part of the result is the next base 7 digit
  3. Subtract off the integral part, leaving only the fractional part
  4. Goto step 1
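
The four steps above can be sketched with exact rational arithmetic. This is only an illustration of the conversion (base7_digits is a name invented here), using Python's Fraction so that no rounding creeps in:

```python
from fractions import Fraction

def base7_digits(x, count):
    """Return the first `count` base-7 digits of a fraction 0 <= x < 1."""
    digits = []
    for _ in range(count):
        x *= 7               # step 1: multiply by 7
        digit = int(x)       # step 2: the integral part is the next digit
        digits.append(digit)
        x -= digit           # step 3: keep only the fractional part
    return digits            # step 4 is the loop itself

quarter = Fraction(1, 4)
print(base7_digits(quarter, 6))  # [1, 5, 1, 5, 1, 5]
```

This reproduces the 0.151515... expansion of 1/4 from the earlier example.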

To deal with the problem of infinite precision, we compute a partial result, and we also store an upper bound on what the result could be. That is, suppose we've called rand5() twice and it returned 1 both times. The number we've generated so far is 0.11 (base 5). Whatever the rest of the infinite series of calls to rand5() produce, the random real number we're generating will never be larger than 0.12: it is always true that 0.11 ≤ 0.11xyz... < 0.12.

So, keeping track of the current number so far, and the maximum value it could ever take, we convert both numbers to base 7. If they agree on the first k digits, then we can safely output the next k digits -- regardless of what the infinite stream of base 5 digits are, they will never affect the next k digits of the base 7 representation!

And that's the algorithm -- to generate the next output of rand7(), we generate only as many digits of rand5() as we need to ensure that we know with certainty the value of the next digit in the conversion of the random real number to base 7. Here is a Python implementation, with a test harness:

import math
import random

rand5_calls = 0
def rand5():
    global rand5_calls
    rand5_calls += 1
    return random.randint(0, 4)

def rand7_gen():
    state = 0
    pow5 = 1
    pow7 = 7
    while True:
        if state / pow5 == (state + pow7) / pow5:
            result = state / pow5
            state = (state - result * pow5) * 7
            pow7 *= 7
            yield result
        else:
            state = 5 * state + pow7 * rand5()
            pow5 *= 5

if __name__ == '__main__':
    r7 = rand7_gen()
    N = 10000
    x = list(next(r7) for i in range(N))
    distr = [x.count(i) for i in range(7)]
    expmean = N / 7.0
    expstddev = math.sqrt(N * (1.0/7.0) * (6.0/7.0))

    print '%d TRIALS' % N
    print 'Expected mean: %.1f' % expmean
    print 'Expected standard deviation: %.1f' % expstddev
    print
    print 'DISTRIBUTION:'
    for i in range(7):
        print '%d: %d   (%+.3f stddevs)' % (i, distr[i], (distr[i] - expmean) / expstddev)
    print
    print 'Calls to rand5: %d (average of %f per call to rand7)' % (rand5_calls, float(rand5_calls) / N)

Note that rand7_gen() returns a generator, since it has internal state involving the conversion of the number to base 7. The test harness calls next(r7) 10000 times to produce 10000 random numbers, and then it measures their distribution. Only integer math is used, so the results are exactly correct.

Also note that the numbers here get very big, very fast. Powers of 5 and 7 grow quickly. Hence, performance will start to degrade noticeably after generating lots of random numbers, due to bignum arithmetic. But remember here, my goal was to maximize the usage of random bits, not to maximize performance (although that is a secondary goal).

In one run of this, I made 12091 calls to rand5() for 10000 calls to rand7(), achieving the minimum of log(7)/log(5) calls on average to 4 significant figures, and the resulting output was uniform.

In order to port this code to a language that doesn't have arbitrarily large integers built-in, you'll have to cap the values of pow5 and pow7 to the maximum value of your native integral type -- if they get too big, then reset everything and start over. This will increase the average number of calls to rand5() per call to rand7() very slightly, but hopefully it shouldn't increase too much even for 32- or 64-bit integers.


The function you need is rand1_7(), I wrote rand1_5() so that you can test it and plot it.

import numpy
def rand1_5():
    return numpy.random.randint(5)+1

def rand1_7():
    q = 0
    for i in xrange(7):  q+= rand1_5()
    return q%7 + 1

This is the answer I came up with but these complicated answers are making me think this is completely off/ :))

import random

def rand5():
    return float(random.randint(0,5))

def rand7():
    random_val = rand5()
    return float(random.randint((random_val-random_val),7))

print rand7()

The first thing that came to my mind is this, but I have no idea whether it's uniformly distributed. Implemented in Python:

import random

def rand5():
    return random.randint(1,5)

def rand7():
    return ( ( (rand5() -1) * rand5() ) %7 )+1


Here is a working Python implementation of Adam's answer.

import random

def rand5():
    return random.randint(1, 5)

def rand7():
    while True:
        r = 5 * (rand5() - 1) + rand5()
        #r is now uniformly random between 1 and 25
        if (r <= 21):
            break
    #result is now uniformly random between 1 and 7
    return r % 7 + 1

I like to throw algorithms I'm looking at into Python so I can play around with them, thought I'd post it here in the hopes that it is useful to someone out there, not that it took long to throw together.


The premise behind Adam Rosenfield's correct answer is:

  • x = 5^n (in his case: n=2)
  • manipulate n rand5 calls to get a number y within range [1, x]
  • z = ((int)(x / 7)) * 7
  • if y > z, try again. else return y % 7 + 1

When n equals 2, you have 4 throw-away possibilities: y = {22, 23, 24, 25}. If you use n equals 6, you only have 1 throw-away: y = {15625}.

5^6 = 15625
7 * 2232 = 15624

You call rand5 more times. However, you have a much lower chance of getting a throw-away value (or an infinite loop). If there is a way to get no possible throw-away value for y, I haven't found it yet.


I don't like ranges starting from 1, so I'll start from 0 :-)

unsigned rand5()
{
    return rand() % 5;
}

unsigned rand7()
{
    int r;

    do
    {
        r =         rand5();
        r = r * 5 + rand5();
        r = r * 5 + rand5();
        r = r * 5 + rand5();
        r = r * 5 + rand5();
        r = r * 5 + rand5();
    } while (r > 15623);

    return r / 2232;
}
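
A quick way to convince yourself that the final division is fair is to enumerate every value that survives the do/while loop. This is a sketch in Python rather than C, purely as a check:

```python
from collections import Counter

accepted = range(15624)  # r in 0..15623 passes the loop condition
buckets = Counter(r // 2232 for r in accepted)

print(sorted(buckets.items()))
```

Each of the seven results 0..6 receives exactly 2232 of the 15624 accepted values.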

There you go, uniform distribution and zero rand5 calls.

def rand7():
    global seed
    seed += 1
    if seed >= 7:
        seed = 0
    return seed

Need to set seed beforehand.


int rand7()
{
    int zero_one_or_two = ( rand5() + rand5() - 1 ) % 3 ;
    return rand5() + zero_one_or_two ;
}

Given a function which produces a random integer in the range 1 to 5 rand5(), write a function which produces a random integer in the range 1 to 7 rand7()

In my proposed solution, I call rand5 only once.

Real Solution

float rand7()
{
    return (rand5() * 7.0) / 5.0 ;
}

The distribution here is scaled, so it depends directly on the distribution of rand5

Integer Solution

int rand7()
{
    static int prev = 1;

    int cur = rand5();

    int r = cur * prev; // 1-25

    float f = r / 4.0; // 0.25-6.25

    f = f - 0.25; // 0-6

    f = f + 1.0; // 1-7

    prev = cur;

    return (int)f;
}

The distribution here depends on the series rand7(i) ~ rand5(i) * rand5(i-1)

with rand7(0) ~ rand5(0) * 1


The main concept behind this problem is uniform distribution. Here is a simple, recursive solution to the problem.

presume we already have rand5() in our scope:

def rand7():
    # twoway = 0 or 1 in the same probability
    twoway = None
    while not twoway in (1, 2):
        twoway = rand5()
    twoway -= 1

    ans = rand5() + twoway * 5

    return ans if ans in range(1,8) else rand7()

Explanation

We can divide this program into 2 parts:

  1. Loop rand5() until we get 1 or 2, which means twoway ends up as 1 or 2 with probability 1/2 each.
  2. Compose ans as rand5() + twoway * 5, which is exactly the result of rand10(); if it does not match our need (1~7), we run rand7 again.

P.S. We cannot simply run a while loop in the second part, because each draw of twoway needs to be independent.

But there is a trade-off: because of the while loop in the first part and the recursion in the return statement, this function has no guaranteed execution time, so it is not actually efficient.

Result

I've made a simple test for observing the distribution to my answer.

result = [ rand7() for x in xrange(777777) ]

ans = {
    1: 0,
    2: 0,
    3: 0,
    4: 0,
    5: 0,
    6: 0,
    7: 0,
}

for i in result:
    ans[i] += 1

print ans

It gave

{1: 111170, 2: 110693, 3: 110651, 4: 111260, 5: 111197, 6: 111502, 7: 111304}

Therefore we can see that this answer gives a uniform distribution.

Simplified Answer

If you don't care about the execution time of this function, here's a simplified answer based on the above answer I gave:

def rand7():
    ans = rand5() + (rand5()-1) * 5
    return ans if ans < 8 else rand7()

This increases the probability of drawing a value greater than 7 (which has to be rejected), but it is probably the shortest answer to this problem.


Came here from a link about expanding a float range. This one is more fun. Rather than describe how I got to the conclusion, I'll just note that for a given random integer generating function f with "base" b (4 in this case; I'll explain why), it can be expanded like below:

(b^0 * f() + b^1 * f() + b^2 * f() .... b^p * f()) / (b^(p+1) - 1) * (b-1)

This will convert a random generator to a FLOAT generator. I will define 2 parameters here, b and p. Although the "base" here is 4, b can in fact be anything; it can also be an irrational number etc. p, which I call precision, is a measure of how fine-grained you want your float generator to be. Think of this as the number of calls made to rand5 for each call of rand7.

But I realized if you set b to base+1 (which is 4+1 = 5 in this case), it's a sweet spot and you'll get a uniform distribution. First get rid of this 1-5 generator, it is in truth rand4() + 1:

function rand4(){
    return Math.random() * 5 | 0;
}

To get there, you can substitute rand4 with rand5()-1

Next is to convert rand4 from an integer generator to a float generator

function toFloat(f,b,p){
    b = b || 2;
    p = p || 3;
    return (Array.apply(null,Array(p))
    .map(function(d,i){return f()})
    .map(function(d,i){return Math.pow(b,i)*d})
    .reduce(function(ac,d,i){return ac += d;}))
    /
    (
        (Math.pow(b,p) - 1)
        /(b-1)
    )
}

This will apply the first function I wrote to a given rand function. Try it:

toFloat(rand4) //1.4285714285714286 base = 2, precision = 3
toFloat(rand4,3,4) //0.75 base = 3, precision = 4
toFloat(rand4,4,5) //3.7507331378299122 base = 4, precision = 5
toFloat(rand4,5,6) //0.2012288786482335 base = 5, precision =6
...

Now you can convert this float range (0-4 INCLUSIVE) to any other float range and then downgrade it to be an integer. Here our base is 4 because we are dealing with rand4, therefore a value b=5 will give you a uniform distribution. As the b grows past 4, you will start introducing periodic gaps in the distribution. I tested for b values ranging from 2 to 8 with 3000 points each and compared to native Math.random of javascript, looks to me even better than the native one itself:

http://jsfiddle.net/ibowankenobi/r57v432t/

For the above link, click on the "bin" button on the top side of the distributions to decrease the binning size. The last graph is native Math.random, the 4th one where d=5 is uniform.

After you get your float range either multiply with 7 and throw the decimal part or multiply with 7, subtract 0.5 and round:

((toFloat(rand4,5,6)/4 * 7) | 0) + 1   ---> occasionally you'll get 8 with 1/4^6 probability.
Math.round((toFloat(rand4,5,6)/4 * 7) - 0.5) + 1 --> between 1 and 7

The simple solution has been well covered: take two random5 samples for one random7 result and do it over if the result is outside the range that generates a uniform distribution. If your goal is to reduce the number of calls to random5 this is extremely wasteful - the average number of calls to random5 for each random7 output is 2.38 rather than 2 due to the number of thrown away samples.

You can do better by using more random5 inputs to generate more than one random7 output at a time. For results calculated with a 31-bit integer, the optimum comes when using 12 calls to random5 to generate 9 random7 outputs, taking an average of 1.34 calls per output. It's efficient because only 2018983 out of 244140625 results need to be scrapped, or less than 1%.

Demo in Python:

import random

def random5():
    return random.randint(1, 5)

def random7gen(n):
    count = 0
    while n > 0:
        samples = 6 * 7**9
        while samples >= 6 * 7**9:
            samples = 0
            for i in range(12):
                samples = samples * 5 + random5() - 1
                count += 1
        samples //= 6
        for outputs in range(9):
            yield samples % 7 + 1, count
            samples //= 7
            count = 0
            n -= 1
            if n == 0: break

>>> from collections import Counter
>>> Counter(x for x,i in random7gen(10000000))
Counter({2: 1430293, 4: 1429298, 1: 1428832, 7: 1428571, 3: 1428204, 5: 1428134, 6: 1426668})
>>> sum(i for x,i in random7gen(10000000)) / 10000000.0
1.344606
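
The figures quoted above (12 calls, 9 outputs, 2018983 scrapped values) can be reproduced with a short search. This is a sketch under my own assumptions: best_batch is a made-up name, each batch of k calls is turned into as many base-7 digits m as will fit, and a rejected batch is redrawn from scratch.

```python
def best_batch(max_bits=31):
    """Search batch sizes k (rand5 calls) whose 5**k fits in max_bits bits."""
    best = None
    k = 2  # k = 1 cannot produce even one base-7 digit
    while 5 ** k < 2 ** max_bits:
        m = 0
        while 7 ** (m + 1) <= 5 ** k:
            m += 1                              # most base-7 digits that fit
        accepted = (5 ** k // 7 ** m) * 7 ** m  # largest usable multiple of 7**m
        cost = k * 5 ** k / accepted / m        # expected calls per output
        if best is None or cost < best[0]:
            best = (cost, k, m, 5 ** k - accepted)
        k += 1
    return best

cost, k, m, rejected = best_batch()
print(k, m, rejected, round(cost, 4))
```

The search settles on k = 12 and m = 9 with a cost of roughly 1.34 calls per output, which agrees with the measured 1.344606.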

This solution doesn't waste any entropy and gives the first available truly random number in range. With each iteration the probability of not getting an answer is provably decreased. The probability of getting an answer in N iterations is the probability that a random number between 0 and max (5^N) will be smaller than the largest multiple of seven in that range (max-max%7). Must iterate at least twice. But that's necessarily true for all solutions.

int random7() {
  int range = 1;
  int remainder = 0;

  while (1) {
    remainder = remainder * 5 + random5() - 1;
    range = range * 5;

    int limit = range - (range % 7);
    if (remainder < limit) return (remainder % 7) + 1;

    remainder = remainder % 7;
    range = range % 7;
  }
}

Numerically equivalent to:

r5=5;
num=random5()-1;
while (1) {
   num=num*5+random5()-1;
   r5=r5*5;
   r7=r5-r5%7;
   if (num<r7) return num%7+1;
}

The first code calculates it in modulo form. The second code is just plain math. Or I made a mistake somewhere. :-)


Assuming rand gives equal weighting to all bits, then masks with the upper bound.

int i = rand(5) ^ (rand(5) & 2);

rand(5) can only return: 1b, 10b, 11b, 100b, 101b. You only need to concern yourself with sometimes setting the 2 bit.


Why not do it simple?

int random7() {
  return random5() + (random5() % 3);
}

The chances of getting 1 and 7 in this solution are lower due to the modulo; however, if you just want a quick and readable solution, this is the way to go.
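
To see exactly how skewed it is, here is a small sketch that enumerates all 25 equally likely pairs of random5() results (1..5 each) and tallies the output:

```python
from collections import Counter
from itertools import product

# a is the first random5() result, b the second; the output is a + (b % 3).
dist = Counter(a + (b % 3) for a, b in product(range(1, 6), repeat=2))

for value in range(1, 8):
    print(value, f"{dist[value]}/25")
```

The middle values 3, 4 and 5 each come up 5 times out of 25, while 1 comes up once and 7 twice.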


int rand7() {
    while (true) { // lowest base 5 random number > 7 reduces memory
        int num = (rand5()-1)*5 + rand5()-1;
        if (num < 21)  // improves performance
            return 1 + num%7;
    }
}

Python code:

from random import randint
def rand7():
    while(True):
        num = (randint(1, 5)-1)*5 + randint(1, 5)-1
        if num < 21:
                return 1 + num%7

Test distribution for 100000 runs:

>>> rnums = []
>>> for _ in range(100000):
    rnums.append(rand7())
>>> {n:rnums.count(n) for n in set(rnums)}
{1: 15648, 2: 15741, 3: 15681, 4: 15847, 5: 15642, 6: 15806, 7: 15635}

how about this

rand5()%2+rand5()%2+rand5()%2+rand5()%2+rand5()%2+rand5()%2

Not sure this is uniform distributed. Any suggestions?


This is similar to @RobMcAfee's answer, except that I use a magic number instead of a 2-dimensional array.

int rand7() {
    int m = 1203068;
    int r = (m >> (rand5() - 1) * 5 + rand5() - 1) & 7;

    return (r > 0) ? r : rand7();
}
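
If you want to see what the magic number encodes, here is a quick sketch that reads out the 3-bit window at each of the 25 possible shift amounts (0..24, from the two rand5() calls):

```python
from collections import Counter

M = 1203068  # same constant as in the C code above

# (M >> shift) & 7 is the value the C expression yields for a given shift.
windows = Counter((M >> shift) & 7 for shift in range(25))

print(sorted(windows.items()))
```

Each of 1..7 shows up in exactly three windows and 0 (the retry case) in four, which matches the layout of the 5x5 table approach.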

I know it has been answered, but this seems to work OK, though I cannot tell you if it has a bias. My 'testing' suggests it is, at least, reasonable.

Perhaps Adam Rosenfield would be kind enough to comment?

My (naive?) idea is this:

Accumulate rand5's until there are enough random bits to make a rand7. This takes at most 2 rand5's. To get the rand7 number I use the accumulated value mod 7.

To avoid the accumulator overflowing, and since the accumulator is mod 7 then I take the mod 7 of the accumulator:

(5a + rand5) % 7 = (k*7 + (5a%7) + rand5) % 7 = ( (5a%7) + rand5) % 7

The rand7() function follows:

(I let the range of rand5 be 0-4 and rand7 is likewise 0-6.)

int rand7(){
  static int    a=0;
  static int    e=0;
  int       r;
  a = a * 5 + rand5();
  e = e + 5;        // added 5/7ths of a rand7 number
  if ( e<7 ){
    a = a * 5 + rand5();
    e = e + 5;  // another 5/7ths
  }
  r = a % 7;
  e = e - 7;        // removed a rand7 number
  a = a % 7;
  return r;
}

Edit: Added results for 100 million trials.

'Real' rand functions mod 5 or 7

rand5 : avg=1.999802 0:20003944 1:19999889 2:20003690 3:19996938 4:19995539
rand7 : avg=3.000111 0:14282851 1:14282879 2:14284554 3:14288546 4:14292388 5:14288736 6:14280046

My rand7

Average looks ok and number distributions look ok too.

randt : avg=3.000080 0:14288793 1:14280135 2:14287848 3:14285277 4:14286341 5:14278663 6:14292943


I think I have four answers: two giving exact solutions like that of @Adam Rosenfield but without the infinite loop problem, and two others giving almost perfect solutions but with a faster implementation than the first one.

The best exact solution requires 7 calls to rand5, but let's proceed in order to understand.

Method 1 - Exact

Strength of Adam's answer is that it gives a perfect uniform distribution, and there is very high probability (21/25) that only two calls to rand5() will be needed. However, worst case is infinite loop.

The first solution below also gives a perfect uniform distribution, but requires a total of 42 calls to rand5. No infinite loops.

Here is an R implementation:

rand5 <- function() sample(1:5,1)

rand7 <- function()  (sum(sapply(0:6, function(i) i + rand5() + rand5()*2 + rand5()*3 + rand5()*4 + rand5()*5 + rand5()*6)) %% 7) + 1

For people not familiar with R, here is a simplified version:

rand7 = function(){
  r = 0 
  for(i in 0:6){
    r = r + i + rand5() + rand5()*2 + rand5()*3 + rand5()*4 + rand5()*5 + rand5()*6
  }
  return r %% 7 + 1
}

The distribution of rand5 will be preserved. If we do the math, each of the 7 iterations of the loop has 5^6 possible combinations, thus total number of possible combinations are (7 * 5^6) %% 7 = 0. Thus we can divide the random numbers generated in equal groups of 7. See method two for more discussion on this.

Here are all the possible combinations:

table(apply(expand.grid(c(outer(1:5,0:6,"+")),(1:5)*2,(1:5)*3,(1:5)*4,(1:5)*5,(1:5)*6),1,sum) %% 7 + 1)

    1     2     3     4     5     6     7 
15625 15625 15625 15625 15625 15625 15625 

I think it's straight forward to show that Adam's method will run much much faster. The probability that there are 42 or more calls to rand5 in Adam's solution is very small ((4/25)^21 ~ 10^(-17)).

Method 2 - Not Exact

Now the second method, which is almost uniform, but requires 6 calls to rand5:

rand7 <- function() (sum(sapply(1:6,function(i) i*rand5())) %% 7) + 1

Here is a simplified version:

rand7 = function(){
  r = 0 
  for(i in 1:6){
    r = r + i*rand5()
  }
  return r %% 7 + 1
}

This is essentially one iteration of method 1. If we generate all possible combinations, here is resulting counts:

table(apply(expand.grid(1:5,(1:5)*2,(1:5)*3,(1:5)*4,(1:5)*5,(1:5)*6),1,sum) %% 7 + 1)

   1    2    3    4    5    6    7 
2233 2232 2232 2232 2232 2232 2232

One number will appear once more in 5^6 = 15625 trials.
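
For readers without R at hand, the same count can be reproduced with a short Python sketch (a check of the table above, not a new method), enumerating all 5^6 combinations of six rand5() results in 1..5:

```python
from collections import Counter
from itertools import product

# Weighted sum 1*x1 + 2*x2 + ... + 6*x6, reduced mod 7 and shifted to 1..7.
counts = Counter(
    sum(i * x for i, x in zip(range(1, 7), draws)) % 7 + 1
    for draws in product(range(1, 6), repeat=6)
)

print(sorted(counts.items()))
```

The result 1 gets 2233 combinations and every other result gets 2232, as in the R table.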

Now, in Method 1, by adding 1 to 6, we move the number 2233 to each of the successive point. Thus the total number of combinations will match up. This works because 5^6 %% 7 = 1, and then we do 7 appropriate variations, so (7 * 5^6 %% 7 = 0).

Method 3 - Exact

If the argument of method 1 and 2 is understood, method 3 follows, and requires only 7 calls to rand5. At this point, I feel this is the minimum number of calls needed for an exact solution.

Here is an R implementation:

rand5 <- function() sample(1:5,1)

rand7 <- function()  (sum(sapply(1:7, function(i) i * rand5())) %% 7) + 1

For people not familiar with R, here is a simplified version:

rand7 = function(){
  r = 0 
  for(i in 1:7){
    r = r + i * rand5()
  }
  return r %% 7 + 1
}

The distribution of rand5 will be preserved. If we do the math, each of the 7 iterations of the loop has 5 possible outcomes, thus total number of possible combinations are (7 * 5) %% 7 = 0. Thus we can divide the random numbers generated in equal groups of 7. See method one and two for more discussion on this.

Here are all the possible combinations:

table(apply(expand.grid(0:6,(1:5)),1,sum) %% 7 + 1)

1 2 3 4 5 6 7  
5 5 5 5 5 5 5 

I think it's straight forward to show that Adam's method will still run faster. The probability that there are 7 or more calls to rand5 in Adam's solution is still small ((4/25)^3 ~ 0.004).

Method 4 - Not Exact

This is a minor variation of the second method. It is almost uniform, but requires 7 calls to rand5, that is, one more than method 2:

rand7 <- function() ((rand5() + sum(sapply(1:6,function(i) i*rand5()))) %% 7) + 1

Here is a simplified version:

rand7 = function(){
  r = 0 
  for(i in 1:6){
    r = r + i*rand5()
  }
  return (r+rand5()) %% 7 + 1
}

If we generate all possible combinations, here is resulting counts:

table(apply(expand.grid(1:5,(1:5)*2,(1:5)*3,(1:5)*4,(1:5)*5,(1:5)*6,1:5),1,sum) %% 7 + 1)

    1     2     3     4     5     6     7 
11160 11161 11161 11161 11161 11161 11160

Two of the numbers (1 and 7) each appear one time fewer than the others across the 5^7 = 78125 combinations. For most purposes, I can live with that.


Why don't you just divide by 5 and multiply by 7, and then round? (Granted, you would have to use floating-point numbers.)

It's much easier and more reliable (really?) than the other solutions. E.g. in Python:

import random

def randomNo7():
    rand5 = random.randint(0, 4)    # produces range: [0, 4]
    rand7 = int(rand5 * 7 / 5.0)    # scale by 7/5, then truncate
    # note: five inputs can only map to five outputs (0, 1, 2, 4, 5)
    return rand7

Wasn't that easy?


Here is an answer taking advantage of features in C++11:

#include <functional>
#include <iostream>
#include <ostream>
#include <random>

int main()
{
    std::random_device rd;
    unsigned long seed = rd();
    std::cout << "seed = " << seed << std::endl;

    std::mt19937 engine(seed);

    std::uniform_int_distribution<> dist(1, 5);
    auto rand5 = std::bind(dist, engine);

    const int n = 20;
    for (int i = 0; i != n; ++i)
    {
        std::cout << rand5() << " ";
    }
    std::cout << std::endl;

    // Use a lambda expression to define rand7
    auto rand7 = [&rand5]()->int
    {
        for (int result = 0; ; result = 0)
        {
            // Take advantage of the fact that
            // 5**6 = 15625 = 15624 + 1 = 7 * (2232) + 1.
            // So we only have to discard one out of every 15625 numbers generated.

            // Generate a 6-digit number in base 5
            for (int i = 0; i != 6; ++i)
            {
                result = 5 * result + (rand5() - 1);
            }

            // result is in the range [0, 15625)
            if (result == 15625 - 1)
            {
                // Discard this number
                continue;
            }

            // We now know that result is in the range [0, 15624), a range that can
            // be divided evenly into 7 buckets guaranteeing uniformity
            result /= 2232;
            return 1 + result;
        }
    };

    for (int i = 0; i != n; ++i)
    {
        std::cout << rand7() << " ";
    }
    std::cout << std::endl;

    return 0;
}

Solution in PHP:

<?php
function random_5(){
    return rand(1,5);
}


function random_7(){
 $total = 0;

    for($i=0;$i<7;$i++){
        $total += random_5();
    }

    return ($total%7)+1; 
}

echo random_7();
?>

Here's my general implementation, to generate a uniform in the range [0,N-1] given a uniform generator in the range [0,B-1].

import java.util.Random;

public class RandomUnif {

    public static final int BASE_NUMBER = 5;

    private static Random rand = new Random();

    /** given generator, returns uniform integer in the range 0..BASE_NUMBER-1 */
    public static int randomBASE() {
        return rand.nextInt(BASE_NUMBER);
    }

    /** returns uniform integer in the range 0..n-1 using randomBASE() */
    public static int randomUnif(int n) {
        int rand, factor;
        if( n <= 1 ) return 0;
        else if( n == BASE_NUMBER ) return randomBASE();
        if( n < BASE_NUMBER ) {
            factor = BASE_NUMBER / n;
            do
                rand = randomBASE() / factor;
            while(rand >= n);
            return rand;
        } else {
            factor = (n - 1) / BASE_NUMBER + 1;
            do {
                rand = factor * randomBASE() + randomUnif(factor);
            } while(rand >= n);
            return rand;
        }
    }
}

Not spectacularly efficient, but general and compact. Mean calls to base generator:

 n  calls
 2  1.250 
 3  1.644 
 4  1.252 
 5  1.000 
 6  3.763 
 7  3.185 
 8  2.821 
 9  2.495 
10  2.250 
11  3.646 
12  3.316 
13  3.060 
14  2.853 
15  2.650 
16  2.814 
17  2.644 
18  2.502 
19  2.361 
20  2.248 
21  2.382 
22  2.277 
23  2.175 
24  2.082 
25  2.000 
26  5.472 
27  5.280 
28  5.119 
29  4.899 

There is no (exactly correct) solution which will run in a constant amount of time, since 1/7 has a non-terminating expansion in base 5. One simple solution would be to use rejection sampling, e.g.:


int i;
do
{
  i = 5 * (rand5() - 1) + rand5();  // i is now uniformly random between 1 and 25
} while(i > 21);
// i is now uniformly random between 1 and 21
return i % 7 + 1;  // result is now uniformly random between 1 and 7

This has an expected runtime of 25/21 = 1.19 iterations of the loop. The probability of looping forever is zero, but there is no fixed upper bound on the number of iterations.
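
The same rejection-sampling idea can be sketched in Python. This is only an illustration: `rand5` here is a stand-in built on the standard `random` module, and `combine` and `accepted_counts` are hypothetical helper names added to make the counting argument checkable.

```python
import random
from collections import Counter
from itertools import product

def rand5():
    """Stand-in for the given generator: uniform integer in 1..5."""
    return random.randint(1, 5)

def combine(a, b):
    """Map two rand5 results to one value in 1..25; each value arises from exactly one pair."""
    return 5 * (a - 1) + b

def rand7():
    """Rejection sampling: retry until the combined value falls in 1..21."""
    while True:
        i = combine(rand5(), rand5())
        if i <= 21:
            return i % 7 + 1

def accepted_counts():
    """Count the outputs over all 25 equally likely pairs, skipping the rejected ones."""
    return Counter(combine(a, b) % 7 + 1
                   for a, b in product(range(1, 6), repeat=2)
                   if combine(a, b) <= 21)
```

Of the 25 equally likely pairs, 21 are accepted and each output from 1 to 7 is produced by exactly 3 of them, which is why the result is uniform.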


int getOneToSeven(){
    int added = 0;
    for(int i = 1; i<=7; i++){
        added += getOneToFive();
    }
    return (added)%7+1;
}

int randbit( void )
{
    while( 1 )
    {
        int r = rand5();
        if( r <= 4 ) return(r & 1);
    }
}

int randint( int nbits )
{
    int result = 0;
    while( nbits-- )
    {
        result = (result<<1) | randbit();
    }
    return( result );
}

int rand7( void )
{
    while( 1 )
    {
        int r = randint( 3 ) + 1;
        if( r <= 7 ) return( r );
    }
}

import random

def rand5():
    return random.randint(1, 5)   # return random integers from 1 to 5

def rand7():
    rand = rand5() + rand5() - 1
    if rand > 7:                  # if the number is > 7, call rand7() again
        return rand7()
    return rand % 7 + 1

I guess this will be the easiest solution, but everywhere people have suggested 5*rand5() + rand5() - 5, like in http://www.geeksforgeeks.org/generate-integer-from-1-to-7-with-equal-probability/. Can someone explain what is wrong with rand5()+rand5()-1?
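
One way to see the difference is to count how often each value arises over the 25 equally likely pairs of rand5 results. The sketch below is an illustration only; `sum_counts` and `base5_counts` are hypothetical helper names.

```python
from collections import Counter
from itertools import product

PAIRS = list(product(range(1, 6), repeat=2))  # all 25 (rand5, rand5) outcomes

def sum_counts():
    """How often each value of rand5() + rand5() - 1 occurs."""
    return Counter(a + b - 1 for a, b in PAIRS)

def base5_counts():
    """How often each value of 5*rand5() + rand5() - 5 occurs."""
    return Counter(5 * a + b - 5 for a, b in PAIRS)

print(sorted(sum_counts().items()))
print(sorted(base5_counts().items()))
```

The plain sum piles up in the middle (5 occurs five times, 1 only once), so rejecting values above 7 still leaves a lopsided distribution. The 5*rand5() + rand5() - 5 form hits each of 1 to 25 exactly once, which is what makes rejection sampling on it uniform.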


int rand7() {
    int value = rand5()
              + rand5() * 2
              + rand5() * 3
              + rand5() * 4
              + rand5() * 5
              + rand5() * 6;
    return value%7;
}

Unlike the chosen solution, the algorithm will run in constant time. It does however make 2 more calls to rand5 than the average run time of the chosen solution.

Note that this generator is not perfect (the number 0 has 0.0064% more chance than any other number), but for most practical purposes the guarantee of constant time probably outweighs this inaccuracy.
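
The size of that bias can be checked by enumerating all 5^6 = 15625 call sequences. This is a small Python sketch, assuming rand5 returns 1 to 5; `residue_counts` and `WEIGHTS` are hypothetical names used only for this check.

```python
from collections import Counter
from itertools import product

WEIGHTS = (1, 2, 3, 4, 5, 6)  # the multipliers used in the C code above

def residue_counts():
    """Count value % 7 over every possible sequence of six rand5 results."""
    counts = Counter()
    for draws in product(range(1, 6), repeat=6):
        value = sum(w * d for w, d in zip(WEIGHTS, draws))
        counts[value % 7] += 1
    return counts

print(sorted(residue_counts().items()))
```

The result 0 occurs 2233 times and every other result 2232 times, a difference of 1 in 15625, which is the 0.0064% quoted above.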

Explanation

This solution is derived from the fact that the number 15,624 is divisible by 7 and thus if we can randomly and uniformly generate numbers from 0 to 15,624 and then take mod 7 we can get a near-uniform rand7 generator. Numbers from 0 to 15,624 can be uniformly generated by rolling rand5 6 times and using them to form the digits of a base 5 number as follows:

rand5 * 5^5 + rand5 * 5^4 + rand5 * 5^3 + rand5 * 5^2 + rand5 * 5 + rand5

Properties of mod 7 however allow us to simplify the equation a bit:

5^5 = 3 mod 7
5^4 = 2 mod 7
5^3 = 6 mod 7
5^2 = 4 mod 7
5^1 = 5 mod 7

So

rand5 * 5^5 + rand5 * 5^4 + rand5 * 5^3 + rand5 * 5^2 + rand5 * 5 + rand5

becomes

rand5 * 3 + rand5 * 2 + rand5 * 6 + rand5 * 4 + rand5 * 5 + rand5

Theory

The number 15,624 was not chosen randomly, but can be discovered using Fermat's little theorem, which states that if p is a prime number and a is not divisible by p, then

a^(p-1) = 1 mod p

So this gives us,

(5^6)-1 = 0 mod 7

(5^6)-1 is equal to

4 * 5^5 + 4 * 5^4 + 4 * 5^3 + 4 * 5^2 + 4 * 5 + 4

This is a number in base 5 form, so the same method can be used to go from a generator with a outcomes to a generator with p outcomes whenever p is prime and does not divide a. A small bias towards 0 is always introduced, though, when using the exponent p-1.

To generalize this approach and to be more accurate we can have a function like this:

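import random

def getRandomUniform(frm):
    # assumed helper (not in the original): uniform integer in [0, frm-1]
    return random.randrange(frm)

def getRandomconverted(frm, to):
    # build a `to`-digit number in base `frm`, uniform in [0, frm**to - 1]
    s = 0
    for i in range(to):
        s += getRandomUniform(frm) * frm**i
    # largest multiple of `to` that does not exceed the number of possible values
    mx = (frm**to // to) * to
    if s < mx:
        return s % to
    else:
        # s fell in the leftover tail, so retry
        return getRandomconverted(frm, to)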

package CareerCup;

public class RangeTransform {
 static int counter = (int)(Math.random() * 5 + 1);

 private int func() {
  return (int) (Math.random() * 5 + 1);
 }

 private int getMultiplier() {
  return counter % 5 + 1;
 }

 public int rangeTransform() {
  counter++;
  int count = getMultiplier();
  int mult = func() + 5 * count;
  System.out.println("Mult is : " + 5 * count);
  return (mult) % 7 + 1;
 }

 /**
  * @param args
  */
 public static void main(String[] args) {
  // TODO Auto-generated method stub
  RangeTransform rangeTransform = new RangeTransform();
  for (int i = 0; i < 35; i++)
   System.out.println("Val is : " + rangeTransform.rangeTransform());
 }
}

I think y'all are overthinking this. Doesn't this simple solution work?

int rand7(void)
{
    static int startpos = 0;
    startpos = (startpos+5) % (5*7);
    return (((startpos + rand5()-1)%7)+1);
}

#!/usr/bin/env ruby
class Integer
  def rand7
    rand(7)+1   # rand(7) returns 0..6
  end
end

def rand5
  rand(5)+1   # rand(5) returns 0..4
end

x = rand5() # x => int between 1 and 5

y = x.rand7() # y => int between 1 and 7

..although that may possibly be considered cheating..


Similar to Martin's answer, but resorts to throwing entropy away much less frequently:

#include <limits.h>  /* INT_MAX */

int rand7(void) {
  static int m = 1;
  static int r = 0;

  for (;;) {
    while (m <= INT_MAX / 5) {
      r = r + m * (rand5() - 1);
      m = m * 5;
    }
    int q = m / 7;
    if (r < q * 7) {
      int i = r % 7;
      r = r / 7;
      m = q;
      return i + 1;
    }
    r = r - q * 7;
    m = m - q * 7;
  }
}

Here we build up a random value between 0 and m-1, and try to maximise m by adding as much state as will fit without overflow (INT_MAX being the largest value that will fit in an int in C, or you can replace that with any large value that makes sense in your language and architecture).

Then, if r falls within the largest possible interval evenly divisible by 7, it contains a viable result: we divide that interval by 7, take the remainder as our result, and return the rest of the value to our entropy pool. Otherwise r is in the other interval, which doesn't divide evenly, and we have to discard and restart our entropy pool from that ill-fitting interval.

Compared with the popular answers in here, it calls rand5() about half as often on average.

The divides can be factored out into trivial bit-twiddles and LUTs for performance.
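
For readers who prefer to experiment, here is a rough Python port of the pooling scheme described above. It is a sketch under stated assumptions: `rand5` is a stand-in built on the `random` module, `EntropyPool` and its `source` parameter are hypothetical names, and `LIMIT` merely plays the role of INT_MAX since Python integers do not overflow.

```python
import random

def rand5():
    """Stand-in for the given generator: uniform integer in 1..5."""
    return random.randint(1, 5)

class EntropyPool:
    """Keeps r uniform on [0, m) between calls, mirroring the static m and r in the C code."""
    LIMIT = 2**31 - 1  # assumed bound standing in for INT_MAX

    def __init__(self, source=rand5):
        self.source = source
        self.m = 1
        self.r = 0

    def rand7(self):
        while True:
            # top up the pool with base-5 digits while there is room
            while self.m <= self.LIMIT // 5:
                self.r += self.m * (self.source() - 1)
                self.m *= 5
            q = self.m // 7
            if self.r < q * 7:
                # r is uniform on [0, 7q): remainder is the result, quotient stays pooled
                result = self.r % 7
                self.r //= 7
                self.m = q
                return result + 1
            # r landed in the leftover strip [7q, m): keep it as a smaller pool
            self.r -= q * 7
            self.m -= q * 7
```

Passing a counting wrapper as `source` is an easy way to measure how many rand5 calls each output costs on average.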


(I have stolen Adam Rosenfield's answer and made it run about 7% faster.)

Assume that rand5() returns one of {0,1,2,3,4} with equal distribution and the goal is return {0,1,2,3,4,5,6} with equal distribution.

int rand7() {
  int i = 5 * rand5() + rand5();
  int max = 25;
  //i is uniform among {0 ... max-1}
  while(i < max%7) {
    //i is uniform among {0 ... (max%7 - 1)}
    i *= 5;
    i += rand5(); //i is uniform {0 ... (((max%7)*5) - 1)}
    max %= 7;
    max *= 5; //once again, i is uniform among {0 ... max-1}
  }
  return(i%7);
}

We're keeping track of the largest value that the loop can make in the variable max. If the result so far is between max%7 and max-1 then the result will be uniformly distributed in that range. If not, we use the remainder, which is random between 0 and max%7-1, and another call to rand5() to make a new number and a new max. Then we start again.

Edit: The expected number of calls to rand5() is x in this equation:

x =  2     * 21/25
   + 3     *  4/25 * 14/20
   + 4     *  4/25 *  6/20 * 28/30
   + 5     *  4/25 *  6/20 *  2/30 * 7/10
   + 6     *  4/25 *  6/20 *  2/30 * 3/10 * 14/15
   + (6+x) *  4/25 *  6/20 *  2/30 * 3/10 *  1/15
x = about 2.21 calls to rand5()
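
The equation above is linear in x, so it can be solved exactly. The following Python sketch does that with exact fractions; `expected_calls` is a hypothetical helper name and the rejection probabilities are copied from the equation.

```python
from fractions import Fraction as Fr

def expected_calls():
    """Solve x = fixed + (6 + x) * reach for x, using the terms of the equation above."""
    reject = [Fr(4, 25), Fr(6, 20), Fr(2, 30), Fr(3, 10), Fr(1, 15)]
    reach = Fr(1)   # probability of getting as far as the current stage
    fixed = Fr(0)   # sum of the terms that do not involve x
    for calls, rej in enumerate(reject, start=2):
        fixed += calls * reach * (1 - rej)   # accepted at this stage after `calls` calls
        reach *= rej                         # rejected, move on to the next stage
    # the last term of the equation is (6 + x) * reach
    return (fixed + 6 * reach) / (1 - reach)

print(float(expected_calls()))
```

This evaluates to roughly 2.212, in line with the "about 2.21" figure.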

Why won't this work? Other than the one extra call to rand5()?

i = rand5() + rand5() + (rand5() - 1) //Random number between 2 and 14

i = i % 7 + 1;

Assuming that rand(n) here means "random integer in a uniform distribution from 0 to n-1", here's a code sample using Python's randint, which has that effect. It uses only randint(5), and constants, to produce the effect of randint(7). A little silly, actually

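from random import randint

total = 7
while total >= 7:
    first = randint(0, 4)        # randint is inclusive, so this is 0..4
    toadd = 9999
    while toadd > 1:             # keep only 0 or 1, i.e. a fair coin
        toadd = randint(0, 4)
    if toadd:
        total = first + 5
    else:
        total = first

assert 7 > total >= 0
print(total)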

This answer is more an experiment in obtaining the most entropy possible from the Rand5 function. It is therefore somewhat unclear and almost certainly a lot slower than other implementations.

Assuming the uniform distribution from 0-4 and resulting uniform distribution from 0-6:

public class SevenFromFive
{
  public SevenFromFive()
  {
    // this outputs a uniform distribution, but for some reason including it
    // screws up the output distribution
    // open question: why?
    this.fifth = new ProbabilityCondensor(5, b => {});
    this.eigth = new ProbabilityCondensor(8, AddEntropy);
  } 

  private static Random r = new Random();
  private static uint Rand5()
  {
    return (uint)r.Next(0,5);
  }

  private class ProbabilityCondensor
  {
    private readonly int samples;
    private int counter;
    private int store;
    private readonly Action<bool> output;

    public ProbabilityCondensor(int chanceOfTrueReciprocal,
      Action<bool> output)
    {
      this.output = output;
      this.samples = chanceOfTrueReciprocal - 1;  
    }

    public void Add(bool bit)
    {
      this.counter++;
      if (bit)
        this.store++;   
      if (counter == samples)
      {
        bool? e;
        if (store == 0)
          e = false;
        else if (store == 1)
          e = true;
        else
          e = null;// discard for now       
        counter = 0;
        store = 0;
        if (e.HasValue)
          output(e.Value);
      }
    }
  }

  ulong buffer = 0;
  const ulong Mask = 7UL;
  int bitsAvail = 0;
  private readonly ProbabilityCondensor fifth;
  private readonly ProbabilityCondensor eigth;

  private void AddEntropy(bool bit)
  {
    buffer <<= 1;
    if (bit)
      buffer |= 1;      
    bitsAvail++;
  }

  private void AddTwoBitsEntropy(uint u)
  {
    buffer <<= 2;
    buffer |= (u & 3UL);    
    bitsAvail += 2;
  }

  public uint Rand7()
  {
    uint selection;   
    do
    {
      while (bitsAvail < 3)
      {
        var x = Rand5();
        if (x < 4)
        {
          // put the two low order bits straight in
          AddTwoBitsEntropy(x);
          fifth.Add(false);
        }
        else
        { 
          fifth.Add(true);
        }
      }
      // read 3 bits
      selection = (uint)((buffer & Mask));
      bitsAvail -= 3;     
      buffer >>= 3;
      if (selection == 7)
        eigth.Add(true);
      else
        eigth.Add(false);
    }
    while (selection == 7);   
    return selection;
  }
}

The number of bits added to the buffer per call to Rand5 is currently 4/5 * 2 so 1.6. If the 1/5 probability value is included that increases by 0.05 so 1.65 but see the comment in the code where I have had to disable this.

Bits consumed by call to Rand7 = 3 + 1/8 * (3 + 1/8 * (3 + 1/8 * (...
This is 3 + 3/8 + 3/64 + 3/512 ... so approx 3.42

By extracting information from the sevens I reclaim 1/8*1/7 bits per call so about 0.018

This gives a net consumption 3.4 bits per call which means the ratio is 2.125 calls to Rand5 for every Rand7. The optimum should be 2.1.

I would imagine this approach is significantly slower than many of the other ones here unless the cost of the call to Rand5 is extremely expensive (say calling out to some external source of entropy).


rand7() = (rand5()+rand5()+rand5()+rand5()+rand5()+rand5()+rand5())%7+1

Edit: That doesn't quite work. It's off by about 2 parts in 1000 (assuming a perfect rand5). The buckets get:

value   Count  Error%
1       11158  -0.0035
2       11144  -0.0214
3       11144  -0.0214
4       11158  -0.0035
5       11172  +0.0144
6       11177  +0.0208
7       11172  +0.0144

Switching to a sum of n calls to rand5() shrinks the error:

n   Error%
10  +/- 1e-3
12  +/- 1e-4
14  +/- 1e-5
16  +/- 1e-6
...
28  +/- 3e-11

The error seems to drop by an order of magnitude for every 2 calls added.

BTW: the table of errors above was not generated via sampling but by the following recurrence relation:

p[x,n] is the number of ways output=x can happen given n calls to rand5.

  p[1,1] ... p[5,1] = 1
  p[6,1] ... p[7,1] = 0

  p[1,n] = p[7,n-1] + p[6,n-1] + p[5,n-1] + p[4,n-1] + p[3,n-1]
  p[2,n] = p[1,n-1] + p[7,n-1] + p[6,n-1] + p[5,n-1] + p[4,n-1]
  p[3,n] = p[2,n-1] + p[1,n-1] + p[7,n-1] + p[6,n-1] + p[5,n-1]
  p[4,n] = p[3,n-1] + p[2,n-1] + p[1,n-1] + p[7,n-1] + p[6,n-1]
  p[5,n] = p[4,n-1] + p[3,n-1] + p[2,n-1] + p[1,n-1] + p[7,n-1]
  p[6,n] = p[5,n-1] + p[4,n-1] + p[3,n-1] + p[2,n-1] + p[1,n-1]
  p[7,n] = p[6,n-1] + p[5,n-1] + p[4,n-1] + p[3,n-1] + p[2,n-1]
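
The recurrence is easy to run. Below is a minimal Python sketch of it, indexed by residue 0 to 6 instead of 1 to 7 (residue 0 plays the role of x = 7); `residue_counts` is a hypothetical name and rand5 is assumed to return 1 to 5.

```python
def residue_counts(n):
    """p[x] = number of ways that n rand5 results (each 1..5) sum to x modulo 7."""
    p = [0] * 7
    for v in range(1, 6):
        p[v % 7] = 1                      # one call: residues 1..5 once each
    for _ in range(n - 1):
        # a sum lands on x if the previous sum was x - v for some v in 1..5
        p = [sum(p[(x - v) % 7] for v in range(1, 6)) for x in range(7)]
    return p

print(residue_counts(7))
```

For n = 7 this produces the same seven counts as the table above (11144 twice, 11158 twice, 11172 twice and 11177 once), with the largest count on residue 0. Which output label each count lands on depends on the indexing convention used.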

int rand7()
{
    return ( rand5() + (rand5()%3) );
}
  1. rand5() - Returns values from 1-5
  2. rand5()%3 - Returns values from 0-2
  3. So, when summing up the total value will be between 1-7
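
The range is indeed 1-7, but it is worth checking how evenly the values are spread. This Python sketch counts every outcome over the 25 equally likely pairs of rand5 results; `sum_with_mod3_counts` is a hypothetical helper name.

```python
from collections import Counter
from itertools import product

def sum_with_mod3_counts():
    """Count each value of rand5() + (rand5() % 3) over all 25 pairs of results."""
    return Counter(a + b % 3 for a, b in product(range(1, 6), repeat=2))

counts = sum_with_mod3_counts()
print([counts[v] for v in range(1, 8)])
```

The counts for 1 through 7 come out as 1, 3, 5, 5, 5, 4, 2 out of 25, so the values are in range but far from equally likely.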