Picking a random element from a set

Question

How do I pick a random element from a set  I m particularly interested in picking a random element from a HashSet or a LinkedHashSet  in Java  Solutions for other languages are also welcome

User · Answer

The easiest with Java 8 is   outbound stream   skip n   outbound size    findFirst   get     where n is a random integer  Of course it is of less performance than that with the for elem  Col

User · Answer

A generic solution using Khoth s answer as a starting point           param set a Set in which to look for a random element     param  lt T gt  generic type of the Set elements     return a random element in the Set or null if the set is empty     public  lt T gt  T randomElement Set lt T gt  set        int size   set size        int item   random nextInt size       int i   0      for  T obj   set            if  i    item                return obj                    i              return null

User · Answer

C    This should be reasonably quick  as it doesn t require iterating over the whole set  or sorting it  This should work out of the box with most modern compilers  assuming they support tr1  If not  you may need to use Boost   The Boost docs are helpful here to explain this  even if you don t use Boost   The trick is to make use of the fact that the data has been divided into buckets  and to quickly identify a randomly chosen bucket  with the appropriate probability       include  lt boost unordered set hpp gt      using namespace boost   include  lt tr1 unordered set gt  using namespace std  tr1   include  lt iostream gt   include  lt stdlib h gt   include  lt assert h gt  using namespace std   int main       unordered set lt int gt  u    u max load factor 40     for  int i 0  i lt 40  i          u insert i       cout  lt  lt       lt  lt  i        cout  lt  lt  endl    cout  lt  lt   Number of buckets     lt  lt  u bucket count    lt  lt  endl     for size t b 0  b lt u bucket count    b        cout  lt  lt   Bucket    lt  lt  b  lt  lt    has    lt  lt  u bucket size b   lt  lt    elements     lt  lt  endl     for size t i 0  i lt 20  i          size t x   rand     u size        cout  lt  lt   we ll quickly get the    lt  lt  x  lt  lt   th item in the unordered set         size t b      for b 0  b lt u bucket count    b            if x  lt  u bucket size b             break          else         x -  u bucket size b             cout  lt  lt   it ll be in the    lt  lt  b  lt  lt   th bucket at offset    lt  lt  x  lt  lt            unordered set lt int gt   const local iterator l   u begin b       while x gt 0          l          assert l  u end b          x--            cout  lt  lt   random item is    lt  lt   l  lt  lt            cout  lt  lt  endl

User · Answer

How about just  public static  lt A gt  A getRandomElement Collection lt A gt  c  Random r      return new ArrayList lt A gt  c  get r nextInt c size

User · Answer

Clojure solution    defn pick-random  set   let  sq  seq set    nth sq  rand-int  count sq

User · Answer

In Java 8   static  lt E gt  E getRandomSetElement Set lt E gt  set        return set stream   skip new Random   nextInt set size     findFirst   orElse null

User · Answer

Clojure solution    defn pick-random  set   let  sq  seq set    nth sq  rand-int  count sq

User · Answer

This is faster than the for-each loop in the accepted answer:

int index = rand.nextInt(set.size());
Iterator<Object> iter = set.iterator();
for (int i = 0; i < index; i++) {
    iter.next();
}
return iter.next();

The for-each construct calls Iterator.hasNext() on every loop, but since index < set.size(), that check is unnecessary overhead. I saw a 10-20% boost in speed, but YMMV. (Also, this compiles without having to add an extra return statement.)

Note that this code (and most other answers) can be applied to any Collection, not just Set. In generic method form:

public static <E> E choice(Collection<? extends E> coll, Random rand) {
    if (coll.size() == 0) {
        return null; // or throw IAE, if you prefer
    }

    int index = rand.nextInt(coll.size());
    if (coll instanceof List) { // optimization
        return ((List<? extends E>) coll).get(index);
    } else {
        Iterator<? extends E> iter = coll.iterator();
        for (int i = 0; i < index; i++) {
            iter.next();
        }
        return iter.next();
    }
}

User · Answer

PHP  using MT    items array   array  alpha    bravo    charlie     last pos   count  items array  - 1   random pos   mt rand 0   last pos    random item    items array  random pos

User · Answer

In C           Random random   new Random  int DateTime Now Ticks            OrderedDictionary od   new OrderedDictionary             od Add  abc   1           od Add  def   2           od Add  ghi   3           od Add  jkl   4             int randomIndex   random Next od Count            Console WriteLine od randomIndex                Can access via index or key value          Console WriteLine od 1            Console WriteLine od  def

User · Answer

A somewhat related Did You Know    There are useful methods in java util Collections for shuffling whole collections  Collections shuffle List lt   gt   and Collections shuffle List lt   gt  list  Random rnd

User · Answer

If set size is not large then by using Arrays this can be done   int random  HashSet someSet   lt Type gt    randData  random   new Random System currentTimeMillis  nextInt someSet size     randData   someSet toArray     lt Type gt  sResult   randData random

User · Answer

Fast solution for Java using an ArrayList and a HashMap   element -  index    Motivation  I needed a set of items with RandomAccess properties  especially to pick a random item from the set  see pollRandom method   Random navigation in a binary tree is not accurate  trees are not perfectly balanced  which would not lead to a uniform distribution   public class RandomSet lt E gt  extends AbstractSet lt E gt         List lt E gt  dta   new ArrayList lt E gt         Map lt E  Integer gt  idx   new HashMap lt E  Integer gt          public RandomSet                public RandomSet Collection lt E gt  items            for  E item   items                idx put item  dta size                 dta add item                         Override     public boolean add E item            if  idx containsKey item                 return false                    idx put item  dta size             dta add item           return true                        Override element at position  lt code gt id lt  code gt  with last element          param id             public E removeAt int id            if  id  gt   dta size                  return null                    E res   dta get id           idx remove res           E last   dta remove dta size   - 1              skip filling the hole if last is removed         if  id  lt  dta size                  idx put last  id               dta set id  last                     return res              Override     public boolean remove Object item             SuppressWarnings value    element-type-mismatch           Integer id   idx get item           if  id    null                return false                    removeAt id           return true             public E get int i            return dta get i              public E pollRandom Random rnd            if  dta isEmpty                  return null                    int id   rnd nextInt dta size             return removeAt id               Override     public int size             return dta size                Override     public Iterator lt E gt  iterator             return dta iterator

User · Answer

For fun I wrote a RandomHashSet based on rejection sampling. It's a bit hacky, since HashMap doesn't let us access it's table directly, but it should work just fine.

It doesn't use any extra memory, and lookup time is O(1) amortized. (Because java HashTable is dense).

class RandomHashSet<V> extends AbstractSet<V> {
    private Map<Object,V> map = new HashMap<>();
    public boolean add(V v) {
        return map.put(new WrapKey<V>(v),v) == null;
    }
    @Override
    public Iterator<V> iterator() {
        return new Iterator<V>() {
            RandKey key = new RandKey();
            @Override public boolean hasNext() {
                return true;
            }
            @Override public V next() {
                while (true) {
                    key.next();
                    V v = map.get(key);
                    if (v != null)
                        return v;
                }
            }
            @Override public void remove() {
                throw new NotImplementedException();
            }
        };
    }
    @Override
    public int size() {
        return map.size();
    }
    static class WrapKey<V> {
        private V v;
        WrapKey(V v) {
            this.v = v;
        }
        @Override public int hashCode() {
            return v.hashCode();
        }
        @Override public boolean equals(Object o) {
            if (o instanceof RandKey)
                return true;
            return v.equals(o);
        }
    }
    static class RandKey {
        private Random rand = new Random();
        int key = rand.nextInt();
        public void next() {
            key = rand.nextInt();
        }
        @Override public int hashCode() {
            return key;
        }
        @Override public boolean equals(Object o) {
            return true;
        }
    }
}

User · Answer

List asList   new ArrayList mySet   Collections shuffle asList   return asList get 0

User · Answer

Icon has a set type and a random-element operator, unary "?", so the expression

? set( [1, 2, 3, 4, 5] )

will produce a random number between 1 and 5.

The random seed is initialized to 0 when a program is run, so to produce different results on each run use randomize()

User · Answer

Since you said  Solutions for other languages are also welcome   here s the version for Python    gt  gt  gt  import random  gt  gt  gt  random choice  1 2 3 4 5 6   3  gt  gt  gt  random choice  1 2 3 4 5 6   4

User · Answer

you can also  transfer the set to array use array  it will probably work on small scale i see the for loop in the most voted answer is O n  anyway   Object   arr   set toArray     int v    int  arr rnd nextInt arr length

User · Answer

Can t you just get the size length of the set array  generate a random number between 0 and the size length  then call the element whose index matches that number  HashSet has a  size   method  I m pretty sure   In psuedocode -  function randFromSet target    var targetLength uint   target length    var randomIndex uint   random 0 targetLength    return target randomIndex

User · Answer

PHP  assuming  set  is an array    foo   array  alpha    bravo    charlie     index   array rand  foo    val    foo  index     The Mersenne Twister functions are better but there s no MT equivalent of array rand in PHP

User · Answer

In Mathematica   a    1  2  3  4  5   a     Length a  Random          Or  in recent versions  simply   RandomChoice a      This received a down-vote  perhaps because it lacks explanation  so here one is   Random   generates a pseudorandom float between 0 and 1   This is multiplied by the length of the list and then the ceiling function is used to round up to the next integer   This index is then extracted from a   Since hash table functionality is frequently done with rules in Mathematica  and rules are stored in lists  one might use   a     Badger  - gt  5   Bird  - gt  1   Fox  - gt  3   Frog  - gt  2   Wolf  - gt  4

User · Answer

Javascript solution     function choose  set        return set Math floor Math random     set length       var set     1  2  3  4   rand   choose  set     Or alternatively   Array prototype choose   function          return this Math floor Math random     this length         1  2  3  4  choose

User · Answer

If you really just want to pick  any  object from the Set  without any guarantees on the randomness  the easiest is taking the first returned by the iterator       Set lt Integer gt  s           Iterator lt Integer gt  it   s iterator        if it hasNext             Integer i   it next               i is a  random  object from set

User · Answer

With Guava we can do a little better than Khoth s answer   public static E random Set lt E gt  set      int index   random nextInt set size      if  set instanceof ImmutableSet           ImmutableSet asList   is O 1   as is  get   on the returned list     return set asList   get index         return Iterables get set  index

User · Answer

This is identical to accepted answer  Khoth   but with the unnecessary size and i variables removed       int random   new Random   nextInt myhashSet size         for Object obj   myhashSet            if  random--    0                return obj                    Though doing away with the two aforementioned variables  the above solution still remains random because we are relying upon random  starting at a randomly selected index  to decrement itself  toward 0 over each iteration

User · Answer

Solution above speak in terms of latency but doesn't guarantee equal probability of each index being selected.
If that needs to be considered, try reservoir sampling. http://en.wikipedia.org/wiki/Reservoir_sampling.
Collections.shuffle() (as suggested by few) uses one such algorithm.

User · Answer

If you don t mind a 3rd party library  the Utils library has a IterableUtils that has a randomFrom Iterable iterable  method that will take a Set and return a random element from it  Set lt Object gt  set   new HashSet lt  gt     set add           Object random   IterableUtils randomFrom set     It is in the Maven Central Repository at    lt dependency gt     lt groupId gt com github rkumsher lt  groupId gt     lt artifactId gt utils lt  artifactId gt     lt version gt 1 3 lt  version gt   lt  dependency gt

User · Answer

int size   myHashSet size    int item   new Random   nextInt size      In real life  the Random object should be rather more shared than this int i   0  for Object obj   myhashSet        if  i    item          return obj      i

User · Answer

In Java   Set lt Integer gt  set   new LinkedHashSet lt Integer gt  3   set add 1   set add 2   set add 3    Random rand   new Random System currentTimeMillis     int   setArray    int    set toArray    for  int i   0  i  lt  10    i        System out println setArray rand nextInt set size

User · Answer

If you want to do it in Java  you should consider copying the elements into some kind of random-access collection  such as an ArrayList    Because  unless your set is small  accessing the selected element will be expensive  O n  instead of O 1     ed  list copy is also O n    Alternatively  you could look for another Set implementation that more closely matches your requirements   The ListOrderedSet from Commons Collections looks promising

User · Answer

Unfortunately  this cannot be done efficiently  better than O n   in any of the Standard Library set containers   This is odd  since it is very easy to add a randomized pick function to hash sets as well as binary sets  In a not to sparse hash set  you can try random entries  until you get a hit  For a binary tree  you can choose randomly between the left or right subtree  with a maximum of O log2  steps  I ve implemented a demo of the later below   import random  class Node      def   init   self  object           self object   object         self value   hash object          self size   1         self a   self b   None  class RandomSet      def   init   self           self top   None      def add self  object               Add any hashable object to the set              Notice  In this simple implementation you shouldn t add two                     identical items              new   Node object          if not self top  self top   new         else  self  recursiveAdd self top  new      def  recursiveAdd self  top  new           top size    1         if new value  lt  top value              if not top a  top a   new             else  self  recursiveAdd top a  new          else              if not top b  top b   new             else  self  recursiveAdd top b  new       def pickRandom self               Pick a random item in O log2  time              Does a maximum of O log2  calls to random as well              return self  recursivePickRandom self top      def  recursivePickRandom self  top           r   random randrange top size          if r    0  return top object         elif top a and r  lt   top a size  return self  recursivePickRandom top a          return self  recursivePickRandom top b   if   name         main         s   RandomSet       for i in  5 3 7 1 4 6 9 2 8 0           s add i       dists    0  10     for i in xrange 10000           dists s pickRandom       1     print dists   I got  995  975  971  995  1057  1004  966  1052  984  1001  as output  so the distribution seams good   I ve struggled with the same problem for myself  and I haven t yet decided weather the performance gain of this more efficient pick is worth the overhead of using a python based collection  I could of course refine it and translate it to C  but that is too much work for me today

User · Answer

Perl 5   hash keys    keys  hash    rand   int rand  hash keys    print  hash  hash keys  rand      Here is one way to do it

User · Answer

after reading this thread  the best i could write is   static Random random   new Random System currentTimeMillis     public static  lt T gt  T randomChoice T   choices        int index   random nextInt choices length       return choices index

User · Answer

In lisp   defun pick-random  set          nth  random  length set   set

[java] Picking a random element from a set

The answer is

Examples related to java

Examples related to algorithm

Examples related to language-agnostic

Examples related to random

Examples related to set

Tags