[python] Understanding the set() function

In python, set() is an unordered collection with no duplicate elements. However, I am not able to understand how it generates the output.

For example, consider the following:

>>> x = [1, 1, 2, 2, 2, 2, 2, 3, 3]
>>> set(x)
set([1, 2, 3])

>>> y = [1, 1, 6, 6, 6, 6, 6, 8, 8]
>>> set(y)
set([8, 1, 6])

>>> z = [1, 1, 6, 6, 6, 6, 6, 7, 7]
>>> set(z)
set([1, 6, 7])

Shouldn't the output of set(y) be: set([1, 6, 8])? I tried the above two in Python 2.6.

This question is related to python set

The answer is


After reading the other answers, I still had trouble understanding why the set comes out un-ordered.

Mentioned this to my partner and he came up with this metaphor: take marbles. You put them in a tube a tad wider than marble width : you have a list. A set, however, is a bag. Even though you feed the marbles one-by-one into the bag; when you pour them from a bag back into the tube, they will not be in the same order (because they got all mixed up in a bag).


As an unordered collection type, set([8, 1, 6]) is equivalent to set([1, 6, 8]).

While it might be nicer to display the set contents in sorted order, that would make the repr() call more expensive.

Internally, the set type is implemented using a hash table: a hash function is used to separate items into a number of buckets to reduce the number of equality operations needed to check if an item is part of the set.

To produce the repr() output it just outputs the items from each bucket in turn, which is unlikely to be the sorted order.


Python's sets (and dictionaries) will iterate and print out in some order, but exactly what that order will be is arbitrary, and not guaranteed to remain the same after additions and removals.

Here's an example of a set changing order after a lot of values are added and then removed:

>>> s = set([1,6,8])
>>> print(s)
{8, 1, 6}
>>> s.update(range(10,100000))
>>> for v in range(10, 100000):
    s.remove(v)
>>> print(s)
{1, 6, 8}

This is implementation dependent though, and so you should not rely upon it.


As +Volatility and yourself pointed out, sets are unordered. If you need the elements to be in order, just call sorted on the set:

>>> y = [1, 1, 6, 6, 6, 6, 6, 8, 8]
>>> sorted(set(y))
[1, 6, 8]