There's an existing function that ends in the following, where d
is a dictionary:
return d.iteritems()
that returns an unsorted iterator for a given dictionary. I would like to return an iterator that goes through the items sorted by key. How do I do that?
This question is related to
python
sorting
dictionary
If you want to sort by the order that items were inserted instead of of the order of the keys, you should have a look to Python's collections.OrderedDict. (Python 3 only)
>>> import heapq
>>> d = {"c": 2, "b": 9, "a": 4, "d": 8}
>>> def iter_sorted(d):
keys = list(d)
heapq.heapify(keys) # Transforms to heap in O(N) time
while keys:
k = heapq.heappop(keys) # takes O(log n) time
yield (k, d[k])
>>> i = iter_sorted(d)
>>> for x in i:
print x
('a', 4)
('b', 9)
('c', 2)
('d', 8)
This method still has an O(N log N) sort, however, after a short linear heapify, it yields the items in sorted order as it goes, making it theoretically more efficient when you do not always need the whole list.
In general, one may sort a dict like so:
for k in sorted(d):
print k, d[k]
For the specific case in the question, having a "drop in replacement" for d.iteritems(), add a function like:
def sortdict(d, **opts):
# **opts so any currently supported sorted() options can be passed
for k in sorted(d, **opts):
yield k, d[k]
and so the ending line changes from
return dict.iteritems()
to
return sortdict(dict)
or
return sortdict(dict, reverse = True)
You can now use OrderedDict
in Python 2.7 as well:
>>> from collections import OrderedDict
>>> d = OrderedDict([('first', 1),
... ('second', 2),
... ('third', 3)])
>>> d.items()
[('first', 1), ('second', 2), ('third', 3)]
Here you have the what's new page for 2.7 version and the OrderedDict API.
If you want to sort by the order that items were inserted instead of of the order of the keys, you should have a look to Python's collections.OrderedDict. (Python 3 only)
Use the sorted()
function:
return sorted(dict.iteritems())
If you want an actual iterator over the sorted results, since sorted()
returns a list, use:
return iter(sorted(dict.iteritems()))
A dict's keys are stored in a hashtable so that is their 'natural order', i.e. psuedo-random. Any other ordering is a concept of the consumer of the dict.
sorted() always returns a list, not a dict. If you pass it a dict.items() (which produces a list of tuples), it will return a list of tuples [(k1,v1), (k2,v2), ...] which can be used in a loop in a way very much like a dict, but it is not in anyway a dict!
foo = {
'a': 1,
'b': 2,
'c': 3,
}
print foo
>>> {'a': 1, 'c': 3, 'b': 2}
print foo.items()
>>> [('a', 1), ('c', 3), ('b', 2)]
print sorted(foo.items())
>>> [('a', 1), ('b', 2), ('c', 3)]
The following feels like a dict in a loop, but it's not, it's a list of tuples being unpacked into k,v:
for k,v in sorted(foo.items()):
print k, v
Roughly equivalent to:
for k in sorted(foo.keys()):
print k, foo[k]
Greg's answer is right. Note that in Python 3.0 you'll have to do
sorted(dict.items())
as iteritems
will be gone.
You can now use OrderedDict
in Python 2.7 as well:
>>> from collections import OrderedDict
>>> d = OrderedDict([('first', 1),
... ('second', 2),
... ('third', 3)])
>>> d.items()
[('first', 1), ('second', 2), ('third', 3)]
Here you have the what's new page for 2.7 version and the OrderedDict API.
Greg's answer is right. Note that in Python 3.0 you'll have to do
sorted(dict.items())
as iteritems
will be gone.
sorted returns a list, hence your error when you try to iterate over it, but because you can't order a dict you will have to deal with a list.
I have no idea what the larger context of your code is, but you could try adding an iterator to the resulting list. like this maybe?:
return iter(sorted(dict.iteritems()))
of course you will be getting back tuples now because sorted turned your dict into a list of tuples
ex:
say your dict was:
{'a':1,'c':3,'b':2}
sorted turns it into a list:
[('a',1),('b',2),('c',3)]
so when you actually iterate over the list you get back (in this example) a tuple composed of a string and an integer, but at least you will be able to iterate over it.
A dict's keys are stored in a hashtable so that is their 'natural order', i.e. psuedo-random. Any other ordering is a concept of the consumer of the dict.
sorted() always returns a list, not a dict. If you pass it a dict.items() (which produces a list of tuples), it will return a list of tuples [(k1,v1), (k2,v2), ...] which can be used in a loop in a way very much like a dict, but it is not in anyway a dict!
foo = {
'a': 1,
'b': 2,
'c': 3,
}
print foo
>>> {'a': 1, 'c': 3, 'b': 2}
print foo.items()
>>> [('a', 1), ('c', 3), ('b', 2)]
print sorted(foo.items())
>>> [('a', 1), ('b', 2), ('c', 3)]
The following feels like a dict in a loop, but it's not, it's a list of tuples being unpacked into k,v:
for k,v in sorted(foo.items()):
print k, v
Roughly equivalent to:
for k in sorted(foo.keys()):
print k, foo[k]
sorted returns a list, hence your error when you try to iterate over it, but because you can't order a dict you will have to deal with a list.
I have no idea what the larger context of your code is, but you could try adding an iterator to the resulting list. like this maybe?:
return iter(sorted(dict.iteritems()))
of course you will be getting back tuples now because sorted turned your dict into a list of tuples
ex:
say your dict was:
{'a':1,'c':3,'b':2}
sorted turns it into a list:
[('a',1),('b',2),('c',3)]
so when you actually iterate over the list you get back (in this example) a tuple composed of a string and an integer, but at least you will be able to iterate over it.
Assuming you are using CPython 2.x and have a large dictionary mydict, then using sorted(mydict) is going to be slow because sorted builds a sorted list of the keys of mydict.
In that case you might want to look at my ordereddict package which includes a C implementation of sorteddict
in C. Especially if you have to go over the sorted list of keys multiple times at different stages (ie. number of elements) of the dictionaries lifetime.
Use the sorted()
function:
return sorted(dict.iteritems())
If you want an actual iterator over the sorted results, since sorted()
returns a list, use:
return iter(sorted(dict.iteritems()))
>>> import heapq
>>> d = {"c": 2, "b": 9, "a": 4, "d": 8}
>>> def iter_sorted(d):
keys = list(d)
heapq.heapify(keys) # Transforms to heap in O(N) time
while keys:
k = heapq.heappop(keys) # takes O(log n) time
yield (k, d[k])
>>> i = iter_sorted(d)
>>> for x in i:
print x
('a', 4)
('b', 9)
('c', 2)
('d', 8)
This method still has an O(N log N) sort, however, after a short linear heapify, it yields the items in sorted order as it goes, making it theoretically more efficient when you do not always need the whole list.
A dict's keys are stored in a hashtable so that is their 'natural order', i.e. psuedo-random. Any other ordering is a concept of the consumer of the dict.
sorted() always returns a list, not a dict. If you pass it a dict.items() (which produces a list of tuples), it will return a list of tuples [(k1,v1), (k2,v2), ...] which can be used in a loop in a way very much like a dict, but it is not in anyway a dict!
foo = {
'a': 1,
'b': 2,
'c': 3,
}
print foo
>>> {'a': 1, 'c': 3, 'b': 2}
print foo.items()
>>> [('a', 1), ('c', 3), ('b', 2)]
print sorted(foo.items())
>>> [('a', 1), ('b', 2), ('c', 3)]
The following feels like a dict in a loop, but it's not, it's a list of tuples being unpacked into k,v:
for k,v in sorted(foo.items()):
print k, v
Roughly equivalent to:
for k in sorted(foo.keys()):
print k, foo[k]
Assuming you are using CPython 2.x and have a large dictionary mydict, then using sorted(mydict) is going to be slow because sorted builds a sorted list of the keys of mydict.
In that case you might want to look at my ordereddict package which includes a C implementation of sorteddict
in C. Especially if you have to go over the sorted list of keys multiple times at different stages (ie. number of elements) of the dictionaries lifetime.
Greg's answer is right. Note that in Python 3.0 you'll have to do
sorted(dict.items())
as iteritems
will be gone.
A dict's keys are stored in a hashtable so that is their 'natural order', i.e. psuedo-random. Any other ordering is a concept of the consumer of the dict.
sorted() always returns a list, not a dict. If you pass it a dict.items() (which produces a list of tuples), it will return a list of tuples [(k1,v1), (k2,v2), ...] which can be used in a loop in a way very much like a dict, but it is not in anyway a dict!
foo = {
'a': 1,
'b': 2,
'c': 3,
}
print foo
>>> {'a': 1, 'c': 3, 'b': 2}
print foo.items()
>>> [('a', 1), ('c', 3), ('b', 2)]
print sorted(foo.items())
>>> [('a', 1), ('b', 2), ('c', 3)]
The following feels like a dict in a loop, but it's not, it's a list of tuples being unpacked into k,v:
for k,v in sorted(foo.items()):
print k, v
Roughly equivalent to:
for k in sorted(foo.keys()):
print k, foo[k]
sorted returns a list, hence your error when you try to iterate over it, but because you can't order a dict you will have to deal with a list.
I have no idea what the larger context of your code is, but you could try adding an iterator to the resulting list. like this maybe?:
return iter(sorted(dict.iteritems()))
of course you will be getting back tuples now because sorted turned your dict into a list of tuples
ex:
say your dict was:
{'a':1,'c':3,'b':2}
sorted turns it into a list:
[('a',1),('b',2),('c',3)]
so when you actually iterate over the list you get back (in this example) a tuple composed of a string and an integer, but at least you will be able to iterate over it.
In general, one may sort a dict like so:
for k in sorted(d):
print k, d[k]
For the specific case in the question, having a "drop in replacement" for d.iteritems(), add a function like:
def sortdict(d, **opts):
# **opts so any currently supported sorted() options can be passed
for k in sorted(d, **opts):
yield k, d[k]
and so the ending line changes from
return dict.iteritems()
to
return sortdict(dict)
or
return sortdict(dict, reverse = True)
Greg's answer is right. Note that in Python 3.0 you'll have to do
sorted(dict.items())
as iteritems
will be gone.
Source: Stackoverflow.com