I've got a list of Python objects that I'd like to sort by an attribute of the objects themselves. The list looks like:
>>> ut
[<Tag: 128>, <Tag: 2008>, <Tag: <>, <Tag: actionscript>, <Tag: addresses>,
<Tag: aes>, <Tag: ajax> ...]
Each object has a count:
>>> ut[1].count
1L
I need to sort the list by count, in descending order.
I've seen several methods for this, but I'm looking for best practice in Python.
This looks very much like a list of Django ORM model instances.
Why not sort them in the query itself, like this:
ut = Tag.objects.order_by('-count')
Add rich comparison operators to the object class, then use the list's sort() method.
See rich comparison in Python.
Update: Although this method would work, I think the solution from Triptych is better suited to your case because it is much simpler.
Object-oriented approach
It's good practice to make object sorting logic, if applicable, a property of the class rather than repeating it at each place where the ordering is required.
This ensures consistency and removes the need for boilerplate code.
At a minimum, you should specify __eq__ and __lt__ operations for this to work. Then just use sorted(list_of_objects).
class Card(object):
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def __eq__(self, other):
        return self.rank == other.rank and self.suit == other.suit

    def __lt__(self, other):
        return self.rank < other.rank

hand = [Card(10, 'H'), Card(2, 'h'), Card(12, 'h'), Card(13, 'h'), Card(14, 'h')]
hand_order = [c.rank for c in hand]  # [10, 2, 12, 13, 14]
hand_sorted = sorted(hand)
hand_sorted_order = [c.rank for c in hand_sorted]  # [2, 10, 12, 13, 14]
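If the remaining comparison operators (<=, >, >=) are needed as well, functools.total_ordering from the standard library can derive them from __eq__ and __lt__. Here is a minimal sketch for Python 3 that reuses the Card idea; the hand contents are made up for illustration.

```python
from functools import total_ordering


@total_ordering
class Card:
    """Playing card ordered by rank only (suit is ignored when ordering)."""

    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def __eq__(self, other):
        return self.rank == other.rank and self.suit == other.suit

    def __lt__(self, other):
        return self.rank < other.rank


hand = [Card(10, 'h'), Card(2, 'h'), Card(12, 'h')]
ranks_ascending = [c.rank for c in sorted(hand)]
ranks_descending = [c.rank for c in sorted(hand, reverse=True)]

# >= is not defined on the class; total_ordering supplies it from __lt__.
low_beats_high = Card(2, 'h') >= Card(10, 'h')
```

sorted() itself only relies on __lt__, so the decorator matters mainly when other code compares the objects directly.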
Readers should note that the key= approach:
ut.sort(key=lambda x: x.count, reverse=True)
is many times faster than adding rich comparison operators to the objects. I was surprised to read this (page 485 of "Python in a Nutshell"). You can confirm this by running tests on this little program:
#!/usr/bin/env python
# Python 2 only: __cmp__, cmp() and xrange were removed in Python 3.
import random

class C:
    def __init__(self, count):
        self.count = count

    def __cmp__(self, other):
        return cmp(self.count, other.count)

longList = [C(random.random()) for i in xrange(1000000)]  # about 6.1 secs
longList2 = longList[:]
longList.sort()  # about 52 - 6.1 = 46 secs
longList2.sort(key=lambda c: c.count)  # about 9 - 6.1 = 3 secs
My very minimal tests show the first sort is more than 10 times slower, but the book says it is only about 5 times slower in general. The reason the book gives is the highly optimized sort algorithm used in Python (timsort).
Still, it's very odd that .sort(lambda) is faster than plain old .sort(). I hope they fix that.
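The test program above is Python 2 only. A rough Python 3 sketch of the same comparison follows, assuming __lt__ stands in for __cmp__ (which Python 3 removed). The class names and list size are illustrative. Timings vary by machine and interpreter, so none are quoted here.

```python
import random
import timeit


class Plain:
    """Holds a count; has no comparison methods, so sorting needs key=."""

    def __init__(self, count):
        self.count = count


class Comparable(Plain):
    """Same data, but sortable directly through __lt__."""

    def __lt__(self, other):
        return self.count < other.count


values = [random.random() for _ in range(20000)]
comparable = [Comparable(v) for v in values]
plain = [Plain(v) for v in values]

# sorted() copies the list, so every timing run starts from the same unsorted input.
t_rich = timeit.timeit(lambda: sorted(comparable), number=5)
t_key = timeit.timeit(lambda: sorted(plain, key=lambda c: c.count), number=5)
print("rich comparison: %.3fs, key function: %.3fs" % (t_rich, t_key))

# Both approaches should produce the same ordering.
by_rich = [c.count for c in sorted(comparable)]
by_key = [c.count for c in sorted(plain, key=lambda c: c.count)]
```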
from operator import attrgetter
ut.sort(key=attrgetter('count'), reverse=True)
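Here is a self-contained sketch of the same call for Python 3. The Tag class is a hypothetical stand-in for the asker's objects, and the names and counts are made up.

```python
from operator import attrgetter


class Tag:
    """Stand-in for the asker's Tag objects; only name and count are modelled."""

    def __init__(self, name, count):
        self.name = name
        self.count = count

    def __repr__(self):
        return "<Tag: %s>" % self.name


ut = [Tag("128", 1), Tag("ajax", 7), Tag("aes", 3)]

# Sort in place, highest count first.
ut.sort(key=attrgetter("count"), reverse=True)
names = [t.name for t in ut]

# sorted() returns a new list instead and leaves ut unchanged.
ascending = sorted(ut, key=attrgetter("count"))
```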
If the attribute you want to sort by is a property, then you can avoid importing operator.attrgetter and use the property's fget method instead.
For example, for a class Circle with a property radius, we could sort a list of circles by radius as follows:
result = sorted(circles, key=Circle.radius.fget)
This is not the most well-known feature but often saves me a line with the import.
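To make that concrete, here is a small Python 3 sketch. The Circle class is hypothetical and written only to show where fget comes from.

```python
class Circle:
    """Hypothetical class whose radius is exposed as a read-only property."""

    def __init__(self, radius):
        self._radius = radius

    @property
    def radius(self):
        return self._radius


circles = [Circle(3.0), Circle(1.0), Circle(2.0)]

# Circle.radius, looked up on the class, is the property object itself;
# its fget attribute is the plain getter function, which takes an instance.
result = sorted(circles, key=Circle.radius.fget)
radii = [c.radius for c in result]
```

This only works when the attribute really is a property. For a plain instance attribute, Circle.radius would not exist on the class, and attrgetter or a lambda is needed instead.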
One approach that can be fastest, especially if your list has a lot of records, is to use operator.attrgetter("count"). However, this code might run on a pre-operator version of Python, so it would be nice to have a fallback mechanism. You might want to do the following:
try:
    import operator
except ImportError:
    keyfun = lambda x: x.count  # use a lambda if no operator module
else:
    keyfun = operator.attrgetter("count")  # use operator since it's faster than lambda
ut.sort(key=keyfun, reverse=True)  # sort in-place
Source: Stackoverflow.com