[python] Syntax behind sorted(key=lambda: ...)

I think all of the answers here cover the core of what the lambda function does in the context of sorted() quite nicely, however I still feel like a description that leads to an intuitive understanding is lacking, so here is my two cents.

For the sake of completeness, I'll state the obvious up front: sorted() returns a list of sorted elements and if we want to sort in a particular way or if we want to sort a complex list of elements (e.g. nested lists or a list of tuples) we can invoke the key argument.

For me, the intuitive understanding of the key argument, why it has to be callable, and the use of lambda as the (anonymous) callable function to accomplish this comes in two parts.

  1. Using lamba ultimately means you don't have to write (define) an entire function, like the one sblom provided an example of. Lambda functions are created, used, and immediately destroyed - so they don't funk up your code with more code that will only ever be used once. This, as I understand it, is the core utility of the lambda function and its applications for such roles are broad. Its syntax is purely by convention, which is in essence the nature of programmatic syntax in general. Learn the syntax and be done with it.

Lambda syntax is as follows:

lambda input_variable(s): tasty one liner

e.g.

In [1]: f00 = lambda x: x/2

In [2]: f00(10)
Out[2]: 5.0

In [3]: (lambda x: x/2)(10)
Out[3]: 5.0

In [4]: (lambda x, y: x / y)(10, 2)
Out[4]: 5.0

In [5]: (lambda: 'amazing lambda')() # func with no args!
Out[5]: 'amazing lambda'
  1. The idea behind the key argument is that it should take in a set of instructions that will essentially point the 'sorted()' function at those list elements which should used to sort by. When it says key=, what it really means is: As I iterate through the list one element at a time (i.e. for e in list), I'm going to pass the current element to the function I provide in the key argument and use that to create a transformed list which will inform me on the order of final sorted list.

Check it out:

mylist = [3,6,3,2,4,8,23]
sorted(mylist, key=WhatToSortBy)

Base example:

sorted(mylist)

[2, 3, 3, 4, 6, 8, 23] # all numbers are in order from small to large.

Example 1:

mylist = [3,6,3,2,4,8,23]
sorted(mylist, key=lambda x: x%2==0)

[3, 3, 23, 6, 2, 4, 8] # Does this sorted result make intuitive sense to you?

Notice that my lambda function told sorted to check if (e) was even or odd before sorting.

BUT WAIT! You may (or perhaps should) be wondering two things - first, why are my odds coming before my evens (since my key value seems to be telling my sorted function to prioritize evens by using the mod operator in x%2==0). Second, why are my evens out of order? 2 comes before 6 right? By analyzing this result, we'll learn something deeper about how the sorted() 'key' argument works, especially in conjunction with the anonymous lambda function.

Firstly, you'll notice that while the odds come before the evens, the evens themselves are not sorted. Why is this?? Lets read the docs:

Key Functions Starting with Python 2.4, both list.sort() and sorted() added a key parameter to specify a function to be called on each list element prior to making comparisons.

We have to do a little bit of reading between the lines here, but what this tells us is that the sort function is only called once, and if we specify the key argument, then we sort by the value that key function points us to.

So what does the example using a modulo return? A boolean value: True == 1, False == 0. So how does sorted deal with this key? It basically transforms the original list to a sequence of 1s and 0s.

[3,6,3,2,4,8,23] becomes [0,1,0,1,1,1,0]

Now we're getting somewhere. What do you get when you sort the transformed list?

[0,0,0,1,1,1,1]

Okay, so now we know why the odds come before the evens. But the next question is: Why does the 6 still come before the 2 in my final list? Well that's easy - its because sorting only happens once! i.e. Those 1s still represent the original list values, which are in their original positions relative to each other. Since sorting only happens once, and we don't call any kind of sort function to order the original even values from low to high, those values remain in their original order relative to one another.

The final question is then this: How do I think conceptually about how the order of my boolean values get transformed back in to the original values when I print out the final sorted list?

Sorted() is a built-in method that (fun fact) uses a hybrid sorting algorithm called Timsort that combines aspects of merge sort and insertion sort. It seems clear to me that when you call it, there is a mechanic that holds these values in memory and bundles them with their boolean identity (mask) determined by (...!) the lambda function. The order is determined by their boolean identity calculated from the lambda function, but keep in mind that these sublists (of one's and zeros) are not themselves sorted by their original values. Hence, the final list, while organized by Odds and Evens, is not sorted by sublist (the evens in this case are out of order). The fact that the odds are ordered is because they were already in order by coincidence in the original list. The takeaway from all this is that when lambda does that transformation, the original order of the sublists are retained.

So how does this all relate back to the original question, and more importantly, our intuition on how we should implement sorted() with its key argument and lambda?

That lambda function can be thought of as a pointer that points to the values we need to sort by, whether its a pointer mapping a value to its boolean transformed by the lambda function, or if its a particular element in a nested list, tuple, dict, etc., again determined by the lambda function.

Lets try and predict what happens when I run the following code.

mylist = [(3, 5, 8), (6, 2, 8), ( 2, 9, 4), (6, 8, 5)]
sorted(mylist, key=lambda x: x[1])

My sorted call obviously says, "Please sort this list". The key argument makes that a little more specific by saying, for each element (x) in mylist, return index 1 of that element, then sort all of the elements of the original list 'mylist' by the sorted order of the list calculated by the lambda function. Since we have a list of tuples, we can return an indexed element from that tuple. So we get:

[(6, 2, 8), (3, 5, 8), (6, 8, 5), (2, 9, 4)]

Run that code, and you'll find that this is the order. Try indexing a list of integers and you'll find that the code breaks.

This was a long winded explanation, but I hope this helps to 'sort' your intuition on the use of lambda functions as the key argument in sorted() and beyond.

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to sorting

Sort Array of object by object field in Angular 6 Sorting a list with stream.sorted() in Java How to sort dates from Oldest to Newest in Excel? how to sort pandas dataframe from one column Reverse a comparator in Java 8 Find the unique values in a column and then sort them pandas groupby sort within groups pandas groupby sort descending order Efficiently sorting a numpy array in descending order? Swift: Sort array of objects alphabetically

Examples related to lambda

Java 8 optional: ifPresent return object orElseThrow exception How to properly apply a lambda function into a pandas data frame column What are functional interfaces used for in Java 8? Java 8 lambda get and remove element from list Variable used in lambda expression should be final or effectively final Filter values only if not null using lambda in Java8 forEach loop Java 8 for Map entry set Java 8 Lambda Stream forEach with multiple statements Java 8 stream map to list of keys sorted by values Task.Run with Parameter(s)?

Examples related to custom-function

Syntax behind sorted(key=lambda: ...) Selecting the last value of a column