[python] In practice, what are the main uses for the new "yield from" syntax in Python 3.3?

I'm having a hard time wrapping my brain around PEP 380.

  1. What are the situations where "yield from" is useful?
  2. What is the classic use case?
  3. Why is it compared to micro-threads?

[ update ]

Now I understand the cause of my difficulties. I've used generators, but never really used coroutines (introduced by PEP 342). Despite some similarities, generators and coroutines are basically two different concepts. Understanding coroutines (not only generators) is the key to understanding the new syntax.

IMHO, coroutines are the most obscure Python feature; most books make them look useless and uninteresting.

Thanks for the great answers, but special thanks to agf and his comment linking to David Beazley's presentations. David rocks.



This code defines a function fixed_sum_digits that returns a generator enumerating all six-digit strings (leading zeros included) whose digits sum to 20.

def iter_fun(sum, deepness, myString, Total):
    # Note: `sum` shadows the builtin of the same name.
    if deepness == 0:
        if sum == Total:
            yield myString
    else:
        # Prune digits that would overshoot the target sum.
        for i in range(min(10, Total - sum + 1)):
            yield from iter_fun(sum + i, deepness - 1, myString + str(i), Total)

def fixed_sum_digits(digits, Tot):
    return iter_fun(0, digits, "", Tot)
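
For example, a quick sanity check (a sketch of my own; the first string carries leading zeros because the code above allows them):

print(next(fixed_sum_digits(6, 20)))  # '000299'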

Try to write it without yield from. If you find an effective way to do it, let me know.

I think that for cases like this one (visiting trees), yield from makes the code simpler and cleaner.


What are the situations where "yield from" is useful?

Every situation where you have a loop like this:

for x in subgenerator:
  yield x

As the PEP describes, this is a rather naive attempt at using the subgenerator; it's missing several aspects, especially the proper handling of the .throw()/.send()/.close() mechanisms introduced by PEP 342. Doing this properly requires rather complicated code.
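
To see what the naive loop misses, here is a minimal sketch (the names inner/outer are mine) showing that yield from transparently forwards send() to the subgenerator, which a plain for loop does not:

def inner():
    while True:
        received = yield
        print('inner got:', received)

def outer():
    yield from inner()  # send()/throw()/close() pass through to inner

g = outer()
next(g)          # advance to the first yield
g.send('hello')  # prints: inner got: hello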

What is the classic use case?

Consider that you want to extract information from a recursive data structure. Let's say we want to get all leaf nodes in a tree:

def traverse_tree(node):
  if not node.children:
    yield node
  for child in node.children:
    yield from traverse_tree(child)
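
To make this runnable, assume a minimal Node class (a sketch of my own; the answer leaves the tree type unspecified):

class Node:
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

tree = Node('root', [Node('a', [Node('a1'), Node('a2')]), Node('b')])
print([leaf.value for leaf in traverse_tree(tree)])  # ['a1', 'a2', 'b']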

Even more important is the fact that, before yield from, there was no simple way to refactor generator code. Suppose you have a (senseless) generator like this:

def get_list_values(lst):
  for item in lst:
    yield int(item)
  for item in lst:
    yield str(item)
  for item in lst:
    yield float(item)

Now you decide to factor out these loops into separate generators. Without yield from, this is ugly, to the point where you will think twice about whether you actually want to do it. With yield from, it's actually nice to look at:

def get_list_values(lst):
  for sub in [get_list_values_as_int, 
              get_list_values_as_str, 
              get_list_values_as_float]:
    yield from sub(lst)
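
Both versions yield the same stream; for example:

print(list(get_list_values(['1', '2'])))
# [1, 2, '1', '2', 1.0, 2.0]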

Why is it compared to micro-threads?

I think what this section in the PEP is talking about is that every generator does have its own isolated execution context. Together with the fact that execution is switched between the generator-iterator and the caller using yield and __next__(), respectively, this is similar to threads, where the operating system switches the executing thread from time to time, along with the execution context (stack, registers, ...).

The effect of this is also comparable: Both the generator-iterator and the caller progress in their execution state at the same time, their executions are interleaved. For example, if the generator does some kind of computation and the caller prints out the results, you'll see the results as soon as they're available. This is a form of concurrency.
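
A tiny illustration of that interleaving (a sketch; the sleep merely simulates work):

import time

def slow_squares(n):
    for i in range(n):
        time.sleep(0.1)  # pretend this step is expensive
        yield i * i

for value in slow_squares(3):
    print('got', value)  # each result is printed as soon as it is available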

That analogy isn't anything specific to yield from, though - it's rather a general property of generators in Python.


A short example will help you understand one of yield from's use cases: getting values from another generator

def flatten(sequence):
    """flatten a multi-level list or other nested iterable
    >>> list(flatten([1, [2], 3]))
    [1, 2, 3]
    >>> list(flatten([1, [2], [3, [4]]]))
    [1, 2, 3, 4]
    """
    for element in sequence:
        # Recurse into nested iterables, but not into strings: every
        # character of a string is itself an iterable string, which
        # would recurse forever.
        if hasattr(element, '__iter__') and not isinstance(element, (str, bytes)):
            yield from flatten(element)
        else:
            yield element

print(list(flatten([1, [2], [3, [4]]])))

In asynchronous IO (asyncio) coroutines, yield from behaves much like await in a coroutine function: both are used to suspend the execution of a coroutine.

For asyncio, if there's no need to support older Python versions (i.e. you only target Python 3.5+), async def/await is the recommended syntax for defining a coroutine, so yield from is no longer needed there.
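
For illustration, here are the legacy generator-based style and the modern style side by side (a sketch; note that @asyncio.coroutine was deprecated in Python 3.8 and removed in 3.11):

import asyncio

@asyncio.coroutine   # legacy, pre-3.5 style
def old_style():
    yield from asyncio.sleep(1)

async def new_style():  # recommended since Python 3.5
    await asyncio.sleep(1)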

But outside of asyncio, yield from <sub-generator> still has other uses for iterating over a sub-generator, as mentioned in the earlier answers.


Simply put, yield from provides tail recursion for iterator functions.
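
For example, a recursive generator can delegate its "tail" to a recursive call (a minimal sketch of my own illustrating that claim):

def countdown(n):
    if n >= 0:
        yield n
        yield from countdown(n - 1)  # delegate the rest of the iteration

print(list(countdown(3)))  # [3, 2, 1, 0]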


yield from basically chains iterators in an efficient way:

# chain from itertools:
def chain(*iters):
    for it in iters:
        for item in it:
            yield item

# with the new keyword
def chain(*iters):
    for it in iters:
        yield from it

As you can see, it removes one pure-Python loop. That's pretty much all it does, but chaining iterators is a pretty common pattern in Python.
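
Both definitions behave identically, e.g.:

print(list(chain([1, 2], (3, 4), 'ab')))  # [1, 2, 3, 4, 'a', 'b']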

Threads are basically a feature that allows you to jump out of functions at completely random points and jump back into the state of another function. The thread supervisor does this very often, so the program appears to run all these functions at the same time. The problem is that the points are random, so you need to use locking to prevent the supervisor from stopping a function at a problematic point.

Generators are pretty similar to threads in this sense: They allow you to specify specific points (whenever they yield) where you can jump in and out. When used this way, generators are called coroutines.
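
Here is a toy round-robin "scheduler" (a sketch of my own) that treats generators as cooperative micro-threads, resuming each one at its yield points:

from collections import deque

def worker(name, steps):
    for i in range(steps):
        print(name, 'step', i)
        yield  # cooperative switch point

tasks = deque([worker('A', 2), worker('B', 3)])
while tasks:
    task = tasks.popleft()
    try:
        next(task)         # resume until the next yield
        tasks.append(task)
    except StopIteration:  # this generator has finished
        pass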

Read these excellent tutorials about coroutines in Python for more details.


Wherever you invoke a generator from within a generator, you need a "pump" to re-yield the values: for v in inner_generator: yield v. As the PEP points out, there are subtle complexities to this which most people ignore. Non-local flow control like throw() is one example given in the PEP. The new syntax yield from inner_generator is used wherever you would have written the explicit for loop before. It's not merely syntactic sugar, though: it handles all of the corner cases that are ignored by the for loop. Being "sugary" encourages people to use it and thus get the right behaviors.

This message in the discussion thread talks about these complexities:

With the additional generator features introduced by PEP 342, that is no longer the case: as described in Greg's PEP, simple iteration doesn't support send() and throw() correctly. The gymnastics needed to support send() and throw() actually aren't that complex when you break them down, but they aren't trivial either.

I can't speak to a comparison with micro-threads, other than to observe that generators are a type of parallelism. You can consider the suspended generator to be a thread which sends values via yield to a consumer thread. The actual implementation may be nothing like this (and the actual implementation is obviously of great interest to the Python developers), but this does not concern the users.

The new yield from syntax does not add any additional capability to the language in terms of threading; it just makes it easier to use existing features correctly. Or, more precisely, it makes it easier for a novice consumer of a complex inner generator written by an expert to pass through that generator without breaking any of its complex features.