[php] What does yield mean in PHP?

I've recently stumbled over this code:

function xrange($min, $max) 
{
    for ($i = $min; $i <= $max; $i++) {
        yield $i;
    }
}

I've never seen this yield keyword before. Trying to run the code I get

Parse error: syntax error, unexpected T_VARIABLE on line x

So what is this yield keyword? Is it even valid PHP? And if it is, how do I use it?

This question is related to php generator php-5.5 yield-keyword

The answer is


An interesting aspect worth discussing here is yielding by reference. Whenever we need to change a parameter so that the change is visible outside the function, we have to pass it by reference. To apply this to generators, we simply prepend an ampersand & to the name of the generator and to the variable used in the iteration:

<?php
/**
 * Yields by reference.
 * @param int $from
 */
function &counter($from) {
    while ($from > 0) {
        yield $from;
    }
}

foreach (counter(100) as &$value) {
    $value--;
    echo $value . '...';
}

// Output: 99...98...97...96...95... (and so on, down to 0...)

The above example shows how changing the iterated values within the foreach loop changes the $from variable within the generator. This is because $from is yielded by reference due to the ampersand before the generator name. Because of that, the $value variable within the foreach loop is a reference to the $from variable within the generator function.


simple example

<?php
echo '#start main# ';
function a(){
    echo '{start[';
    for($i=1; $i<=9; $i++)
        yield $i;
    echo ']end} ';
}
foreach(a() as $v)
    echo $v.',';
echo '#end main#';
?>

output

#start main# {start[1,2,3,4,5,6,7,8,9,]end} #end main#

advanced example

<?php
echo '#start main# ';
function a(){
    echo '{start[';
    for($i=1; $i<=9; $i++)
        yield $i;
    echo ']end} ';
}
foreach(a() as $k => $v){
    if($k === 5)
        break;
    echo $k.'=>'.$v.',';
}
echo '#end main#';
?>

output

#start main# {start[0=>1,1=>2,2=>3,3=>4,4=>5,#end main#

The loop breaks when $k is 5, so the generator is never resumed after that point and the trailing echo ']end} ' inside a() never runs.

This function uses yield:

function a($items) {
    foreach ($items as $item) {
        yield $item + 1;
    }
}

It is almost the same as this one, which does not:

function b($items) {
    $result = [];
    foreach ($items as $item) {
        $result[] = $item + 1;
    }
    return $result;
}

The only difference is that a() returns a generator and b() returns a plain array. You can iterate over both.

Also, the first one does not allocate a full array and is therefore less memory-demanding.
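
To make the difference concrete, here is a minimal sketch. The names addOneLazy and addOneEager are placeholders chosen for this illustration; they mirror a() and b() above. It shows that the yield version returns a Generator object rather than an array, and that iterator_to_array() can collect its values when an array is really needed.

```php
<?php
// Hypothetical names for illustration; they mirror a() and b() above.
function addOneLazy($items) {
    foreach ($items as $item) {
        yield $item + 1;        // produces one value per iteration step
    }
}

function addOneEager($items) {
    $result = [];
    foreach ($items as $item) {
        $result[] = $item + 1;  // builds the whole array up front
    }
    return $result;
}

$lazy  = addOneLazy([1, 2, 3]);
$eager = addOneEager([1, 2, 3]);

var_dump($lazy instanceof Generator);  // bool(true)
var_dump(is_array($eager));            // bool(true)

// Collect the generator's values into an array when one is really needed.
$collected = iterator_to_array(addOneLazy([1, 2, 3]));
var_dump($collected === $eager);       // bool(true)
```

Note that collecting the values this way gives up the memory advantage, so it only makes sense for small results.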


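The yield keyword is used to define generators, which were introduced in PHP 5.5 (on older versions the code from the question fails with a parse error like the one shown). So what is a generator?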

From php.net:

Generators provide an easy way to implement simple iterators without the overhead or complexity of implementing a class that implements the Iterator interface.

A generator allows you to write code that uses foreach to iterate over a set of data without needing to build an array in memory, which may cause you to exceed a memory limit, or require a considerable amount of processing time to generate. Instead, you can write a generator function, which is the same as a normal function, except that instead of returning once, a generator can yield as many times as it needs to in order to provide the values to be iterated over.
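
To make the comparison in the quoted documentation concrete, the sketch below implements the same inclusive range twice: once as a class implementing Iterator and once as a generator. The class name RangeIterator is hypothetical, xrange() is the function from the question, and the return type declarations assume PHP 8.0 or later.

```php
<?php
// Hypothetical class for illustration: an inclusive integer range as an Iterator.
// The return type declarations assume PHP 8.0+.
class RangeIterator implements Iterator
{
    private int $min;
    private int $max;
    private int $current;

    public function __construct(int $min, int $max)
    {
        $this->min = $min;
        $this->max = $max;
        $this->current = $min;
    }

    public function current(): mixed { return $this->current; }
    public function key(): mixed     { return $this->current - $this->min; }
    public function next(): void     { $this->current++; }
    public function rewind(): void   { $this->current = $this->min; }
    public function valid(): bool    { return $this->current <= $this->max; }
}

// The generator from the question does the same job in a few lines.
function xrange($min, $max)
{
    for ($i = $min; $i <= $max; $i++) {
        yield $i;
    }
}

$fromClass     = iterator_to_array(new RangeIterator(1, 5));
$fromGenerator = iterator_to_array(xrange(1, 5));

var_dump($fromClass === $fromGenerator);  // bool(true)
```

Both versions can be used directly in foreach; the generator simply leaves the bookkeeping (current position, validity check) to PHP.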

From here on, "generator" means a generator function and "function" means an ordinary function.

So, they are useful when:

  • you need to keep things simple (or do simple things);

    a generator is much simpler than implementing the Iterator interface. On the other hand, of course, generators offer less functionality. Compare the two.

  • you need to generate BIG amounts of data while saving memory;

    actually, to save memory we could just generate the needed data with a function call on every loop iteration and let the garbage be collected after each iteration. So the main points here are clear code and probably performance. See what is better for your needs.

  • you need to generate a sequence that depends on intermediate values;

    this extends the previous thought. Generators can make things easier in comparison with functions. Check the Fibonacci example and try to produce the sequence without a generator. Generators can also work faster in this case, at least because intermediate values are stored in local variables;

  • you need to improve performance.

    they can work faster than functions in some cases (see the previous benefit);
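
The Fibonacci case mentioned in the list can be sketched as follows. The function name fibonacci and its $count parameter are assumptions made for this example; the point is that the two intermediate values live in local variables between yields.

```php
<?php
// Hypothetical helper: yields the first $count Fibonacci numbers.
function fibonacci($count)
{
    $previous = 0;
    $current  = 1;
    for ($i = 0; $i < $count; $i++) {
        yield $previous;
        // Both intermediate values persist across yields as local variables.
        $next     = $previous + $current;
        $previous = $current;
        $current  = $next;
    }
}

foreach (fibonacci(10) as $number) {
    echo $number, ' ';
}
// prints: 0 1 1 2 3 5 8 13 21 34
```

Without a generator, the caller would have to keep $previous and $current itself, or the function would have to build and return the whole sequence as an array.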


None of the answers above show a concrete example using massive arrays populated by non-numeric members. Here is an example using an array generated by explode() on a large .txt file (262MB in my use case):

<?php

ini_set('memory_limit','1000M');

echo "Starting memory usage: " . memory_get_usage() . "<br>";

$path = './file.txt';
$content = file_get_contents($path);

foreach(explode("\n", $content) as $ex) {
    $ex = trim($ex);
}

echo "Final memory usage: " . memory_get_usage();

The output was:

Starting memory usage: 415160
Final memory usage: 270948256

Now compare that to a similar script, using the yield keyword:

<?php

ini_set('memory_limit','1000M');

echo "Starting memory usage: " . memory_get_usage() . "<br>";

function x() {
    $path = './file.txt';
    $content = file_get_contents($path);
    foreach(explode("\n", $content) as $x) {
        yield $x;
    }
}

foreach(x() as $ex) {
    $ex = trim($ex);
}

echo "Final memory usage: " . memory_get_usage();

The output for this script was:

Starting memory usage: 415152
Final memory usage: 415616

The difference in reported memory usage is considerable: an increase of ~270.5 MB in the first example versus ~450 B in the second.
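
One caveat about the comparison above: x() still loads the entire file with file_get_contents() and explode(), and the final memory_get_usage() call runs after the loop, when the generator and its local variables have already been released. To keep peak memory low while iterating, the generator has to read the file lazily. A minimal sketch, assuming a helper named readLines (not part of the original answer) built on fopen() and fgets(), and using a small temporary file as stand-in data:

```php
<?php
// Hypothetical helper: yields one line at a time instead of loading the whole file.
function readLines($path)
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            yield rtrim($line, "\r\n");
        }
    } finally {
        fclose($handle);  // also runs if the consumer stops iterating early
    }
}

// Demo on a small temporary file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'yield');
file_put_contents($path, "first\nsecond\nthird\n");

foreach (readLines($path) as $line) {
    echo $line, PHP_EOL;
}

// Peak usage is the more telling number when comparing approaches.
echo "Peak memory usage: ", memory_get_peak_usage(), PHP_EOL;

unlink($path);
```

With this approach only one line is held in memory at a time, so memory_get_peak_usage() should stay roughly constant regardless of file size.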


With yield you can easily describe the breakpoints between multiple tasks in a single function. That's all, there is nothing special about it.

$closure = function ($injected1, $injected2, ...){
    $returned = array();
    //task1 on $injected1
    $returned[] = $returned1;
    //a breakpoint is needed here
    //task2 on $injected2
    $returned[] = $returned2;
    //...
    return $returned;
};
$returned = $closure($injected1, $injected2, ...);

If task1 and task2 are highly related, but you need a breakpoint between them to do something else:

  • free memory between processing database rows
  • run other tasks that the next task depends on but that are not needed to understand the current code
  • make async calls and wait for the results
  • and so on ...

then generators are the best solution, because you don't have to split your code into many closures, mix it with other code, or use callbacks. You just use yield to add a breakpoint, and you can continue from that breakpoint when you are ready.

Add breakpoint without generators:

$closure1 = function ($injected1){
    //task1 on $injected1
    return $returned1;
};
$closure2 = function ($injected2){
    //task2 on $injected2
    return $returned2;
};
//...
$returned1 = $closure1($injected1);
//breakpoint between task1 and task2
$returned2 = $closure2($injected2);
//...

Add breakpoint with generators:

$closure = function (){
    $injected1 = yield;
    //task1 on $injected1
    $injected2 = (yield($returned1));
    //task2 on $injected2
    $injected3 = (yield($returned2));
    //...
    yield($returnedN);
};
$generator = $closure();
$returned1 = $generator->send($injected1);
//breakpoint between task1 and task2
$returned2 = $generator->send($injected2);
//...
$returnedN = $generator->send($injectedN);

Note: It is easy to make mistakes with generators, so always write unit tests before you implement them!

Note 2: Using generators in an infinite loop is like writing a closure of infinite length...
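
A runnable version of the send() pattern above, as a sketch: the two "tasks" (upper-casing a string and doubling a number) are made up purely for illustration.

```php
<?php
// Illustrative tasks only: task 1 upper-cases a string, task 2 doubles a number.
$closure = function () {
    $injected1 = yield;                   // pause until the first input arrives
    $returned1 = strtoupper($injected1);  // task 1
    $injected2 = (yield $returned1);      // hand back result 1, wait for input 2
    $returned2 = $injected2 * 2;          // task 2
    yield $returned2;                     // hand back result 2
};

$generator = $closure();

$returned1 = $generator->send('hello');   // runs task 1
echo $returned1, PHP_EOL;                 // HELLO

// ... breakpoint: anything can happen here before task 2 runs ...

$returned2 = $generator->send(21);        // runs task 2
echo $returned2, PHP_EOL;                 // 42
```

The first send() call starts the generator, runs it up to the bare yield, delivers 'hello' as the value of that yield expression, and then returns whatever the next yield produces.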


The code below illustrates how a generator returns results before it has finished, unlike the traditional non-generator approach, which returns a complete array only after the full iteration. With the generator below, each value is returned as soon as it is ready; there is no need to wait for an array to be completely filled:

<?php 

function sleepiterate($length) {
    for ($i=0; $i < $length; $i++) {
        sleep(2);
        yield $i;
    }
}

foreach (sleepiterate(5) as $i) {
    echo $i, PHP_EOL;
}
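
The same interleaving can be observed without waiting by logging the order of events. In this sketch the function name produce and the $log array are hypothetical, introduced only to record what happens when:

```php
<?php
$log = [];

// Hypothetical producer: records each value just before yielding it.
function produce($length, array &$log)
{
    for ($i = 0; $i < $length; $i++) {
        $log[] = "produced $i";
        yield $i;
    }
}

// The consumer records each value as soon as it receives it.
foreach (produce(3, $log) as $i) {
    $log[] = "consumed $i";
}

echo implode(PHP_EOL, $log), PHP_EOL;
// produced 0
// consumed 0
// produced 1
// consumed 1
// produced 2
// consumed 2
```

Each value is consumed before the next one is produced, which is exactly why the sleeping example prints a number every two seconds instead of everything at the end.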