Let me prefix this by saying that I know what foreach is, does and how to use it. This question concerns how it works under the bonnet, and I don't want any answers along the lines of "this is how you loop an array with foreach".
For a long time I assumed that foreach worked with the array itself. Then I found many references to the fact that it works with a copy of the array, and I have since assumed this to be the end of the story. But I recently got into a discussion on the matter, and after a little experimentation found that this was not in fact 100% true.
Let me show what I mean. For the following test cases, we will be working with the following array:
$array = array(1, 2, 3, 4, 5);
foreach ($array as $item) {
echo "$item\n";
$array[] = $item;
}
print_r($array);
/* Output in loop: 1 2 3 4 5
$array after loop: 1 2 3 4 5 1 2 3 4 5 */
This clearly shows that we are not working directly with the source array - otherwise the loop would continue forever, since we are constantly pushing items onto the array during the loop. But just to be sure this is the case:
foreach ($array as $key => $item) {
$array[$key + 1] = $item + 2;
echo "$item\n";
}
print_r($array);
/* Output in loop: 1 2 3 4 5
$array after loop: 1 3 4 5 6 7 */
This backs up our initial conclusion: we are working with a copy of the source array during the loop, otherwise we would see the modified values during the loop. But...
If we look in the manual, we find this statement:
When foreach first starts executing, the internal array pointer is automatically reset to the first element of the array.
Right... this seems to suggest that foreach relies on the array pointer of the source array. But we've just proved that we're not working with the source array, right? Well, not entirely.
// Move the array pointer on one to make sure it doesn't affect the loop
var_dump(each($array));
foreach ($array as $item) {
echo "$item\n";
}
var_dump(each($array));
/* Output
array(4) {
[1]=>
int(1)
["value"]=>
int(1)
[0]=>
int(0)
["key"]=>
int(0)
}
1
2
3
4
5
bool(false)
*/
So, despite the fact that we are not working directly with the source array, we are working directly with the source array pointer - the fact that the pointer is at the end of the array at the end of the loop shows this. Except this can't be true - if it was, then test case 1 would loop forever.
The PHP manual also states:
As foreach relies on the internal array pointer changing it within the loop may lead to unexpected behavior.
Well, let's find out what that "unexpected behavior" is (technically, any behavior is unexpected since I no longer know what to expect).
foreach ($array as $key => $item) {
echo "$item\n";
each($array);
}
/* Output: 1 2 3 4 5 */
foreach ($array as $key => $item) {
echo "$item\n";
reset($array);
}
/* Output: 1 2 3 4 5 */
...nothing that unexpected there, in fact it seems to support the "copy of source" theory.
The Question
What is going on here? My C-fu is not good enough for me to be able to extract a proper conclusion simply by looking at the PHP source code, so I would appreciate it if someone could translate it into English for me.
It seems to me that foreach works with a copy of the array, but sets the array pointer of the source array to the end of the array after the loop.
Is there any situation where using functions that adjust the array pointer (each(), reset() et al.) during a foreach could affect the outcome of the loop?
The PHP foreach loop can be used with indexed arrays, associative arrays and public object variables.
In a foreach loop, the first thing PHP does is create a copy of the array to be iterated over. PHP then iterates over this new copy of the array rather than the original one. This is demonstrated in the example below:
<?php
$numbers = [1,2,3,4,5,6,7,8,9]; # initial values for our array
echo '<pre>', print_r($numbers, true), '</pre>', '<hr />';
foreach($numbers as $index => $number){
$numbers[$index] = $number + 1; # this is making changes to the original array
echo 'Inside of the array = ', $index, ': ', $number, '<br />'; # showing data from the copied array
}
echo '<hr />', '<pre>', print_r($numbers, true), '</pre>'; # shows the original array, which now holds the incremented values
Besides this, PHP also allows the iterated values to be used as references to the original array values. This is demonstrated below:
<?php
$numbers = [1,2,3,4,5,6,7,8,9];
echo '<pre>', print_r($numbers, true), '</pre>';
foreach($numbers as $index => &$number){
++$number; # we are incrementing the original value
echo 'Inside of the array = ', $index, ': ', $number, '<br />'; # this is showing the original value
}
echo '<hr />';
echo '<pre>', print_r($numbers, true), '</pre>'; # we are again showing the original value
Note: It does not allow the original array indexes to be used as references.
Source: http://dwellupper.io/post/47/understanding-php-foreach-loop-with-examples
NOTE FOR PHP 7
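To update this answer, as it has gained some popularity: this answer no longer applies as of PHP 7. As explained in the "Backward incompatible changes", in PHP 7 a by-value foreach works on a copy of the array, so changes made to the array itself are not reflected in the loop. A by-reference foreach, as used in the test cases below, instead tracks changes to the array more reliably, so elements appended during iteration are visited as well. More details at the link.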
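As a hedged illustration of the by-value behaviour mentioned in this note, here is a minimal sketch assuming PHP 7.0 or later. The names $seen and $pointerAfter are made up for the example. It records what the loop visits while the array is being appended to, then checks the array's internal pointer:

```php
<?php
// Assumes PHP 7.0+; $seen and $pointerAfter are illustrative names.
$array = [1, 2, 3];
$seen = [];

foreach ($array as $item) {
    $seen[] = $item;        // record what the loop actually visits
    $array[] = $item * 10;  // append to $array while iterating
}

// In PHP 7+, a by-value foreach does not move the internal array pointer,
// so current() still reports the first element.
$pointerAfter = current($array);

echo implode(' ', $seen), "\n";   // elements visited by the loop
echo implode(' ', $array), "\n";  // array contents after the loop
var_dump($pointerAfter);
```

On PHP 7 and later this should print 1 2 3, then 1 2 3 10 20 30, then int(1): the appended values land in $array but are never visited.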
Explanation (quote from php.net):
The first form loops over the array given by array_expression. On each iteration, the value of the current element is assigned to $value and the internal array pointer is advanced by one (so on the next iteration, you'll be looking at the next element).
So, in your first example you only have one element in the array, and when the pointer is moved the next element does not exist, so after you add a new element foreach ends because it has already "decided" that it is at the last element.
In your second example, you start with two elements, and the foreach loop is not at the last element, so it evaluates the array on the next iteration and thus realises that there is a new element in the array.
I believe that this is all a consequence of the "On each iteration" part of the explanation in the documentation, which probably means that foreach does all its logic before it calls the code in {}.
Test case
If you run this:
<?php
$array = Array(
'foo' => 1,
'bar' => 2
);
foreach($array as $k=>&$v) {
$array['baz']=3;
echo $v." ";
}
print_r($array);
?>
You will get this output:
1 2 3 Array
(
[foo] => 1
[bar] => 2
[baz] => 3
)
Which means that it accepted the modification and went through it because it was modified "in time". But if you do this:
<?php
$array = Array(
'foo' => 1,
'bar' => 2
);
foreach($array as $k=>&$v) {
if ($k=='bar') {
$array['baz']=3;
}
echo $v." ";
}
print_r($array);
?>
You will get:
1 2 Array
(
[foo] => 1
[bar] => 2
[baz] => 3
)
Which means that the array was modified, but since we modified it when the foreach was already at the last element of the array, it "decided" not to loop anymore, and even though we added a new element, we added it "too late" and it was not looped through.
Detailed explanation can be read at How does PHP 'foreach' actually work? which explains the internals behind this behaviour.
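For comparison, here is a minimal sketch of the second test case as it behaves on PHP 7 or later. It assumes the by-reference iteration change described in the PHP 7 migration notes, and $visited is a name introduced only for this example. There, the element added while the loop sits on the last element is still visited:

```php
<?php
// Assumes PHP 7.0+; $visited is an illustrative name.
$array = ['foo' => 1, 'bar' => 2];
$visited = [];

foreach ($array as $k => &$v) {
    if ($k === 'bar') {
        $array['baz'] = 3;  // appended while the loop is at the last element
    }
    $visited[] = $v;        // copy of the current value
}
unset($v); // break the reference left behind by the by-reference loop

echo implode(' ', $visited), "\n";
```

On PHP 7 and later this should print 1 2 3, whereas the PHP 5 output shown above stops at 1 2.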
Great question, because many developers, even experienced ones, are confused by the way PHP handles arrays in foreach loops. In the standard foreach loop, PHP makes a copy of the array that is used in the loop. The copy is discarded immediately after the loop finishes. This is transparent in the operation of a simple foreach loop. For example:
$set = array("apple", "banana", "coconut");
foreach ( $set AS $item ) {
echo "{$item}\n";
}
This outputs:
apple
banana
coconut
So the copy is created but the developer doesn't notice, because the original array isn’t referenced within the loop or after the loop finishes. However, when you attempt to modify the items in a loop, you find that they are unmodified when you finish:
$set = array("apple", "banana", "coconut");
foreach ( $set AS $item ) {
$item = strrev ($item);
}
print_r($set);
This outputs:
Array
(
[0] => apple
[1] => banana
[2] => coconut
)
No changes from the original can be noticed; in fact there are no changes from the original, even though you clearly assigned a value to $item. This is because you are operating on $item as it appears in the copy of $set being worked on. You can override this by grabbing $item by reference, like so:
$set = array("apple", "banana", "coconut");
foreach ( $set AS &$item ) {
$item = strrev($item);
}
print_r($set);
This outputs:
Array
(
[0] => elppa
[1] => ananab
[2] => tunococ
)
So it is evident and observable that, when $item is operated on by reference, the changes made to $item are made to the members of the original $set. Using $item by reference also prevents PHP from creating the array copy. To test this, first we’ll show a quick script demonstrating the copy:
$set = array("apple", "banana", "coconut");
foreach ( $set AS $item ) {
$set[] = ucfirst($item);
}
print_r($set);
This outputs:
Array
(
[0] => apple
[1] => banana
[2] => coconut
[3] => Apple
[4] => Banana
[5] => Coconut
)
As it is shown in the example, PHP copied $set and used it to loop over, but when $set was used inside the loop, PHP added the variables to the original array, not the copied array. Basically, PHP is only using the copied array for the execution of the loop and the assignment of $item. Because of this, the loop above only executes 3 times, and each time it appends another value to the end of the original $set, leaving the original $set with 6 elements, but never entering an infinite loop.
However, what if we had used $item by reference, as I mentioned before? A single character added to the above test:
$set = array("apple", "banana", "coconut");
foreach ( $set AS &$item ) {
$set[] = ucfirst($item);
}
print_r($set);
Results in an infinite loop. Note that this actually is an infinite loop: you’ll have to either kill the script yourself or wait for your OS to run out of memory. I added the following line to my script so PHP would run out of memory very quickly; I suggest you do the same if you’re going to be running these infinite loop tests:
ini_set("memory_limit","1M");
So in this previous example with the infinite loop, we see the reason why PHP was written to create a copy of the array to loop over. When a copy is created and used only by the structure of the loop construct itself, the array stays static throughout the execution of the loop, so you’ll never run into issues.
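To observe the by-reference behaviour without exhausting memory, the append can be capped. The following is only a sketch: the cap of six elements and the $iterations counter are choices made for illustration, not part of the original test.

```php
<?php
$set = ["apple", "banana", "coconut"];
$iterations = 0;

foreach ($set as &$item) {
    $iterations++;
    // Stop appending once the array holds six elements,
    // so the loop terminates instead of growing forever.
    if (count($set) < 6) {
        $set[] = ucfirst($item);
    }
}
unset($item); // drop the dangling reference to the last element

echo $iterations, "\n";
print_r($set);
```

The loop body should run six times rather than three, because the by-reference loop walks the live array and therefore reaches the three elements appended during the first three passes.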
In example 3 you don't modify the array. In all other examples you modify either the contents or the internal array pointer. This is important when it comes to PHP arrays because of the semantics of the assignment operator.
The assignment operator for the arrays in PHP works more like a lazy clone. Assigning one variable to another that contains an array will clone the array, unlike most languages. However, the actual cloning will not be done unless it is needed. This means that the clone will take place only when either of the variables is modified (copy-on-write).
Here is an example:
$a = array(1,2,3);
$b = $a; // This is lazy cloning of $a. For the time
// being $a and $b point to the same internal
// data structure.
$a[] = 3; // Here $a changes, which triggers the actual
// cloning. From now on, $a and $b are two
// different data structures. The same would
// happen if there were a change in $b.
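The laziness of the clone can be made roughly visible by watching memory usage. This is a sketch under assumptions: $big, $lazy and the element count are arbitrary illustrative choices, and the exact byte counts vary by PHP version and platform, so only the relative sizes matter.

```php
<?php
// $big, $lazy and the element count are illustrative choices.
$big = range(1, 100000);

$before = memory_get_usage();
$lazy = $big;                       // no copy yet: both names share one array
$afterAssign = memory_get_usage();
$lazy[] = 0;                        // first write forces the real copy
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $before;
$writeCost = $afterWrite - $afterAssign;

echo "cost of assignment: $assignCost bytes\n";
echo "cost of first write: $writeCost bytes\n";
echo count($big), ' ', count($lazy), "\n"; // $big is unaffected by the write
```

The assignment itself should cost next to nothing, while the first write to $lazy should account for roughly the size of the whole array, which is the copy-on-write step described above.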
Coming back to your test cases, you can easily imagine that foreach creates some kind of iterator with a reference to the array. This reference works exactly like the variable $b in my example. However, the iterator along with the reference live only during the loop and then they are both discarded. Now you can see that, in all cases but 3, the array is modified during the loop, while this extra reference is alive. This triggers a clone, and that explains what's going on here!
Here is an excellent article for another side effect of this copy-on-write behaviour: The PHP Ternary Operator: Fast or not?
As per the documentation provided by the PHP manual:
On each iteration, the value of the current element is assigned to $v and the internal array pointer is advanced by one (so on the next iteration, you'll be looking at the next element).
So as per your first example:
$array = ['foo'=>1];
foreach($array as $k=>&$v)
{
$array['bar']=2;
echo($v);
}
$array has only a single element, so as per the foreach execution, 1 is assigned to $v and there is no other element to move the pointer to.
But in your second example:
$array = ['foo'=>1, 'bar'=>2];
foreach($array as $k=>&$v)
{
$array['baz']=3;
echo($v);
}
$array has two elements, so foreach evaluates the first element and moves the pointer by one. During the first iteration of the loop, $array['baz']=3; is added to the original array, because the loop iterates by reference.
Some points to note when working with foreach():
a) foreach works on a prospective copy of the original array. This means foreach() will share data storage with the original array unless and until a prospective copy is created (see the foreach Notes/User comments).
b) What triggers a prospective copy? A prospective copy is created based on the policy of copy-on-write, that is, whenever an array passed to foreach() is changed, a clone of the original array is created.
c) The original array and the foreach() iterator will have DISTINCT SENTINEL VARIABLES, that is, one for the original array and another for foreach; see the test code below. See also: SPL, Iterators, and Array Iterator.
Stack Overflow question How to make sure the value is reset in a 'foreach' loop in PHP? addresses the cases (3,4,5) of your question.
The following example shows that each() and reset() DO NOT affect the SENTINEL variables (for example, the current index variable) of the foreach() iterator.
$array = array(1, 2, 3, 4, 5);
list($key2, $val2) = each($array);
echo "each() Original (outside): $key2 => $val2<br/>";
foreach($array as $key => $val){
echo "foreach: $key => $val<br/>";
list($key2,$val2) = each($array);
echo "each() Original(inside): $key2 => $val2<br/>";
echo "--------Iteration--------<br/>";
if ($key == 3){
echo "Resetting original array pointer<br/>";
reset($array);
}
}
list($key2, $val2) = each($array);
echo "each() Original (outside): $key2 => $val2<br/>";
Output:
each() Original (outside): 0 => 1
foreach: 0 => 1
each() Original(inside): 1 => 2
--------Iteration--------
foreach: 1 => 2
each() Original(inside): 2 => 3
--------Iteration--------
foreach: 2 => 3
each() Original(inside): 3 => 4
--------Iteration--------
foreach: 3 => 4
each() Original(inside): 4 => 5
--------Iteration--------
Resetting original array pointer
foreach: 4 => 5
each() Original(inside): 0 => 1
--------Iteration--------
each() Original (outside): 1 => 2
Source: Stackoverflow.com