[perl] What's the safest way to iterate through the keys of a Perl hash?

If I have a Perl hash with a bunch of (key, value) pairs, what is the preferred method of iterating through all the keys? I have heard that using each may in some way have unintended side effects. So, is that true, and is one of the two following methods best, or is there a better way?

# Method 1
while (my ($key, $value) = each(%hash)) {
    # Something
}

# Method 2
foreach my $key (keys(%hash)) {
    # Something
}

This question is related to perl hash iteration each

The answer is


Using the each syntax will prevent the entire set of keys from being generated at once. This can be important if you're using a tie-ed hash to a database with millions of rows. You don't want to generate the entire list of keys all at once and exhaust your physical memory. In this case each serves as an iterator whereas keys actually generates the entire array before the loop starts.

So, the only place "each" is of real use is when the hash is very large (compared to the memory available). That is only likely to happen when the hash itself doesn't live in memory itself unless you're programming a handheld data collection device or something with small memory.

If memory is not an issue, usually the map or keys paradigm is the more prevelant and easier to read paradigm.


I always use method 2 as well. The only benefit of using each is if you're just reading (rather than re-assigning) the value of the hash entry, you're not constantly de-referencing the hash.


I usually use keys and I can't think of the last time I used or read a use of each.

Don't forget about map, depending on what you're doing in the loop!

map { print "$_ => $hash{$_}\n" } keys %hash;

I may get bitten by this one but I think that it's personal preference. I can't find any reference in the docs to each() being different than keys() or values() (other than the obvious "they return different things" answer. In fact the docs state the use the same iterator and they all return actual list values instead of copies of them, and that modifying the hash while iterating over it using any call is bad.

All that said, I almost always use keys() because to me it is usually more self documenting to access the key's value via the hash itself. I occasionally use values() when the value is a reference to a large structure and the key to the hash was already stored in the structure, at which point the key is redundant and I don't need it. I think I've used each() 2 times in 10 years of Perl programming and it was probably the wrong choice both times =)


The place where each can cause you problems is that it's a true, non-scoped iterator. By way of example:

while ( my ($key,$val) = each %a_hash ) {
    print "$key => $val\n";
    last if $val; #exits loop when $val is true
}

# but "each" hasn't reset!!
while ( my ($key,$val) = each %a_hash ) {
    # continues where the last loop left off
    print "$key => $val\n";
}

If you need to be sure that each gets all the keys and values, you need to make sure you use keys or values first (as that resets the iterator). See the documentation for each.


I woudl say:

  1. Use whatever's easiest to read/understand for most people (so keys, usually, I'd argue)
  2. Use whatever you decide consistently throught the whole code base.

This give 2 major advantages:

  1. It's easier to spot "common" code so you can re-factor into functions/methiods.
  2. It's easier for future developers to maintain.

I don't think it's more expensive to use keys over each, so no need for two different constructs for the same thing in your code.


One thing you should be aware of when using each is that it has the side effect of adding "state" to your hash (the hash has to remember what the "next" key is). When using code like the snippets posted above, which iterate over the whole hash in one go, this is usually not a problem. However, you will run into hard to track down problems (I speak from experience ;), when using each together with statements like last or return to exit from the while ... each loop before you have processed all keys.

In this case, the hash will remember which keys it has already returned, and when you use each on it the next time (maybe in a totaly unrelated piece of code), it will continue at this position.

Example:

my %hash = ( foo => 1, bar => 2, baz => 3, quux => 4 );

# find key 'baz'
while ( my ($k, $v) = each %hash ) {
    print "found key $k\n";
    last if $k eq 'baz'; # found it!
}

# later ...

print "the hash contains:\n";

# iterate over all keys:
while ( my ($k, $v) = each %hash ) {
    print "$k => $v\n";
}

This prints:

found key bar
found key baz
the hash contains:
quux => 4
foo => 1

What happened to keys "bar" and baz"? They're still there, but the second each starts where the first one left off, and stops when it reaches the end of the hash, so we never see them in the second loop.


A few miscellaneous thoughts on this topic:

  1. There is nothing unsafe about any of the hash iterators themselves. What is unsafe is modifying the keys of a hash while you're iterating over it. (It's perfectly safe to modify the values.) The only potential side-effect I can think of is that values returns aliases which means that modifying them will modify the contents of the hash. This is by design but may not be what you want in some circumstances.
  2. John's accepted answer is good with one exception: the documentation is clear that it is not safe to add keys while iterating over a hash. It may work for some data sets but will fail for others depending on the hash order.
  3. As already noted, it is safe to delete the last key returned by each. This is not true for keys as each is an iterator while keys returns a list.

Examples related to perl

The program can't start because api-ms-win-crt-runtime-l1-1-0.dll is missing while starting Apache server on my computer "End of script output before headers" error in Apache Perl - Multiple condition if statement without duplicating code? How to decrypt hash stored by bcrypt Split a string into array in Perl Turning multiple lines into one comma separated line String compare in Perl with "eq" vs "==" how to remove the first two columns in a file using shell (awk, sed, whatever) Find everything between two XML tags with RegEx Difference between \w and \b regular expression meta characters

Examples related to hash

php mysqli_connect: authentication method unknown to the client [caching_sha2_password] What is Hash and Range Primary Key? How to create a laravel hashed password Hashing a file in Python PHP salt and hash SHA256 for login password Append key/value pair to hash with << in Ruby Are there any SHA-256 javascript implementations that are generally considered trustworthy? How do I generate a SALT in Java for Salted-Hash? What does hash do in python? Hashing with SHA1 Algorithm in C#

Examples related to iteration

Is there a way in Pandas to use previous row value in dataframe.apply when previous value is also calculated in the apply? How to loop over grouped Pandas dataframe? How to iterate through a list of dictionaries in Jinja template? How to iterate through an ArrayList of Objects of ArrayList of Objects? Ways to iterate over a list in Java Python list iterator behavior and next(iterator) How to loop through an array containing objects and access their properties recursion versus iteration What is the perfect counterpart in Python for "while not EOF" How to iterate over a JavaScript object?

Examples related to each

jQuery animated number counter from zero to value How to iterate over array of objects in Handlebars? Excel VBA For Each Worksheet Loop jQuery looping .each() JSON key/value not working jQuery '.each' and attaching '.click' event How to use continue in jQuery each() loop? jQuery .each() with input elements C++ for each, pulling from vector elements jQuery each loop in table row Get href attribute on jQuery