Ordered hashes

[ Perl tips index ]
[ Subscribe to Perl tips ]

At Perl Training Australia we're always happy to answer simple Perl questions. One question we're frequently asked involves hashes, and goes something like this:

I've created a hash to map the months of the year to the number of days they each have. I know that I'm putting the values in to the hash in the correct order but when I iterate over the keys, they come out differently. How can I get them to come out in month order?

Perl hashes have no internal order. Under the hood, a hash consists of a number of ``buckets''. When a record is inserted into the hash, the key is transformed, using a ``hash function'', into a bucket number. The details of the hash function itself are not important, but a good hash function ensures that even very similar keys are mapped to different buckets. If too many keys map to the same bucket, then Perl has to spend a lot of time searching through that bucket for the corresponding value.

The important thing is not how Perl puts the key into the hash table but what the table gives us. To find if a value is in a hash, Perl takes the key name supplied, applies the same transformation and checks to see whether the key is in that position. This is much faster than searching through an ordered list.

However, while we often appreciate the speed in which hashes work, sometimes we want to maintain key order as well. A common solution is to create both our hash and an array. The array stores the keys in sorted order and we can then iterate through our has as required:

        my @months = qw(January February ... );
        
        my %days_in = ( January => 31, February => 28, ... );
        foreach my $month ( @months ) {
                print "$month has $days_in{$month} days\n";
        }

If @months and %days_in are created in one place in our program and then never changed (such as could be expected for a list of months of the year) then this is a very good solution. However if our hash is likely to grow over time then we can encounter problems. What happens if a key is added to one structure but not the other?

Fortunately there are better solutions.

Tie::InsertOrderHash

This module is available from the Comprehensive Perl Archive Network (CPAN) http://search.cpan.org.

If you wish to preserve the insert-order of your elements in your hash then Tie::InsertOrderHash may be the tool for you. It's usage is very simple:

        use Tie::InsertOrderHash;
        tie my %days_in => 'Tie::InsertOrderHash',
                January   => 31,
                February  => 28,
                March     => 31,
                April     => 30,
                May       => 31,
                June      => 30,
                July      => 31,
                August    => 31,
                September => 30,
                October   => 31,
                November  => 30,
                December  => 31
        ;

        print join(" ", keys %days_in), "\n";

        # prints: January February March April May June July August
        # September October November December

In the above example, we're setting the hash at creation time. This is not required. We can add our values to our hash whenever we desire, just like a normal hash, further, our values can be any kind of value we can normally use in a hash.

        tie my %friends => 'Tie::InsertOrderHash',
                Paul => 'likes scuba diving';

        $friends{Daniel} => 'travels around the world';

        tie my %restaurant_rating => 'Tie::InsertOrderHash';

        $restaurant_rating{'Enlightened Cuisine'} = 9;

Note that these hashes only preserve insert-order. Thus if we later change February in our our days_in hash to have 29 days because of a leap year, February will be moved to the end of the list.

Tie::IxHash;

This module is available from the Comprehensive Perl Archive Network (CPAN).

Tie::InsertOrderHash does not allow you to change values in your hash and keep the original ordering. Thus, as mentioned above, updating the days in February will cause February to now appear after December when listing the keys.

If you want to be able to change values without changing the ordering then Tie::IxHash may be a better option.

        use Tie::IxHash;

        tie my %days_in => 'Tie::IxHash',
                January   => 31,
                February  => 28,
                March     => 31,
                April     => 30,
                May       => 31,
                June      => 30,
                July      => 31,
                August    => 31,
                September => 30,
                October   => 31,
                November  => 30,
                December  => 31
         ;

        # And later
        $days_in{February} = 29;

        print join(" ", keys %days_in), "\n";

        # prints: January February March April May June July August
        # September October November December

There is also Tie::Hash::Indexed which provides less features than Tie::IxHash, but is considerably faster.

To get a similar behaviour to Tie::InsertOrderHash you can delete a key-value pair from your hash before providing the new value. Thus if we were to write:

        delete($days_in{June});
        $days_in{June} = 3;

then June would be considered to have been inserted last.

[ Perl tips index ]
[ Subscribe to Perl tips ]


This Perl tip and associated text is copyright Perl Training Australia. You may freely distribute this text so long as it is distributed in full with this Copyright noticed attached.

If you have any questions please don't hesitate to contact us:

Email: contact@perltraining.com.au
Phone: 03 9354 6001 (Australia)
International: +61 3 9354 6001

Valid XHTML 1.0 Valid CSS