Perl 5.10 was released late last year, and with it come a number of significant improvements to the language. We'll be running a series of Perl tips covering some of the changes, and how you can use them to make your life easier.
Perl 5.10 finally has print with a newline! It's called say, and
can be enabled with:
use feature 'say';
at the top of any program or module that needs it. You can then simply write:
say "Hello World!"; # No \n needed!
rather than:
print "Hello World!\n";
While we'll be discussing new functions and constructs in a later
Perl-tip, the say function is so handy we wanted to mention it
before anything else.
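As a minimal sketch of how say behaves with a filehandle and a list of values, the following writes to an in-memory filehandle so the result can be inspected. The say_lines helper is a hypothetical name used only for illustration.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: write each item with say() to an in-memory
# filehandle and return everything that was written.
sub say_lines {
    my @items = @_;
    my $buffer = '';
    open my $fh, '>', \$buffer
        or die "Cannot open in-memory filehandle: $!";
    say {$fh} $_ for @items;    # say appends "\n" after each item
    close $fh;
    return $buffer;
}

print say_lines('alpha', 'beta');    # prints "alpha" and "beta" on separate lines
```

Like print, say accepts an optional filehandle, so it can usually replace print wherever the trailing newline was being added by hand.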
One of the largest improvements in Perl 5.10 has been in the area of regular expressions (regexes). To get started, it's now possible to debug your regexes with:
use re 'debug';
$some_string =~ /some_regexp/;
use re 'debug' also existed in Perl 5.8, but its behaviour there
was global, resulting in debugging information for all your regexes.
In 5.10 the pragma has lexical scope, meaning it lasts only until the
end of the enclosing block, file, or eval.
{
use re 'debug';
$some_string =~ /some_regexp/; # This gets debugged.
}
$some_string =~ /some_other_regexp/; # This isn't debugged.
You can also write no re 'debug' to turn off regex debugging, without
having to play around with blocks:
use re 'debug';
$some_string =~ /some_regexp/; # This gets debugged.
no re 'debug';
$some_string =~ /some_other_regexp/; # This isn't debugged.
We've always been able to capture information from regular expressions
using parentheses, and to recall it using the match variables
$1, $2, $3, and so on. However, it can sometimes be rather challenging
to tell which match variable you want.
This can be doubly challenging when we interpolate smaller regexps into bigger ones. For example, what match variable will the last sequence of digits be placed into in the following expression?
/ (\d+) $customer_name_regexp (\d+) /x;
Keep in mind that $customer_name_regexp may or may not contain
parentheses itself.
In Perl 5.10 we can now have named captures. This means we can write:
/ (?<account>\d+) $customer_name_regexp (?<credit>\d+) /x;
Using (?<name>...) syntax allows us to capture a match and then
later refer to it by name. We can also refer to it by its regular
match number, so our account match above can still be referred to
as $1.
In order to retrieve named match information, we can use the special
hash %+:
say "Customer account number is $+{account}";
say "Customer credit balance is $+{credit}";
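To make this concrete, here is a small runnable sketch. The contents of $customer_name_regexp and the parse_record helper are assumptions made up for illustration; the point is that the interpolated pattern contains its own parentheses, so the credit ends up in $3 rather than $2, while the named lookups are unaffected.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical sub-pattern that contains a capturing group of its own.
my $customer_name_regexp = qr/ (\w+) \s+ \w+ /x;

# Hypothetical helper: pull the account and credit out of one record.
sub parse_record {
    my ($line) = @_;
    if ($line =~ / (?<account>\d+) \s+ $customer_name_regexp \s+ (?<credit>\d+) /x) {
        # Copy the values straight away; %+ and $3 are reset by the
        # next successful match.
        return {
            account      => $+{account},
            credit       => $+{credit},
            third_group  => $3,    # the credit, because the name pattern used $2
        };
    }
    return;    # no match
}

my $record = parse_record('1234 Jane Citizen 500');
say "Customer account number is $record->{account}";
say "Customer credit balance is $record->{credit}";
```

If the parentheses were later removed from $customer_name_regexp, code using $3 would break, but the lookups through %+ would keep working.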
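Named captures can also be used in substitutions. Inside the pattern,
the new \k<name> sequence is a backreference to a named capture; in the
replacement, the captured text is read from the %+ hash. For example, we
can swap the first and last words on a line (assuming the line starts
and ends with a word character) using:
s{
^
(?<first> \w+)
(?<middle> .* ) \b
(?<last> \w+)
$
}
{$+{last}$+{middle}$+{first}}x;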
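The sketch below shows both uses of a named capture side by side: reading captures from %+ in the replacement half of a substitution, and using \k<name> as a backreference inside a pattern. The function names swap_first_last and first_doubled_word are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;
use feature 'say';

# Swap the first and last words of a line. The replacement is an
# ordinary interpolated string, so captures are read from %+.
sub swap_first_last {
    my ($line) = @_;
    $line =~ s{
        ^
        (?<first>  \w+ )
        (?<middle> .*  ) \b
        (?<last>   \w+ )
        $
    }
    {$+{last}$+{middle}$+{first}}x;
    return $line;
}

# Find the first word that is immediately repeated. Here \k<word> is a
# backreference: it must match the same text that (?<word>...) captured.
sub first_doubled_word {
    my ($text) = @_;
    return $text =~ / \b (?<word>\w+) \s+ \k<word> \b /x ? $+{word} : undef;
}

say swap_first_last('the quick brown fox');          # fox quick brown the
say first_doubled_word('Paris in the the spring');   # the
```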
The special regex variables $`, $& and $' hold everything before,
inside, and after a match respectively. However, they came at a great
cost: mentioning any one of them anywhere in your program would turn
them on for all your regular expressions, even those that didn't need
them. As such, the use of these variables is strongly discouraged in
all but the simplest of programs.
However they can be very useful. There are some algorithms that really appreciate knowing everything that was before or after a given match.
In Perl 5.10 there's a new regexp modifier, /p, that gives us all
the convenience of the old $`, $& and $' variables, but without
the global performance penalty. Here's how it works:
/(foo|bar|baz)/p;
say "Everything before the match: ${^PREMATCH}";
say "Everything inside the match: ${^MATCH}";
say "Everything after the match: ${^POSTMATCH}";
The ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} variables are
only guaranteed to be set when the /p switch is used.
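As a self-contained sketch of the /p modifier in use, the following splits a string into the text before, inside, and after the first keyword it finds. The split_around_keyword helper and the sample input are assumptions for illustration only.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: return the text before, inside, and after the
# first occurrence of foo, bar, or baz, or an empty list if none occurs.
sub split_around_keyword {
    my ($text) = @_;
    if ($text =~ /(foo|bar|baz)/p) {
        # Return copies immediately; the next successful match
        # would overwrite these variables.
        return (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
    }
    return;
}

my ($before, $match, $after) = split_around_keyword('one foo two');
say "Everything before the match: '$before'";   # 'one '
say "Everything inside the match: '$match'";    # 'foo'
say "Everything after the match: '$after'";     # ' two'
```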
This tip covers only some of the improvements made to the regexp engine in Perl 5.10. Many advanced features have been added, and many optimisations and improvements have been made under the hood.
For further information, we recommend the following resources:
Perl 5.10 Advanced Regular Expressions by Yves Orton http://www.regex-engineer.org/slides/perl510_regex.html
Perldelta - What is new for Perl 5.10.0 http://search.cpan.org/~rgarcia/perl-5.10.0/pod/perl5100delta.pod
perlre from the Perl 5.10 distribution http://search.cpan.org/~rgarcia/perl-5.10.0/pod/perlre.pod
This Perl tip and associated text is copyright Perl Training Australia. You may freely distribute this text so long as it is distributed in full with this copyright notice attached.
Copyright 2001-2012 Perl Training Australia. Contact us at contact@perltraining.com.au