Defining Named Patterns

Perl 5.10 introduces the ability to define named subpatterns in a dedicated section of the pattern and to call them by name from anywhere else in it.

What perlretut says about defining named patterns

Quoting the section 'Defining named patterns' of perlretut for Perl 5.10:

Some regular expressions use identical subpatterns in several places. Starting with Perl 5.10, it is possible to define named subpatterns in a section of the pattern so that they can be called up by name anywhere in the pattern. The syntactic pattern for this definition group is "(?(DEFINE)(?<name>pattern)...)". An insertion of a named pattern is written as (?&name).

Let us look at an example that defines the language of floating-point numbers:

pl@nereida:~/Lperltesting$ cat definingnamedpatterns.pl
#!/usr/local/lib/perl/5.10.1/bin//perl5.10.1 -w
use v5.10;

my $regexp = qr{
   ^ (?<num>
             (?&osg)[\t\ ]* (?: (?&int)(?&dec)? | (?&dec) )
     )
     (?: [eE]
     (?<exp> (?&osg)(?&int)) )?
   $
      (?(DEFINE)
       (?<osg>[-+]?)         # optional sign
       (?<int>\d++)          # integer
       (?<dec>\.(?&int))     # decimal fraction
      )
}x;

my $input = <>;
chomp($input);
my @r;
if (@r = $input =~ $regexp) {
  my $exp = $+{exp} // '';   # // keeps an exponent of "0", which || would discard
  say "$input matches: (num => '$+{num}', exp => '$exp')";
}
else {
  say "does not match";
}
perlretut comments on this example:

The example above illustrates this feature. The three subpatterns that are used more than once are the optional sign, the digit sequence for an integer and the decimal fraction. The DEFINE group at the end of the pattern contains their definition. Notice that the decimal fraction pattern is the first place where we can reuse the integer pattern.
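
To see how the pattern behaves on more than one input, the following sketch wraps the same regular expression in a small helper and runs it over a few sample strings. The helper name parse_float and the list of inputs are assumptions made for illustration; they are not part of the original script.

```perl
use strict;
use warnings;
use v5.10;

# Same pattern as in definingnamedpatterns.pl.
my $regexp = qr{
   ^ (?<num>
        (?&osg)[\t\ ]* (?: (?&int)(?&dec)? | (?&dec) )
     )
     (?: [eE]
       (?<exp> (?&osg)(?&int)) )?
   $
   (?(DEFINE)
     (?<osg>[-+]?)         # optional sign
     (?<int>\d++)          # integer
     (?<dec>\.(?&int))     # decimal fraction
   )
}x;

# Hypothetical helper (not in the original script):
# returns [num, exp] on a match and nothing otherwise.
sub parse_float {
    my ($input) = @_;
    return unless $input =~ $regexp;
    return [ $+{num}, $+{exp} // '' ];
}

# Sample inputs chosen for illustration only.
for my $input ('-3.14e+2', '.5', '42', '1e0', '1.', 'abc') {
    my $r = parse_float($input);
    say $r
        ? "$input matches: (num => '$r->[0]', exp => '$r->[1]')"
        : "$input does not match";
}
```

The input '1.' is rejected because the dec subpattern requires at least one digit after the point, and the possessive \d++ in int leaves no room for backtracking.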

What perlre says about defining patterns

Curiously, (DEFINE) is treated as a special case of the conditional regular expressions of the form (?(condition)yes-pattern) (see section 1.2.10). This is what the section 'Extended Patterns' of perlre says about it:

A special form is the (DEFINE) predicate, which never executes directly its yes-pattern, and does not allow a no-pattern. This allows to define subpatterns which will be executed only by using the recursion mechanism. This way, you can define a set of regular expression rules that can be bundled into any pattern you choose.

It is recommended that for this usage you put the DEFINE block at the end of the pattern, and that you name any subpatterns defined within it.

Also, it's worth noting that patterns defined this way probably will not be as efficient, as the optimiser is not very clever about handling them.

An example of how this might be used is as follows:

  /(?<NAME>(?&NAME_PAT))(?<ADDR>(?&ADDRESS_PAT))
     (?(DEFINE)
       (?<NAME_PAT>....)
       (?<ADDRESS_PAT>....)
     )/x

Note that capture buffers matched inside of recursion are not accessible after the recursion returns, so the extra layer of capturing buffers is necessary. Thus $+{NAME_PAT} would not be defined even though $+{NAME} would be.
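
The behaviour described in the quoted passage can be tried out with a small sketch. The bodies given here to NAME_PAT and ADDRESS_PAT are assumptions made up for illustration (perlre leaves them elided), as are the sample strings and the variable names $rules, %seen and $bob_ok. The sketch keeps the DEFINE block in its own qr// object and interpolates it into two different patterns, and then checks which named captures are visible after the match.

```perl
use strict;
use warnings;
use v5.10;

# Hypothetical rule set: these subpattern bodies are illustrative
# stand-ins, not the ones intended by perlre.
my $rules = qr{
    (?(DEFINE)
        (?<NAME_PAT>    [A-Z][a-z]+ )
        (?<ADDRESS_PAT> \d+ \s [A-Z][a-z]+ \s St )
    )
}x;

my $text = 'Alice 42 Main St';
my %seen;

# The outer groups NAME and ADDR are the extra layer of capturing
# that perlre says is necessary.
if ($text =~ /^(?<NAME>(?&NAME_PAT)) \s (?<ADDR>(?&ADDRESS_PAT)) \z $rules/x) {
    %seen = (
        NAME     => $+{NAME},
        ADDR     => $+{ADDR},
        NAME_PAT => $+{NAME_PAT},   # set only inside the recursion
    );
    say "NAME => '$seen{NAME}', ADDR => '$seen{ADDR}'";
    say 'NAME_PAT is ', (defined $seen{NAME_PAT} ? 'defined' : 'undefined');
}

# The same rules bundled into a second, unrelated pattern.
my $bob_ok = 'Bob' =~ /^(?&NAME_PAT) \z $rules/x ? 1 : 0;
say "Bob is a name: $bob_ok";
```

Keeping the DEFINE block in a separate qr// object is only one possible arrangement; writing it inline at the end of each pattern, as perlre recommends, works equally well.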

What perlvar says about named patterns

This is what perlvar says about the variables involved, %+ and %-. Regarding the hash %+:

Casiano Rodríguez León
2009-12-09