4

When I make a regex variable with capturing groups, the whole match is OK, but capturing groups are Nil.

my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;

$str ~~ / $rgx / ;
say ~$/;  # 12abc34
say $0;   # Nil
say $1;   # Nil

If I modify the program to avoid $rgx, everything works as expected:

my $str = 'nn12abc34efg';

my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;

$str ~~ / ($atom) \w+ ($atom) /;
say ~$/;  # 12abc34
say $0;   # 「12」
say $1;   # 「34」
  • Interesting question. I am not sure why this happens, but you could make $rgx a named regex using e.g. my regex rgx { ($atom) \w+ ($atom) }. Then after $str ~~ / <rgx> / we would have that $<rgx>[0] represents the first capture group (for example). Commented Oct 13, 2017 at 9:16
  • See also "How can I interpolate a variable into a Perl 6 regex?" Commented Oct 13, 2017 at 9:24
  • Thanks!! Didn't know about named regexes. Commented Oct 13, 2017 at 9:31

2 Answers

5

With your code, the compiler gives the following warning:

Regex object coerced to string (please use .gist or .perl to do that)

That tells us something is wrong: regexes shouldn't be treated as strings. There are two more proper ways to nest regexes. First, you can include sub-regexes within assertions (<>):

my $str = 'nn12abc34efg';
my Regex $atom = / \d ** 2 /;
my Regex $rgx = / (<$atom>) \w+ (<$atom>) /;
$str ~~ $rgx;

Note that I'm not matching / $rgx /. That is putting one regex inside another. Just match $rgx.
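
For completeness, here is a minimal self-contained sketch of that first variant with the captures printed. It reuses the names from the question; the if wrapper is only an addition for illustration, and the commented output is what I would expect from a reasonably recent Rakudo.

```raku
my $str = 'nn12abc34efg';
my Regex $atom = / \d ** 2 /;
my Regex $rgx  = / (<$atom>) \w+ (<$atom>) /;

# Match against $rgx itself, not against / $rgx /,
# so the groups land directly in $/.
if $str ~~ $rgx {
    say ~$/;   # 12abc34
    say ~$0;   # 12
    say ~$1;   # 34
}
```

The parentheses do the capturing here; <$atom> on its own only calls the sub-regex without capturing it.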

The nicer way is to use named regexes. Defining atom and the regex as follows will let you access the match groups as $<atom>[0] and $<atom>[1]:

my regex atom { \d ** 2 };
my $rgx = / <atom> \w+ <atom> /;
$str ~~ $rgx;
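
Put together as a runnable sketch (same $str as in the question; the if wrapper and the say lines are additions for illustration), the named-regex variant could look like this:

```raku
my $str = 'nn12abc34efg';
my regex atom { \d ** 2 }
my $rgx = / <atom> \w+ <atom> /;

if $str ~~ $rgx {
    say ~$/;           # 12abc34
    # <atom> appears twice, so $<atom> holds a list of two Match objects
    say ~$<atom>[0];   # 12
    say ~$<atom>[1];   # 34
}
```

Because both captures share the name atom, they are collected under one key instead of being numbered as $0 and $1.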

6 Comments

Thank you for the perfect answer! My understanding of p6 regex syntax, especially the use of <> is rather vague.
@evb Glad I helped. I actually don't know why the original code didn't work. I speculate it's because of how the three regular expressions are composed, and I wonder whether the match group is being set then unset as a nested regex is matched. Perhaps it is a bug in rakudo, since nesting doesn't unset matches in the other two variations. But the fact that the compiler warned us lets it off the hook in my book.
I've tried your 2nd solution with (<$atom>) and it still doesn't work — both $0 and $1 are Nil.
@evb Ahh, I left out one important detail—when you write / $rgx /, you're putting the regex inside a regex. Don't do that. Match as: $str ~~ $rgx.
And I fixed the capitalization error, and removed the first paragraph (which was plain wrong). Very sorry about the mistakes.
4

The key observation is that $str ~~ / $rgx /; is a "regex inside of a regex". $rgx matched as it should and set $0 and $1 within its own Match object, but then there was nowhere within the surrounding match object to store that information, so you couldn't see it. Maybe it's clearer with an example. Try this:

my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;

$str ~~ / $0=$rgx /;
say $/;

Note the contents of $0. Or as another example, let's give it a proper name:

my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;

$str ~~ / $<bits-n-pieces>=$rgx /;
say $/;

1 Comment

Thank you! Yes, I noted $0 in the 1st variant. So, the problem was the absence of a suitable match object...
