I have a Perl script in which I perform web service calls in a loop. The server returns a multivalued HTTP header that I need to parse after each call with information that I will need to make the next call (if it doesn't return the header, I want to exit the loop).

I only care about one of the values in the header, and I need to get the information out of it with a regular expression. Let's say the header is like this, and I only care about the "foo" value:

X-Header: test-abc12345; blah=foo
X-Header: test-fgasjhgakg; blah=bar

I can get the header values like this: @values = $response->header( 'X-Header' ); But how do I quickly

  1. check whether there is a foo value, and
  2. parse and save the foo value for the next iteration?

Ideally, I'd like to do something like this:

my $value = 'default';

do {
  # (do HTTP request; use $value)
  @values = $response->header( 'X-Header' );
} while( $value = first { /(?:test-)([^;]+)(?:; blah=foo)/ } @values );

But grep, first (from List::Util), etc. all return the entire matching element and not just the single capturing group I want. I want to avoid cluttering up my code by looping over the array and matching/parsing inside the loop body.

Is what I want possible? What would be the most compact way to write it? So far, all I can come up with is using lookarounds and \K to discard the stuff I don't care about, but this isn't super readable and makes the regex engine perform a lot of unnecessary steps.

3 Answers

So it seems that you want to catch the first element with a certain pattern, but acquire only the pattern. And you want it done nicely. Indeed, first and grep only pass the element itself.

However, List::MoreUtils::first_result does support processing of its match:

use List::MoreUtils 0.406 qw(first_result);

my @w = qw(a bit c dIT);  # get first "it" case-insensitive

my $res = first_result { ( /(it)/i )[0] } @w;

say $res // 'undef';  #--> it

That ( ... )[0] is needed to put the regex in list context so that it returns the actual capture rather than just a true/false result. Another way would be firstres { my ($r) = /(it)/i; $r }. Take your pick.
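
To make the context point concrete, here is a minimal sketch using only core Perl. The variable names are made up for illustration; it simply compares what the same match yields in scalar context, in a list slice, and in a list assignment.

```perl
use strict;
use warnings;

my $str = 'dIT';

my $in_scalar = $str =~ /(it)/i;          # scalar context: only a true/false result
my $in_slice  = ( $str =~ /(it)/i )[0];   # list slice: the first capture
my ($in_list) = $str =~ /(it)/i;          # list assignment: the first capture as well
my $no_match  = ( 'abc' =~ /(it)/i )[0];  # failed match gives an empty list, so undef

printf "%s | %s | %s | %s\n",
    $in_scalar, $in_slice, $in_list, $no_match // 'undef';
```

Only the slice and the list assignment hand back the captured text, which is why the block passed to first_result needs one of those two forms.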


For the data in the question

use warnings;
use strict;
use feature 'say';

use List::MoreUtils 0.406 qw(firstres);

my @data = ( 
    'X-Header: test-abc12345; blah=foo',
    'X-Header: test-fgasjhgakg; blah=bar'
);

if (my $r = firstres { ( /test-([^;]+);\s+blah=foo/ )[0] } @data) {
    say $r
}

Prints abc12345, clarified in a comment to be the sought result.


Module versions prior to 0.406 (of 2015-03-03) didn't have firstres (alias first_result).


4 Comments

Thanks! first_result sounds like what I was thinking of, although I think map is probably the more elegant solution here (can't believe I didn't think of that earlier...) first_result will definitely be something I'll use in future scripts though. I appreciate all your help!
@flashbang Yeah, that's what your question asks for really -- stop at first match but extract a pattern from that element. Processing input is natural with map, unlike what grep and first do, and that was the first thing that came to mind. But I left it out altogether because you specifically wanted to avoid "extras" -- code, processing steps, etc. And map must, always, pass over the whole list. Efficiency-wise that makes practically no difference with lists of a few items but firstres just fills the bill more fully. Also, it isn't mentioned often while it can be very useful.
(Also, efficiency-wise, firstres must do extra bookkeeping/testing to be able to bail out at first match, so for very short lists it may not be faster at all.)
Appearance-wise, the code for firstres and map looks exactly the same -- you must explicitly return from the block what you want the operation to return. The only difference is that firstres bails out on first match, so there's nothing to do after the fact and it is in fact cleaner looking. I'd guess that map seems more elegant due to habit and common use -- it is a well known and loved builtin meant precisely for processing elements of a list.
2

first { ... } @values returns one of the values (or undef).

You could use either of these:

my ($value) = map { /...(...).../ } @values;

my $value = ( map { /...(...).../ } @values ) ? $1 : undef;

my $value = ( map { /...(...).../ } @values )[0];

Using first, it would look like the following, which is rather silly:

my $value = first { 1 } map { /...(...).../ } @values;

However, assuming the capture can't be an empty string or the string 0, List::MoreUtils's first_result could be used to avoid the unnecessary matches:

my $value = first_result { /...(...).../ ? $1 : undef } @values;

my $value = first_result { ( /...(...).../ )[0] } @values;

If the returned value can be false (e.g. an empty string or a 0), you can return a reference to it, which is always true, and dereference afterwards:

my $value = first_result { /...(...).../ ? \$1 : undef } @values;
$value = $$value if $value;

The first_result approach isn't necessarily faster in practice.
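
As a rough sketch of how the map variant could slot into the loop from the question, the following uses only core Perl. The @canned array is a hypothetical stand-in for successive $response->header( 'X-Header' ) results, and the regex assumes the header values look like the ones in the question.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the web service: each entry is the list of values
# that $response->header('X-Header') would return for one call.
my @canned = (
    [ 'test-abc12345; blah=foo', 'test-fgasjhgakg; blah=bar' ],
    [ 'test-xyz; blah=bar',      'test-def678; blah=foo' ],
    [ 'test-nothing; blah=bar' ],    # no foo value: the loop should stop here
);

my $value = 'default';
my @seen;                            # records the value used for each request

while ( my $headers = shift @canned ) {
    # (do HTTP request here; use $value)
    push @seen, $value;

    # List assignment keeps the first capture; with no match $value becomes undef.
    ($value) = map { /test-([^;]+);\s*blah=foo/ } @$headers;
    last unless defined $value;
}

print "requests made with: @seen\n";
```

Testing with defined rather than plain truth means a captured 0 would not end the loop early.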

1 Comment

Thanks! The map solution seems like the most elegant one in this scenario in terms of readability - since the list will never be more than a few elements I'm not worried about looping unnecessarily. first_result sounds like what I was asking for, even if it's not what I ended up going with here :)
1

The following code snippet looks for foo, stored in the variable $find; the found value is stored in the variable $found.


my $find = 'foo';
my $found;

# header() returns the values only (no "X-Header: " prefix); for aliases each to $_
for ( $response->header( 'X-Header' ) ) {
    if( /blah=($find)/ ) {
        $found = $1;
        last;
    }
}

say $found if $found;

Sample demo code

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my $find = 'foo';
my $found;
my @header = <DATA>;

chomp(@header);

for ( @header ) {
    $found = $1 if /X-Header: .*?blah=($find)/;
    last if $found;
}

say Dumper(\@header);
say "Found: $found" if $found;

__DATA__
X-Header: test-abc12345; blah=foo
X-Header: test-fgasjhgakg; blah=bar

Output

$VAR1 = [
          'X-Header: test-abc12345; blah=foo',
          'X-Header: test-fgasjhgakg; blah=bar'
        ];

Found: foo

4 Comments

Unfortunately, X-Header is returned for all responses - I need to check if a foo value was provided to know whether to continue or not.
Yes, I'm using LWP - sorry, should have included that I suppose. I didn't think getting into the specifics would be very helpful since this API is not public, but the header method of an HTTP::Response object should work the same anywhere.
Please see the sample demo code and its output. I used a __DATA__ block to simulate X-Header.
