I am using Perl and its grep function to look for a pattern in an array of output. I am interested in finding the following text in the array:

/tmp/12345.hash

The 12345 could be any sequence of digits, like 234 or 567889, but the /tmp/ and the .hash will always be the same. I am not great with regex, so I am not sure how to build the proper pattern.

@line = grep /hash/, @exp;

My original search looked for just the word hash, but that also matched other lines and I ended up with the wrong result.

1 Answer

Regex allows you to encode requirements into a "pattern" with a lot more precision:

my @filtered = grep { m{^/tmp/[0-9]+\.hash$} } @all;

I use {} for delimiters, since with the usual // every / in the pattern must be escaped. With delimiters other than // the m is required in front (with // it may be omitted).

The anchor ^ matches the beginning of the string (and also after each embedded newline if the /m "modifier" is in effect). The /tmp seems to be the beginning of a path, but if there were, for example, leading spaces before it, then the above wouldn't match, unless you change it to ^\s*/tmp to allow for optional whitespace. Consider your data carefully.
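To illustrate the point about leading whitespace, here is a minimal sketch comparing the strict pattern with the ^\s* variant. The contents of @exp are made up for the example; only the two patterns come from the text above.

```perl
use strict;
use warnings;

# Hypothetical inputs: the path with and without leading whitespace,
# plus one line that has other text in front of the path.
my @exp = (
    '/tmp/12345.hash',
    '   /tmp/234.hash',
    "\t/tmp/567889.hash",
    'x /tmp/1.hash',
);

# Strict: the path must start at the very beginning of the string.
my @strict  = grep { m{^/tmp/[0-9]+\.hash$} } @exp;

# Lenient: optional whitespace is allowed before the path.
my @lenient = grep { m{^\s*/tmp/[0-9]+\.hash$} } @exp;

printf "strict:  %d match(es)\n", scalar @strict;
printf "lenient: %d match(es)\n", scalar @lenient;
```

With this data the strict pattern keeps only the first element, while the lenient one also keeps the two entries that start with spaces or a tab. The last entry is rejected by both, since something other than whitespace precedes /tmp.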

The $ matches the end of the string, or just before the newline at the end if there is one (the /m modifier changes this). To also match strings with more characters after .hash, remove the $.

The pattern itself sets down what you say in the problem description: there must be a run of one or more digits, which varies, and the rest is fixed.
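As a quick check of the whole pattern, here is a small runnable sketch. The array contents are invented for the illustration; @exp and @line are the names used in the question.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the question's @exp array.
my @exp = (
    '/tmp/12345.hash',
    '/tmp/234.hash',
    '/tmp/abc.hash',            # no digits: rejected by [0-9]+
    '/var/tmp/567889.hash',     # does not start with /tmp: rejected by ^
    '/tmp/12345.hash.bak',      # extra suffix: rejected by $
    'hash of something else',   # only the original /hash/ would match this
);

# Keep only elements that are exactly /tmp/<digits>.hash
my @line = grep { m{^/tmp/[0-9]+\.hash$} } @exp;

print "$_\n" for @line;
```

This prints only /tmp/12345.hash and /tmp/234.hash, whereas grep /hash/, @exp would have returned all six elements.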

Perl's own (excellent) documentation comes with the tutorial perlretut.


  With the modifier, as in $str =~ /.../m, the string is treated as a multi-line string: if there are linefeeds in it, then ^ and $ in that regex also match at the beginning and end of each line.

The anchor to always match only the end of the string is \z (also see \Z which matches like $ but is insensitive to /m). See Assertions in perlre, and see answers on this page.
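The differences between $, \Z, \z and the /m modifier are easy to see on small strings. The following is a sketch with made-up test strings, not part of the original problem.

```perl
use strict;
use warnings;

# A string with a trailing newline, e.g. a line that was never chomp-ed.
my $with_newline = "/tmp/12345.hash\n";

my $dollar  = $with_newline =~ m{^/tmp/[0-9]+\.hash$}  ? 1 : 0;  # $ allows a final newline
my $big_z   = $with_newline =~ m{^/tmp/[0-9]+\.hash\Z} ? 1 : 0;  # \Z behaves like $ here
my $small_z = $with_newline =~ m{^/tmp/[0-9]+\.hash\z} ? 1 : 0;  # \z is the very end only

# A multi-line string with the path on its middle line.
my $multi = "first line\n/tmp/42.hash\nlast line";

my $without_m = $multi =~ m{^/tmp/[0-9]+\.hash$}  ? 1 : 0;  # ^ and $ anchor to the whole string
my $with_m    = $multi =~ m{^/tmp/[0-9]+\.hash$}m ? 1 : 0;  # ^ and $ anchor to each line

print "\$  with trailing newline: $dollar\n";
print "\\Z with trailing newline: $big_z\n";
print "\\z with trailing newline: $small_z\n";
print "multi-line without /m:    $without_m\n";
print "multi-line with /m:       $with_m\n";
```

Here $ and \Z accept the string that ends in a newline while \z rejects it, and the multi-line string matches only when /m is added. If the array elements may carry trailing newlines and that matters, chomp them first or choose the anchor accordingly.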

5 Comments

Thank you for the quick response...I am testing it now..I noticed the missing closing } but looks like you fixed it. Thank you for the explanation, very helpful.
@user3521305 Ah, yes, there was a typo (or a couple?) in what I initially posted -- have fixed them, and edited more. Let me know if things are unclear.
\z matches the end of the string. $ matches more than the end of the string.
hum? You're still using $, and you still claim it matches the end of the string. Even without /m, $ doesn't just match the end of the string. $ is equivalent to (?=\n\z)|\z without /m, and it's equivalent to (?=\n)|\z with /m.
@ikegami OK, you mean that it matches before the newline if there is one at the end (like \Z) so not at the end. Thank you again. (It's interesting that this is never distinguished, except in docs.)