I'm trying to dynamically create a Regexp object from a string (taken from the database) and then use it to extract part of another string. This example pulls data out of a git commit message, but in theory any valid regexp could be stored in the database as a string.

What happens

>> string = "[ALERT] Project: Revision ...123456 committed by Me <[email protected]>\n on 2009-   07-28 21:21:47\n\n    Fixed typo\n"
>> r = Regexp.new("[A-Za-z]+: Revision ...[\w]+ committed by [A-Za-z\s]+")
>> string[r]
=> nil

What I want to happen

>> string = "[ALERT] Project: Revision ...123456 committed by Me <[email protected]>\n on 2009-   07-28 21:21:47\n\n    Fixed typo\n"
>> string[/[A-Za-z]+: Revision ...[\w]+ committed by [A-Za-z\s]+/]
=> "Project: Revision ...123456 committed by Me "

2 Answers

You're only missing one thing:

>> Regexp.new "\w"
=> /w/
>> Regexp.new "\\w"
=> /\w/

Backslashes are escape characters in double-quoted string literals. If you want a literal backslash in the resulting string, you have to double it.

>> string = "[ALERT] Project: Revision ...123456 committed by Me <[email protected]>\n on 2009-   07-28 21:21:47\n\n    Fixed typo\n"
=> "[ALERT] Project: Revision ...123456 committed by Me <[email protected]>\n on 2009-   07-28 21:21:47\n\n    Fixed typo\n"
>> r = Regexp.new("[A-Za-z]+: Revision ...[\\w]+ committed by [A-Za-z\\s]+")
=> /[A-Za-z]+: Revision ...[\w]+ committed by [A-Za-z\s]+/
>> string[r]
=> "Project: Revision ...123456 committed by Me "

If you'd pasted the output from your "broken" lines rather than just the input, you'd probably have spotted that the \w and \s hadn't survived as escapes in the resulting regexp.
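
One way to check what a string literal really hands to `Regexp.new` is to look at the compiled pattern's `source`. The sketch below is only illustrative (the variable names are made up for this example). It compares a double-quoted literal, a doubled backslash, and a single-quoted literal:

```ruby
# In a double-quoted literal, "\w" is not a recognized escape,
# so the backslash is dropped and only "w" remains.
broken  = Regexp.new("[\w]+")
# Doubling the backslash keeps one literal backslash in the string.
doubled = Regexp.new("[\\w]+")
# Single-quoted literals leave \w untouched.
single  = Regexp.new('[\w]+')

puts broken.source   # [w]+
puts doubled.source  # [\w]+
puts single.source   # [\w]+

# "\s" behaves differently: in a double-quoted literal it is an
# escape for a plain space character.
puts "\s" == " "     # true
```

So for a pattern written directly in Ruby source, single quotes avoid the doubling altogether.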

1 Comment

Perfect, thanks - I knew I had to be doing something subtly wrong.

Option 1:

# Escape the slashes:
r = Regexp.new("[A-Za-z]+: Revision ...[\\w]+ committed by [A-Za-z\\s]+")

Disadvantage: you have to manually double every backslash in the pattern.

Option 2:

# Use slashes in constructor
r = Regexp.new(/[A-Za-z]+: Revision ...[\w]+ committed by [A-Za-z\s]+/)

Disadvantage: None

1 Comment

For option 2: the argument to the constructor is always a string, because the regex is being pulled from the database, so that won't work in this scenario.
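
On the point raised in this comment: the doubling only concerns string literals written in Ruby source. A string that arrives at runtime (from a database column, a file, and so on) already holds its backslashes as ordinary characters, so it can be passed to `Regexp.new` unchanged. A minimal sketch, assuming a hypothetical `compile_pattern` helper and a single-quoted literal standing in for the value read from the database:

```ruby
# Hypothetical helper: compile a pattern string, returning nil
# when the string is not a valid regexp.
def compile_pattern(source)
  Regexp.new(source)
rescue RegexpError
  nil
end

# Stand-in for a value fetched from the database; single quotes
# keep the backslashes exactly as they would be stored.
stored  = '[A-Za-z]+: Revision ...[\w]+ committed by [A-Za-z\s]+'
message = "[ALERT] Project: Revision ...123456 committed by Me <address elided>\n"

pattern = compile_pattern(stored)
result  = pattern ? message[pattern] : nil
puts result.inspect   # "Project: Revision ...123456 committed by Me "

# An invalid pattern string yields nil instead of raising.
invalid = compile_pattern("[unclosed")
puts invalid.inspect  # nil
```

Rescuing `RegexpError` is a design choice for this sketch: since any string could be stored in the database, a bad pattern is handled as a missing one rather than as a crash.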
