1

I'm trying to convert the following python regex to ruby

match = re.search(r'window.__APOLLO_STATE__ = JSON.parse\("(.+?)"\);', body)

I've done some digging and Regexp#match should be what i'm looking for but the following is returning nil.

resp.body.match('^window.__APOLLO_STATE__ = JSON.parse\("(.+?)"\)')

How can I convert the regex and where am I wrong?

6
  • 1
    You probably want to pass a Regexp object to match rather than a string, i.e. /^window.__APOLLO_.../. Commented Dec 4, 2019 at 23:04
  • Do you need to get what is inside the quotes? Commented Dec 4, 2019 at 23:14
  • yes @WiktorStribiżew Commented Dec 4, 2019 at 23:16
  • 1
    Use resp.body[/window\.__APOLLO_STATE__ = JSON\.parse\("(.*?)"\);/, 1], see ideone.com/sJth9I Commented Dec 4, 2019 at 23:16
  • thank you @WiktorStribiżew :) Commented Dec 4, 2019 at 23:37

3 Answers 3

1

You may use

resp.body[/window\.__APOLLO_STATE__ = JSON\.parse\("(.*?)"\);/, 1]

Here,

  • /.../ is a regex literal notation that is very convenient when defining regex patterns
  • Literal dots are escaped, else, they match any char but line break chars
  • The .+? is changed to .*? to be able to match empty values (else, you may overmatch, it is easier to later discard empty matches than fix overmatches)
  • 1 tells the engine to return the value of the capturing group with ID 2 of the first match. If you need multiple matches, use resp.body.scan(/regex/).
Sign up to request clarification or add additional context in comments.

Comments

0

An idiomatic way is to use the =~ regex match operator:

resp.body =~ /^window.__APOLLO_STATE__ = JSON.parse\("(.+?)"\)/

You can access the capture groups with $1, $2, and so on.

If you don't like the global variable usage, you can also use the Regexp#match method

result = /^window.__APOLLO_STATE__ = JSON.parse\("(.+?)"\)/.match(resp.body)
result[1] # => returns first capture group

Comments

0

As I understand, your string is something like

str = 'window.__APOLLO_STATE__ = JSON.parse("my dog has fleas");'

and you wish to extract the text between the double quotes. You can do that with the following regular expression, which does not employ a capture group:

r = /\Awindow\.__APOLLO_STATE__ = JSON\.parse\(\"\K.+?(?=\"\);\z)/

str[r]
  #=> "my dog has fleas"

The regular expression can be written in free-spacing mode to make it self-documenting:

r = /
    \A          # match beginning of string
    window\.__APOLLO_STATE__\ =\ JSON\.parse\(\"
                # match substring
    \K          # discard everything matched so far 
    .+?         # match 1+ characters, lazily
    (?=\"\);\z) # match "); followed by end-of-string (positive lookahead)
    /x          # free-spacing regex definition mode

The contents of a positive lookahead must be matched but are not part of the match returned. Neither is the text matched prior to the \K directive part of the match returned.

Free-spacing mode removes all whitespace before the expression is parsed. Accordingly, any intended spaces (in "APOLLO_STATE__ = JSON", for example) must be protected. I've done that by escaping the spaces, one of several ways that can be done.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.