I need to extract part of a URL:

WebViewActivity - https://google.com/search/?term=iphone_5s&utm_source=google&utm_campaign=search_bar&utm_content=search_submit

The result I want:

search/iphone_5s

but I'm stuck and don't really understand how to use regexp_substr to get that data.

I'm trying to use this query:

regexp_substr(web_url, '\google.com/([^}]+)\/', 1,1,null,1)

which only returns the word 'search', and when I try

regexp_substr(web_url, '\google.com/([^}]+)\&', 1,1,null,1)

I get everything up to the last '&'.

1 Answer

You can use REGEXP_REPLACE to match the whole string, capture the two substrings you need, and replace the match with backreferences to those two capturing groups:

REGEXP_REPLACE(
    'WebViewActivity - https://google.com/search/?term=iphone_5s&utm_source=google&utm_campaign=search_bar&utm_content=search_submit',
    '.*//google\.com/([^/]+/).*[?&]term=([^&]+).*',
    '\1\2')

See the regex demo and the online Oracle demo.

Pattern details

  • .* - zero or more characters other than line breaks, as many as possible
  • //google\.com/ - the literal substring //google.com/
  • ([^/]+/) - Capturing group 1: one or more characters other than /, followed by a /
  • .* - zero or more characters other than line breaks, as many as possible
  • [?&]term= - a ? or &, followed by the literal substring term=
  • ([^&]+) - Capturing group 2: one or more characters other than &
  • .* - zero or more characters other than line breaks, as many as possible
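
If you would rather stay with regexp_substr as in the question, the same two capturing groups can be fetched separately and concatenated. This is only a sketch of an alternative, not the approach used above; the urls subquery and the column aliases are made up for illustration, and it assumes the six-argument form of REGEXP_SUBSTR (the one already used in the question), where the last argument selects the capturing group.

```sql
-- Hypothetical sample data; in practice web_url would come from your own table.
WITH urls AS (
  SELECT 'WebViewActivity - https://google.com/search/?term=iphone_5s&utm_source=google&utm_campaign=search_bar&utm_content=search_submit' AS web_url
  FROM dual
)
SELECT
  -- Group 1: the path segment right after the host, including its trailing slash
  REGEXP_SUBSTR(web_url, '//google\.com/([^/]+/)', 1, 1, NULL, 1)
  -- Group 1 of the second pattern: the value of the term parameter
  || REGEXP_SUBSTR(web_url, '[?&]term=([^&]+)', 1, 1, NULL, 1) AS result
FROM urls;
-- For the sample row this gives: search/iphone_5s
```

Note that the two parts are matched independently here, so a URL without a term parameter would still return the path part on its own.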

NOTE: REGEXP_REPLACE returns the input string unchanged when the pattern does not match. To get an empty result in that case, append |.+ to the end of the regex pattern.
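
A small sketch of the difference the |.+ alternative makes, using one matching and one non-matching row. The urls subquery, the example.com URL, and the column aliases are hypothetical; NVL is used only to make the empty result visible, since Oracle treats an empty string as NULL.

```sql
-- Hypothetical sample data: the first row matches the pattern, the second does not.
WITH urls AS (
  SELECT 'WebViewActivity - https://google.com/search/?term=iphone_5s&utm_source=google' AS web_url FROM dual
  UNION ALL
  SELECT 'WebViewActivity - https://example.com/home?x=1' FROM dual
)
SELECT
  -- Without the fallback, a non-matching row comes back unchanged
  REGEXP_REPLACE(web_url,
                 '.*//google\.com/([^/]+/).*[?&]term=([^&]+).*',
                 '\1\2') AS without_fallback,
  -- With |.+ the whole non-matching string is consumed and both groups are empty
  NVL(REGEXP_REPLACE(web_url,
                     '.*//google\.com/([^/]+/).*[?&]term=([^&]+).*|.+',
                     '\1\2'),
      '(no match)') AS with_fallback
FROM urls;
```

For the first row both columns should contain search/iphone_5s; for the second row only the with_fallback column is emptied.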

4 Comments

Is there a risk of matching a URL such as https://example.com/?redirect_to=https://google.com/search/?term=iphone_5s ? I'm not 100% sure, given how the URL would be encoded.
Thank you, it works. But what if the web URL becomes google.co.uk?
@BenoîtZu Sure this string will get matched, but that can be easily fixed if there is a clear set of requirements for the input string. Here is a variation of the current approach.
@DedeSoetopo Replace com with [^/]+, see this demo.
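
For reference, the change suggested in the last comment would look roughly like the sketch below. The google.co.uk URL is an invented sample, and the pattern is simply the answer's pattern with com replaced by [^/]+, so it accepts any host that starts with google. (which may be broader than you want).

```sql
-- Hypothetical sample data with a non-.com Google host.
WITH urls AS (
  SELECT 'WebViewActivity - https://google.co.uk/search/?term=iphone_5s&utm_source=google' AS web_url
  FROM dual
)
SELECT REGEXP_REPLACE(web_url,
                      -- [^/]+ matches the rest of the host name up to the next slash
                      '.*//google\.[^/]+/([^/]+/).*[?&]term=([^&]+).*',
                      '\1\2') AS result
FROM urls;
-- For the sample row this gives: search/iphone_5s
```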
